databricks-cli

Commit Graph

Author	SHA1	Message	Date
shreyas-goenka	85889dffb1	Move state to event for whether they support inplace progress logging (#339 ) ## Changes Adds an IsInplaceSupported() function to the event interface. Any event that uses the progress logger now has to declare whether it supports in-place logging. ## Tests Manually	2023-04-18 14:20:35 +02:00
shreyas-goenka	93d57dd00f	Detect duplicate identifiers in bundle config (#332 ) ## Changes This PR adds checks during bundle config load and merge to error out if there are duplicate keys for resource definitions ## Tests Using unit tests and manually	2023-04-17 12:21:21 +02:00
Shreyas Goenka	eab29603fc	Revert "Log job errors using progress logger" This reverts commit `a2e20f5206`.	2023-04-15 15:19:32 +02:00
Shreyas Goenka	a2e20f5206	Log job errors using progress logger	2023-04-15 15:18:38 +02:00
shreyas-goenka	e8018a7209	Refactor output and progress into separate packages in run (#335 ) Tested manually that output and progress logging still works	2023-04-14 14:40:34 +02:00
shreyas-goenka	df0293510e	Fixes for pipeline progress logging (#330 ) ## Changes 1. Events are now printed in chronological order 2. Simplify events rendering by removing update/flow name. This makes it more consistent with the web UI too 3. Switch to server side filtering on update_id ## Tests Manually Happy run: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo 2023-04-12T20:00:22.879Z update_progress INFO "Update e1becc is INITIALIZING." 2023-04-12T20:00:22.906Z update_progress INFO "Update e1becc is SETTING_UP_TABLES." 2023-04-12T20:00:24.496Z update_progress INFO "Update e1becc is RUNNING." 2023-04-12T20:00:24.497Z flow_progress INFO "Flow 'sales_orders_raw' is QUEUED." 2023-04-12T20:00:24.586Z flow_progress INFO "Flow 'sales_orders_raw' is STARTING." 2023-04-12T20:00:24.748Z flow_progress INFO "Flow 'sales_orders_raw' is RUNNING." 2023-04-12T20:00:26.672Z flow_progress INFO "Flow 'sales_orders_raw' has COMPLETED." 2023-04-12T20:00:27.753Z update_progress INFO "Update e1becc is COMPLETED." ``` Sad run: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo 2023-04-12T20:02:07.764Z update_progress INFO "Update 04b80e is INITIALIZING." 2023-04-12T20:02:07.870Z update_progress ERROR "Update 04b80e is FAILED." Error: update failed ```	2023-04-14 12:21:44 +02:00
shreyas-goenka	3894d5796d	Add progress logging event for pipeline update URLs (#331 ) ## Changes Output now: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo The update can be found at https://e2-dogfood.staging.cloud.databricks.com/#joblist/pipelines/1cc605db-daab-4218-b38a-a63030e3eb03/updates/f92f2159-1141-47de-b1e2-1ca854b7238f 2023-04-12T20:41:19.813Z update_progress INFO "Update f92f21 is INITIALIZING." 2023-04-12T20:41:19.841Z update_progress INFO "Update f92f21 is SETTING_UP_TABLES." 2023-04-12T20:41:21.270Z update_progress INFO "Update f92f21 is RUNNING." 2023-04-12T20:41:21.271Z flow_progress INFO "Flow 'sales_orders_raw' is QUEUED." 2023-04-12T20:41:21.349Z flow_progress INFO "Flow 'sales_orders_raw' is STARTING." 2023-04-12T20:41:21.480Z flow_progress INFO "Flow 'sales_orders_raw' is RUNNING." 2023-04-12T20:41:23.493Z flow_progress INFO "Flow 'sales_orders_raw' has COMPLETED." 2023-04-12T20:41:25.484Z update_progress INFO "Update f92f21 is COMPLETED." ```	2023-04-14 11:11:30 +02:00
shreyas-goenka	417839021b	Add top level docs for bundle json schema (#313 ) Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com> Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>	2023-04-12 21:43:53 +02:00
Pieter Noordhuis	b388f4a0dc	Make all workspace paths string fields (#327 ) ## Changes These are unlikely to ever be DBFS paths so we can remove this level of indirection to simplify. Note: this is a breaking change. Downstream usage of these fields must be updated. ## Tests Existing tests pass.	2023-04-12 16:54:36 +02:00
Pieter Noordhuis	31ccebd62a	Store relative path to configuration file for every resource (#322 ) ## Changes If a configuration file is located in a subdirectory of the bundle root, files referenced from that configuration file should be relative to its configuration file's directory instead of the bundle root. ## Tests * New tests in `bundle/config/mutator/translate_paths_test.go`. * Existing tests under `bundle/tests` pass and are augmented to assert on paths. --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-04-12 16:17:13 +02:00
Miles Yucht	946906221d	Delete sync snapshots file when destroying a bundle (#323 ) ## Changes This PR changes the files.Delete() mutator to delete the sync snapshots file on destroy. This ensures that files will be uploaded when the bundle is uploaded again. ## Tests - [x] Manual test: Ran `bricks bundle destroy`, observed that the sync snapshots file was deleted.	2023-04-11 16:57:01 +02:00
Pieter Noordhuis	42d29f92c9	Pass through $HOME when invoking Terraform (#319 ) ## Changes This is useful when developing the Databricks Terraform provider where you keep a local-only build of the provider and refer to it using $HOME from `~/.terraformrc`, for example like this: ``` plugin_cache_dir = "$HOME/.terraform.d/plugin-cache" ``` ## Tests That $HOME is passed through cannot be tested as is because the `tfexec.Terraform` struct doesn't expose it through public fields or methods. What can be tested is a successful run of the initialize mutator and this is included in this commit.	2023-04-11 13:11:31 +02:00
shreyas-goenka	4871f7bc8a	Add bundle destroy command (#300 ) Adds bundle destroy capability to bricks	2023-04-06 12:54:58 +02:00
shreyas-goenka	6feaed4990	Fix host based auth conflicting with DEFAULT profile (#309 ) ## Changes Consider the following host based configuration: ``` bundle: name: job_with_file_task workspace: host: https://e2-dogfood.staging.cloud.databricks.com/ ``` If you have a DEFAULT profile, then this host is ignored. The solution proposed here is to remove the profile config loader if host is explicitly specified in the bundle config. This does come with a cost, namely that a `DATABRICKS_CONFIG_PROFILE` env var will be ignored, which may go against the unified auth spec. The ideal solution here is probably to make a change to go-SDK to not select the DEFAULT profile if host is not empty.	2023-04-05 18:12:11 +02:00
Pieter Noordhuis	d7ac265536	Allow use of file library in pipeline (#308 ) ## Changes This requires databricks/databricks-sdk-go#359. ## Tests Tests pass and ran manual verification of deployment with files.	2023-04-05 16:29:42 +02:00
Pieter Noordhuis	4e4c0658db	Interpolate paths for job tasks that reference files (#306 ) ## Changes This change also swaps the order of mutators such that interpolation happens before path translation. This means that it is possible to use variables (e.g. `${bundle.environment}`) in notebook or file paths. ## Tests New tests pass and verified manually.	2023-04-05 16:02:17 +02:00
shreyas-goenka	7427ceba6c	Fix output panic (#311 ) ## Changes Output now: ``` { "run_page_url": "https://e2-dogfood.staging.cloud.databricks.com/?o=6051921418418893#job/6199333392110/run/1088443776202122", "task_outputs": { "input": null, "process": { "logs": "[Row(max(id)=9)]\n", "logs_truncated": false } } } ```	2023-04-05 15:55:24 +02:00
shreyas-goenka	8de7d32ed1	Add readonly bundle tag for internal fields (#302 ) This PR adds a bundle: "readonly" struct tag to the json schema generator. This allows us to skip generating json schema for internal readonly fields Tested using unit test	2023-04-04 12:16:07 +02:00
shreyas-goenka	ddbb17b0d9	Regenerate generated empty json schema docs (#301 )	2023-04-04 12:07:30 +02:00
dependabot[bot]	57cf66d3a8	Bump github.com/databricks/databricks-sdk-go from 0.5.0 to 0.6.0 (#299 )	2023-04-03 21:33:21 +02:00
Pieter Noordhuis	f26806be8f	Set BRICKS_CLI_PATH only if it cannot be derived from $PATH (#298 ) ## Changes Related to #237. Output of `bricks auth env` now doesn't include `BRICKS_CLI_PATH` if it can be found in $PATH. ## Tests Verified manually.	2023-04-03 16:23:53 +02:00
shreyas-goenka	b4a30c641c	Add progress logging for pipeline runs (#283 ) Add progress logging for pipeline runs	2023-03-31 17:04:12 +02:00
Pieter Noordhuis	04e77102c9	Add mutators to pull and push Terraform state (#288 ) ## Changes Pull state before deploying and push state after deploying. Note: the run command was missing mutators to initialize Terraform. This is necessary if the cache directory is removed between running "deploy" and "run" (which is valid now that we synchronize state). ## Tests Manually.	2023-03-30 12:01:09 +02:00
Pieter Noordhuis	0ea0e81c8a	Ignore databricks_permissions resource when loading Terraform state (#291 ) ## Changes The databricks_permissions resource may be generated if a bundle resource includes a `permissions` block. There's no need to incorporate details from the materialization into the bundle configuration struct. ## Tests Confirmed that this fixes `bricks bundle run` when dealing with a bundle with permission configuration.	2023-03-29 21:14:52 +02:00
Pieter Noordhuis	87207bba78	Configure Terraform provider auth through env vars (#290 ) ## Changes Auth relied on setting a profile. In this change we enumerate all configuration properties and export all non-empty ones as a map with environment variables. We then pass this map to the Terraform execution wrapper. This results in Terraform using the bundle's authentication configuration. This change is needed to make #287 work. ## Tests Manually.	2023-03-29 20:46:09 +02:00
Pieter Noordhuis	cfd32c9602	Try to resolve a profile if only the host is specified (#287 ) ## Changes This improves out of the box usability where a user who already configured a `.databrickscfg` file will be able to reference the workspace host in their `bundle.yml` and it will automatically pick up the right profile. ## Tests * Newly added tests pass. * Manual testing confirms intended behavior. --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2023-03-29 20:44:19 +02:00
Pieter Noordhuis	8af934bbbb	Function to find the Git repository containing a bundle (#289 ) ## Changes Useful functions from #277. ## Tests Tests pass.	2023-03-29 16:36:35 +02:00
shreyas-goenka	8fd3dccca9	Add progress logs for job runs (#276 )	2023-03-29 14:58:09 +02:00
Pieter Noordhuis	edd8630f71	Log mutator phase at info level (#272 )	2023-03-22 17:02:22 +01:00
Pieter Noordhuis	123a5e15e9	Acquire lock prior to deploy (#270 ) Add configuration: ``` bundle: lock: enabled: true force: false ``` The force field can be set by passing the `--force` argument to `bricks bundle deploy`. Doing so means the deployment lock is acquired even if it is currently held. This should only be used in exceptional cases (e.g. a previous deployment has failed to release the lock).	2023-03-22 16:37:26 +01:00
Pieter Noordhuis	6850caf2a2	Include mutator name in logging context (#271 )	2023-03-22 15:54:10 +01:00
shreyas-goenka	bfa20cdec9	Add json tags to output fields (#269 ) output now: ``` { "run_page_url": "https://adb-309687753508875.15.azuredatabricks.net/?o=309687753508875#job/1077573342009637/run/19099317", "task_outputs": { "my_notebook_task": { "result": "computed results from notebook." } } }% ```	2023-03-21 18:38:11 +01:00
shreyas-goenka	75d516939b	Error out if notebook file does not exist locally (#261 ) Adds check for whether file exists locally case 1: local (relative) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "./doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy Error: notebook ./doesnotexist not found. Error: open /Users/shreyas.goenka/projects/job-output/doesnotexist: no such file or directory ``` case 2: remote (absolute) file does not exist ``` foo: name: "[job-output] test-job by shreyas" tasks: - task_key: my_notebook_task existing_cluster_id: * notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/doesnotexist" ``` output: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle deploy shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: failed to reach TERMINATED or SKIPPED, got INTERNAL_ERROR: Task my_notebook_task failed with message: Notebook not found: /Users/shreyas.goenka@databricks.com/doesnotexist. This caused all downstream tasks to get skipped. ``` case 3: remote exists Successful deploy and run	2023-03-21 18:13:16 +01:00
shreyas-goenka	047a189c1e	Add job run output logging (#260 ) This PR adds output logging for job runs Tested using unit tests and manually	2023-03-21 16:25:18 +01:00
shreyas-goenka	4ac2e33def	Throw error when job run is skipped due to max_concurrent_runs (#257 ) Tested manually. Before, we did not get any errors/logs and silently failed in this case. ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: run skipped: Skipping this run because the limit of 1 maximum concurrent runs has been reached. ```	2023-03-21 13:17:15 +01:00
Pieter Noordhuis	66ca9ec266	Add permissions block to each resource (#264 ) Example: ```yaml resources: jobs: my_job: name: "[${bundle.environment}] My job" permissions: - level: CAN_VIEW group_name: users ```	2023-03-21 10:58:16 +01:00
Pieter Noordhuis	58563b1ea9	Add resources for mlflow models and experiments (#263 ) Manually confirmed that both can be deployed.	2023-03-20 21:28:43 +01:00
Pieter Noordhuis	077ab8b864	Update Terraform provider schema structs (#265 ) Generated from provider version 1.13.0.	2023-03-20 17:22:55 +01:00
Pieter Noordhuis	ad666ff796	Use new logger throughout codebase (#256 )	2023-03-17 15:17:31 +01:00
shreyas-goenka	7faa9dea9b	Use tracker for reference loop tracking (#252 ) We incorrectly relied on map key iteration order to print debug trace. This PR switches over to using the tracker struct to allow more reliable json schema reference loop detection and logging This also fixes the failing TestSelfReferenceLoopErrors and TestCrossReferenceLoopErrors tests	2023-03-16 12:57:57 +01:00
shreyas-goenka	207777849b	Log latest error event on pipeline run fail (#239 ) DAB config used to test this: bundle.yml ``` workspace: host: <deco-azure-prod> bundle: name: deco-538 resources: pipelines: foo: name: "[${bundle.name}] log pipeline errors" libraries: - notebook: path: ./myNb.py development: true ``` myNb.py ``` # Databricks notebook source print(1/0) ``` Before: ``` 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:28:46 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update has failed! Error: update failed ``` Now: ``` 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:29:33 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update has failed! 2023/03/09 01:29:40 [ERROR] [pipelines.foo] Update 27bc77 is FAILED. trace for most recent exception: Failed to execute python command for notebook '/Users/shreyas.goenka@databricks.com/.bundle/deco-538/default/files/myNb' with id RunnableCommandId(9070319781942164851) and error AnsiResult(--------------------------------------------------------------------------- ZeroDivisionError Traceback (most recent call last) <command--1> in <cell line: 1>() ----> 1 print(1/0) ZeroDivisionError: division by zero,Map(),Map(),List(),List(),Map()) Error: update failed ```	2023-03-16 12:23:46 +01:00
shreyas-goenka	c40e428469	skip flaky cross reference test (#251 )	2023-03-15 17:09:52 +01:00
shreyas-goenka	92d1dd7e48	skip failing test for now (#249 )	2023-03-15 16:57:41 +01:00
shreyas-goenka	18a216bf97	Add openapi descriptions to bundle resources (#229 ) This PR: 1. Adds autogeneration of descriptions for the `resources` field 2. Autogenerates empty descriptions for any properties in DABs 3. Defines SOPs for how to refresh these descriptions 4. Adds a command to generate this documentation 5. Automatically copies any descriptions over to the `environments` property. Basically, it provides a framework for adding descriptions to the generated JSON schema. Tested manually and using unit tests	2023-03-15 03:18:51 +01:00
Fabian Jakobs	f0c35a2b27	Initialize BRICKS_CLI_PATH and increase default OAuth timeout (#237 ) related to https://github.com/databricks/databricks-sdk-go/pull/330	2023-03-08 16:14:24 +01:00
shreyas-goenka	f93b541b63	Show detailed error logs for jobs (#209 ) PR for how to render errors on console for jobs. Here is the bundle used for the logs below: ``` bundle: name: deco-438 workspace: host: https://adb-309687753508875.15.azuredatabricks.net resources: jobs: foo: name: "[${bundle.name}][${bundle.environment}] a test notebook" tasks: - task_key: alpha existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] invalid notebook" - task_key: beta existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/does-not-exist" - task_key: gamma existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] valid notebook" ``` And this is a screenshot of the logs from the console: <img width="1057" alt="Screenshot 2023-02-17 at 7 12 29 PM" src="https://user-images.githubusercontent.com/88374338/219744768-ab7f1e79-db8f-466a-ad6d-f2b6f85ed17c.png"> Here are the logs when only task gamma is executed (successfully): <img width="1059" alt="Screenshot 2023-02-17 at 7 13 04 PM" src="https://user-images.githubusercontent.com/88374338/219744992-011d8b91-ec1d-44f0-a849-83c81816dd9f.png"> TODO: Investigate more possible job errors, and make sure state for them is handled in a robust way here	2023-02-20 23:40:14 +01:00
Pieter Noordhuis	dd95668474	Complete positional argument to bundle run (#220 ) Command completion can be configured through `bricks completion`.	2023-02-20 21:55:06 +01:00
Pieter Noordhuis	9912ee1f92	Materialize glob expansion in configuration struct (#217 ) This is needed to figure out which files should adhere to the schema.	2023-02-20 21:01:28 +01:00
Pieter Noordhuis	a0ed02281d	Execute file synchronization on deploy (#211 ) 1. Perform file synchronization on deploy 2. Update notebook file path translation logic to point to the synchronization target rather than treating the notebook as an artifact and uploading it separately.	2023-02-20 19:42:55 +01:00
Pieter Noordhuis	414ea4f891	Bump databricks-sdk-go to 0.3.2 (#215 )	2023-02-20 16:00:20 +01:00


94 Commits