databricks-cli

Commit Graph

Author	SHA1	Message	Date
Pieter Noordhuis	ed84a33b0a	Reuse resource resolution code for the run command (#1858 ) ## Changes As of #1846 we have a generalized package for doing resource lookups and completion. This change updates the run command to use this instead of more specific code under `bundle/run`. ## Tests * Unit tests pass * Manually confirmed that completion and prompting works	2024-10-24 13:24:30 +00:00
shreyas-goenka	3bab21e72e	Fix race condition when restarting continuous jobs (#1849 ) ## Changes We don't need to cancel existing runs when the job is continuous and unpaused. The `/jobs/run-now` command will cancel the existing run and trigger a new one automatically. Cancelling the job manually can cause a race condition where both the manual trigger from the CLI and the continuous trigger from the job configuration happen at the same time. This PR prevents that from happening. ## Tests Unit tests and manually	2024-10-22 14:59:17 +00:00
Andrew Nester	54799a1918	Upgrade Go SDK to 0.44.0 (#1679 ) ## Changes Upgrade Go SDK to 0.44.0 --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-08-15 13:23:07 +00:00
Pieter Noordhuis	3108883a8f	Processing and completion of positional args to bundle run (#1120 ) ## Changes With this change, both job parameters and task parameters can be specified as positional arguments to bundle run. How the positional arguments are interpreted depends on the configuration of the job. ### Examples: For a job that has job parameters configured a user can specify: ``` databricks bundle run my_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with job parameters set to: ```json { "param1": "value1", "param2": "value2" } ``` Similarly, for a job that doesn't use job parameters and only has `notebook_task` tasks, a user can specify: ``` databricks bundle run my_notebook_job -- --param1=value1 --param2=value2 ``` And the run is kicked off with task level `notebook_params` configured as: ```json { "param1": "value1", "param2": "value2" } ``` For a job that doesn't use job parameters and only has either `spark_python_task` or `python_wheel_task` tasks, a user can specify: ``` databricks bundle run my_python_file_job -- --flag=value other arguments ``` And the run is kicked off with task level `python_params` configured as: ```json [ "--flag=value", "other", "arguments" ] ``` The same is applied to jobs with only `spark_jar_task` or `spark_submit_task` tasks. ## Tests Unit tests. Tested the completions manually.	2024-04-22 11:50:13 +00:00
Pieter Noordhuis	04827688fb	Add `--validate-only` flag to run validate-only pipeline update (#1251 ) ## Changes This flag starts a "validation-only" update. ## Tests Unit and manual confirmation it does what it should.	2024-03-04 08:38:32 +00:00
Andrew Nester	bc30c9ed4a	Added `--restart` flag for `bundle run` command (#1191 ) ## Changes Added `--restart` flag for `bundle run` command When running with this flag, `bundle run` will cancel all existing runs before starting a new one ## Tests Manually	2024-02-09 14:33:14 +00:00
Andrew Nester	de363faa53	Make sure grouped flags are added to the command flag set (#1180 ) ## Changes Make sure grouped flags are added to the command flag set ## Tests Added regression tests	2024-02-07 10:27:13 +00:00
Andrew Nester	2bbb644749	Group bundle run flags by job and pipeline types (#1174 ) ## Changes Group bundle run flags by job and pipeline types ## Tests ``` Run a resource (e.g. a job or a pipeline) Usage: databricks bundle run [flags] KEY Job Flags: --dbt-commands strings A list of commands to execute for jobs with DBT tasks. --jar-params strings A list of parameters for jobs with Spark JAR tasks. --notebook-params stringToString A map from keys to values for jobs with notebook tasks. (default []) --params stringToString comma separated k=v pairs for job parameters (default []) --pipeline-params stringToString A map from keys to values for jobs with pipeline tasks. (default []) --python-named-params stringToString A map from keys to values for jobs with Python wheel tasks. (default []) --python-params strings A list of parameters for jobs with Python tasks. --spark-submit-params strings A list of parameters for jobs with Spark submit tasks. --sql-params stringToString A map from keys to values for jobs with SQL tasks. (default []) Pipeline Flags: --full-refresh strings List of tables to reset and recompute. --full-refresh-all Perform a full graph reset and recompute. --refresh strings List of tables to update. --refresh-all Perform a full graph update. Flags: -h, --help help for run --no-wait Don't wait for the run to complete. Global Flags: --debug enable debug logging -o, --output type output type: text or json (default text) -p, --profile string ~/.databrickscfg profile -t, --target string bundle target to use (if applicable) --var strings set values for variables defined in bundle config. Example: --var="foo=bar" ```	2024-02-06 14:51:02 +00:00
Pieter Noordhuis	06b50670e1	Support passing job parameters to bundle run (#1115 ) ## Changes This change adds support for job parameters. If job parameters are specified for a job that doesn't define job parameters it returns an error. Conversely, if task parameters are specified for a job that defines job parameters, it also returns an error. This change moves the options structs and their functions to separate files and backfills test coverage for them. Job parameters can now be specified with `--params foo=bar,bar=qux`. ## Tests Unit tests and manual integration testing.	2024-01-15 07:42:36 +00:00
Andrew Nester	83d50001fc	Pass parameters to task when run with `--python-params` and `python_wheel_wrapper` is true (#1037 ) ## Changes This makes the behaviour consistent with or without `python_wheel_wrapper` enabled when a job is run with the `--python-params` flag. In `python_wheel_wrapper` mode it converts dynamic `python_params` into a specially named dynamic `notebook_param`, and the wrapper reads them with `dbutils` and passes them to `sys.argv`. Fixes #1000 ## Tests Added an integration test. Integration tests pass.	2023-12-01 10:35:20 +00:00
Pieter Noordhuis	3a812a61e5	Increase timeout waiting for job run to 1 day (#786 ) ## Changes It's not uncommon for job runs to take more than 2 hours. On the client side, we should not stop waiting for a job to complete if it is intentionally running for a long time. If a job isn't supposed to run this long, the user can specify a run timeout in the job specification itself. ## Tests n/a	2023-09-19 19:54:24 +00:00
Pieter Noordhuis	a2775f836f	Use interactive prompt to select resource to run if not specified (#762 ) ## Changes Display an interactive prompt with a list of resources to run if one isn't specified and the command is run interactively. ## Tests Manually confirmed: * The new prompt works * Shell completion still works * Specifying a key argument still works	2023-09-11 18:03:12 +00:00
Andrew Nester	e08f419ef6	Do not include empty output in job run output (#749 ) ## Changes Do not include empty output in job run output ## Tests Running a job from CLI, the result: ``` andrew.nester@HFW9Y94129 wheel % databricks bundle run some_other_job --output json Run URL: https://***/?o=6051921418418893#job/780620378804085/run/386695528477456 2023-09-08 11:33:24 "[default] My Wheel Job" TERMINATED SUCCESS { "task_outputs": [ { "TaskKey": "TestTask", "Output": { "result": "Hello from my func\nGot arguments v2:\n['python']\n" }, "EndTime": 1694165597474 } ] ```	2023-09-08 09:52:45 +00:00
Lennart Kats (databricks)	57e75d3e22	Add development runs (#522 ) This implements the "development run" functionality that we desire for DABs in the workspace / IDE. ## bundle.yml changes In bundle.yml, there should be a "dev" environment that is marked as `mode: development`: ``` environments: dev: default: true mode: development # future accepted values might include pull_request, production ``` Setting `mode` to `development` indicates that this environment is used just for running things for development. This results in several changes to deployed assets: * All assets will get '[dev]' in their name and will get a 'dev' tag * All assets will be hidden from the list of assets (future work; e.g. for jobs we would have a special job_type that hides it from the list) * All deployed assets will be ephemeral (future work, we need some form of garbage collection) * Pipelines will be marked as 'development: true' * Jobs can run on development compute through the `--compute` parameter in the CLI * Jobs get their schedule / triggers paused * Jobs get concurrent runs (it's really annoying if your runs get skipped because the last run was still in progress) Other accepted values for `mode` are `default` (which does nothing) and `pull-request` (which is reserved for future use). ## CLI changes To run a single job called "shark_sighting" on existing compute, use the following commands: ``` $ databricks bundle deploy --compute 0617-201942-9yd9g8ix $ databricks bundle run shark_sighting ``` which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that `--compute` is not accepted in production environments, so we show an error if `mode: development` is not used. The `run --deploy` command offers a convenient shorthand for the common combination of deploying & running: ``` $ export DATABRICKS_COMPUTE=0617-201942-9yd9g8ix $ bundle run --deploy shark_sightings ``` The `--deploy` addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute. One more thing I added is that `run --no-wait` can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves). ``` $ bundle run --deploy shark_sightings --no-wait ```	2023-07-12 08:51:54 +02:00
Serge Smertin	2aa61a7c1b	Update with the latest Go SDK (#457 ) ## Changes - removed deprecated methods - regenerated with the latest OpenAPI spec - picked up the latest go SDK version ## Tests `make test`	2023-06-12 14:23:21 +02:00
Pieter Noordhuis	98ebb78c9b	Rename bricks -> databricks (#389 ) ## Changes Rename all instances of "bricks" to "databricks". ## Tests * Confirmed the goreleaser build works, uses the correct new binary name, and produces the right archives. * Help output is confirmed to be correct. * Output of `git grep -w bricks` is minimal with a couple changes remaining for after the repository rename.	2023-05-16 18:35:39 +02:00
Andrew Nester	1916bc9d68	Fixed printing the tasks in job output in DAG execution order (#377 ) Fixes #259 ## Changes Sort task output in an execution order based on task end time ## Tests Added `TestTaskJobOutputOrderToString` unit test.	2023-05-08 16:35:47 +02:00
Serge Smertin	9581187c9e	Update to Go SDK v0.8.0 (#351 ) ## Changes - Update to Go SDK v0.8.0 - Fix all breaking changes ## Tests - make test	2023-04-21 10:30:20 +02:00
shreyas-goenka	089bebc92f	Do not print exceptions for non ERROR events (#347 ) ## Changes Adds a check so that exception traces are not printed for DLT events with a level below ERROR ## Tests Unit test	2023-04-19 22:11:05 +02:00
shreyas-goenka	d0872b45e2	Log pipeline update errors using progress logger (#338 ) ## Changes Logs error message for all exceptions ## Tests Manually and using unit tests	2023-04-18 15:00:34 +02:00
shreyas-goenka	59eee11989	Log job errors using progress logger (#337 ) ## Changes This PR logs job errors using the progress logger ## Tests Manually	2023-04-18 14:58:20 +02:00
shreyas-goenka	1a7b3eef18	Log job run url using progress logger (#336 ) ## Changes Logs the job url using the progress logger ## Tests Manually	2023-04-18 14:40:45 +02:00
shreyas-goenka	85889dffb1	Move state to event for whether they support inplace progress logging (#339 ) ## Changes Adds an `IsInplaceSupported()` function to the event interface. Any event that uses the progress logger now has to declare whether it supports in-place logging ## Tests Manually	2023-04-18 14:20:35 +02:00
Shreyas Goenka	eab29603fc	Revert "Log job errors using progress logger" This reverts commit `a2e20f5206`.	2023-04-15 15:19:32 +02:00
Shreyas Goenka	a2e20f5206	Log job errors using progress logger	2023-04-15 15:18:38 +02:00
shreyas-goenka	e8018a7209	Refactor output and progress into separate packages in run (#335 ) Tested manually that output and progress logging still works	2023-04-14 14:40:34 +02:00
shreyas-goenka	df0293510e	Fixes for pipeline progress logging (#330 ) ## Changes 1. Events are now printed in chronological order 2. Simplify events rendering by removing update/flow name. This makes it more consistent with the web UI too 3. Switch to server side filtering on update_id ## Tests Manually Happy run: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo 2023-04-12T20:00:22.879Z update_progress INFO "Update e1becc is INITIALIZING." 2023-04-12T20:00:22.906Z update_progress INFO "Update e1becc is SETTING_UP_TABLES." 2023-04-12T20:00:24.496Z update_progress INFO "Update e1becc is RUNNING." 2023-04-12T20:00:24.497Z flow_progress INFO "Flow 'sales_orders_raw' is QUEUED." 2023-04-12T20:00:24.586Z flow_progress INFO "Flow 'sales_orders_raw' is STARTING." 2023-04-12T20:00:24.748Z flow_progress INFO "Flow 'sales_orders_raw' is RUNNING." 2023-04-12T20:00:26.672Z flow_progress INFO "Flow 'sales_orders_raw' has COMPLETED." 2023-04-12T20:00:27.753Z update_progress INFO "Update e1becc is COMPLETED." ``` Sad run: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo 2023-04-12T20:02:07.764Z update_progress INFO "Update 04b80e is INITIALIZING." 2023-04-12T20:02:07.870Z update_progress ERROR "Update 04b80e is FAILED." Error: update failed ```	2023-04-14 12:21:44 +02:00
shreyas-goenka	3894d5796d	Add progress logging event for pipeline update URLs (#331 ) ## Changes Output now: ``` shreyas.goenka@THW32HFW6T pipeline-progress % bricks bundle run foo The update can be found at https://e2-dogfood.staging.cloud.databricks.com/#joblist/pipelines/1cc605db-daab-4218-b38a-a63030e3eb03/updates/f92f2159-1141-47de-b1e2-1ca854b7238f 2023-04-12T20:41:19.813Z update_progress INFO "Update f92f21 is INITIALIZING." 2023-04-12T20:41:19.841Z update_progress INFO "Update f92f21 is SETTING_UP_TABLES." 2023-04-12T20:41:21.270Z update_progress INFO "Update f92f21 is RUNNING." 2023-04-12T20:41:21.271Z flow_progress INFO "Flow 'sales_orders_raw' is QUEUED." 2023-04-12T20:41:21.349Z flow_progress INFO "Flow 'sales_orders_raw' is STARTING." 2023-04-12T20:41:21.480Z flow_progress INFO "Flow 'sales_orders_raw' is RUNNING." 2023-04-12T20:41:23.493Z flow_progress INFO "Flow 'sales_orders_raw' has COMPLETED." 2023-04-12T20:41:25.484Z update_progress INFO "Update f92f21 is COMPLETED." ```	2023-04-14 11:11:30 +02:00
shreyas-goenka	4871f7bc8a	Add bundle destroy command (#300 ) Adds bundle destroy capability to bricks	2023-04-06 12:54:58 +02:00
shreyas-goenka	7427ceba6c	Fix output panic (#311 ) ## Changes Output now: ``` { "run_page_url": "https://e2-dogfood.staging.cloud.databricks.com/?o=6051921418418893#job/6199333392110/run/1088443776202122", "task_outputs": { "input": null, "process": { "logs": "[Row(max(id)=9)]\n", "logs_truncated": false } } } ```	2023-04-05 15:55:24 +02:00
shreyas-goenka	b4a30c641c	Add progress logging for pipeline runs (#283 ) Add progress logging for pipeline runs	2023-03-31 17:04:12 +02:00
shreyas-goenka	8fd3dccca9	Add progress logs for job runs (#276 )	2023-03-29 14:58:09 +02:00
shreyas-goenka	bfa20cdec9	Add json tags to output fields (#269 ) Output now: ``` { "run_page_url": "https://adb-309687753508875.15.azuredatabricks.net/?o=309687753508875#job/1077573342009637/run/19099317", "task_outputs": { "my_notebook_task": { "result": "computed results from notebook." } } } ```	2023-03-21 18:38:11 +01:00
shreyas-goenka	047a189c1e	Add job run output logging (#260 ) This PR adds output logging for job runs Tested using unit tests and manually	2023-03-21 16:25:18 +01:00
shreyas-goenka	4ac2e33def	Throw error when job run is skipped due to max_concurrent_runs (#257 ) Tested manually. Previously we did not get any errors or logs and failed silently in this case. Output now: ``` shreyas.goenka@THW32HFW6T job-output % bricks bundle run foo Error: run skipped: Skipping this run because the limit of 1 maximum concurrent runs has been reached. ```	2023-03-21 13:17:15 +01:00
Pieter Noordhuis	ad666ff796	Use new logger throughout codebase (#256 )	2023-03-17 15:17:31 +01:00
shreyas-goenka	207777849b	Log latest error event on pipeline run fail (#239 ) DAB config used to test this: bundle.yml ``` workspace: host: <deco-azure-prod> bundle: name: deco-538 resources: pipelines: foo: name: "[${bundle.name}] log pipeline errors" libraries: - notebook: path: ./myNb.py development: true ``` myNb.py ``` # Databricks notebook source print(1/0) ``` Before: ``` 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:28:44 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:28:46 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:28:52 [INFO] [pipelines.foo] Update has failed! Error: update failed ``` Now: ``` 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update available at * 2023/03/09 01:29:31 [INFO] [pipelines.foo] Update status: CREATED 2023/03/09 01:29:33 [INFO] [pipelines.foo] Update status: INITIALIZING 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update status: FAILED 2023/03/09 01:29:40 [INFO] [pipelines.foo] Update has failed! 2023/03/09 01:29:40 [ERROR] [pipelines.foo] Update 27bc77 is FAILED. trace for most recent exception: Failed to execute python command for notebook '/Users/shreyas.goenka@databricks.com/.bundle/deco-538/default/files/myNb' with id RunnableCommandId(9070319781942164851) and error AnsiResult(--------------------------------------------------------------------------- ZeroDivisionError Traceback (most recent call last) <command--1> in <cell line: 1>() ----> 1 print(1/0) ZeroDivisionError: division by zero,Map(),Map(),List(),List(),Map()) Error: update failed ```	2023-03-16 12:23:46 +01:00
shreyas-goenka	f93b541b63	Show detailed error logs for jobs (#209 ) PR for how to render errors on console for jobs. Here is the bundle used for the logs below: ``` bundle: name: deco-438 workspace: host: https://adb-309687753508875.15.azuredatabricks.net resources: jobs: foo: name: "[${bundle.name}][${bundle.environment}] a test notebook" tasks: - task_key: alpha existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] invalid notebook" - task_key: beta existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/does-not-exist" - task_key: gamma existing_cluster_id: 1109-115254-ox7poobk notebook_task: notebook_path: "/Users/shreyas.goenka@databricks.com/[deco-438] valid notebook" ``` And this is a screenshot of the logs from the console: <img width="1057" alt="Screenshot 2023-02-17 at 7 12 29 PM" src="https://user-images.githubusercontent.com/88374338/219744768-ab7f1e79-db8f-466a-ad6d-f2b6f85ed17c.png"> Here are the logs when only task gamma is executed (successfully): <img width="1059" alt="Screenshot 2023-02-17 at 7 13 04 PM" src="https://user-images.githubusercontent.com/88374338/219744992-011d8b91-ec1d-44f0-a849-83c81816dd9f.png"> TODO: Investigate more possible job errors, and make sure state for them is handled in a robust way here	2023-02-20 23:40:14 +01:00
Pieter Noordhuis	dd95668474	Complete positional argument to bundle run (#220 ) Command completion can be configured through `bricks completion`.	2023-02-20 21:55:06 +01:00
Pieter Noordhuis	3582037be6	Add nil check for retries.Info.Info (#166 )	2023-01-12 18:58:36 +01:00
Pieter Noordhuis	8f4461904b	Define flags for running jobs and pipelines (#146 )	2022-12-23 15:17:16 +01:00
Pieter Noordhuis	49aa858b89	Run command must always take a single argument (#156 )	2022-12-22 16:19:38 +01:00
Pieter Noordhuis	7f83463ca3	Bump SDK to latest (#151 )	2022-12-22 09:46:17 +01:00
Pieter Noordhuis	b111416fe5	Add `bricks bundle run` command (#134 )	2022-12-15 15:12:47 +01:00

44 Commits
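
The message for commit 3108883a8f (#1120) describes how positional arguments to `bundle run` are interpreted depending on the job's configuration. Those rules could be sketched roughly as below. This is an illustrative Python model, not the CLI's actual Go implementation: the function names, the dictionary shape used for a job, and the `job_parameters`, `jar_params`, and `spark_submit_params` result keys are assumptions made for this example.

```python
# Hypothetical sketch of the dispatch rules described in commit 3108883a8f.
# A "job" here is a plain dict; this shape is an assumption for illustration.

def parse_key_values(args):
    """Turn ["--k=v", ...] into {"k": "v", ...}."""
    out = {}
    for arg in args:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, value = arg[2:].split("=", 1)
        out[key] = value
    return out


def task_types(job):
    """Collect the set of task type keys (e.g. "notebook_task") used by a job."""
    types = set()
    for task in job.get("tasks", []):
        types.update(k for k in task if k.endswith("_task"))
    return types


def interpret_positional_args(job, args):
    """Map positional args to a parameter payload based on the job's config."""
    # Jobs that define job parameters take key/value pairs.
    if job.get("parameters"):
        return {"job_parameters": parse_key_values(args)}

    types = task_types(job)
    if not types:
        raise ValueError("job has no tasks")
    # Notebook-only jobs take key/value pairs as notebook parameters.
    if types == {"notebook_task"}:
        return {"notebook_params": parse_key_values(args)}
    # Python file / wheel jobs receive the arguments verbatim as a list.
    if types <= {"spark_python_task", "python_wheel_task"}:
        return {"python_params": list(args)}
    # The commit says "the same is applied" to JAR and spark-submit jobs;
    # the key names used here are assumed.
    if types == {"spark_jar_task"}:
        return {"jar_params": list(args)}
    if types == {"spark_submit_task"}:
        return {"spark_submit_params": list(args)}
    raise ValueError("cannot map positional arguments for mixed task types")


notebook_job = {"tasks": [{"task_key": "a", "notebook_task": {}}]}
print(interpret_positional_args(notebook_job, ["--param1=value1", "--param2=value2"]))
```

Raising an error for mixed task types is a design choice of this sketch; the commit message does not say how the CLI handles that case.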