databricks-cli

Commit Graph

Author	SHA1	Message	Date
Andrew Nester	48ff18e5fc	Upload local libraries even if they don't have artifact defined (#1664 ) ## Changes Previously, for every library referenced in the configuration, DABs made sure that there was a corresponding artifact section. This is not really necessary and is inflexible, because local libraries might be built outside of the DABs context. It also led to hard-to-follow logic in the code, where we back-referenced libraries to artifacts. This PR does 3 things: 1. Allows all local libraries referenced in DABs config to be uploaded to remote 2. Simplifies upload and glob reference expansion logic by doing this in a single place 3. Speeds things up by uploading each library only once and doing this in parallel ## Tests Added unit + integration tests + made sure that change is backward compatible (no changes in existing tests) --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-08-14 09:03:44 +00:00
Andrew Nester	39fc86e83b	Split artifact cleanup into prepare step before build (#1618 ) ## Changes The prepare stage, which does cleanup, is now executed once before every build, so artifacts built into the same folder are correctly kept. Fixes workaround 2 from this issue #1602 ## Tests Added unit test	2024-07-24 09:13:49 +00:00
Andrew Nester	434bcbb018	Allow artifacts (JARs, wheels) to be uploaded to UC Volumes (#1591 ) ## Changes This change allows specifying a UC Volumes path as the artifact path, so that all artifacts (JARs, wheels) are uploaded to UC Volumes. Example configuration is here: ``` bundle: name: jar-bundle workspace: host: https://foo.com artifact_path: /Volumes/main/default/foobar artifacts: my_java_code: path: ./sample-java build: "javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class" files: - source: ./sample-java/PrintArgs.jar resources: jobs: jar_job: name: "Test Spark Jar Job" tasks: - task_key: TestSparkJarTask new_cluster: num_workers: 1 spark_version: "14.3.x-scala2.12" node_type_id: "i3.xlarge" spark_jar_task: main_class_name: PrintArgs libraries: - jar: ./sample-java/PrintArgs.jar ``` ## Tests Manually + added E2E test for Java jobs. The E2E test is temporarily skipped until auth-related issues for UC in tests are resolved	2024-07-16 08:57:04 +00:00
Andrew Nester	3d8446bbdb	Rewrite local path for libraries in foreach tasks (#1569 ) ## Changes Local library paths in the `libraries` section of foreach tasks are now correctly replaced with the remote path of the library once it is uploaded to Databricks ## Tests Added unit test	2024-07-05 10:58:28 +00:00
Andrew Nester	3f8036f2df	Fixed seg fault when specifying environment key for tasks (#1443 ) ## Changes Fixed seg fault when specifying environment key for tasks	2024-05-21 10:00:04 +00:00
Andrew Nester	1872aa12b3	Added support for job environments (#1379 ) ## Changes The main changes are: 1. Don't link artifacts to libraries anymore and instead just iterate over all jobs and tasks when uploading artifacts and update local path to remote 2. Iterating over `jobs.environments` to check if there are any local libraries and checking that they exist locally 3. Added tests to check environments are handled correctly End-to-end test will follow up ## Tests Added regression test, existing tests (including integration one) pass	2024-04-22 11:44:34 +00:00
Pieter Noordhuis	ed194668db	Return `diag.Diagnostics` from mutators (#1305 ) ## Changes This diagnostics type allows us to capture multiple warnings as well as errors in the return value. This is a preparation for returning additional warnings from mutators in case we detect non-fatal problems. * All return statements that previously returned an error now return `diag.FromErr` * All return statements that previously returned `fmt.Errorf` now return `diag.Errorf` * All `err != nil` checks now use `diags.HasError()` or `diags.Error()` ## Tests * Existing tests pass. * I confirmed no call site under `./bundle` or `./cmd/bundle` uses `errors.Is` on the return value from mutators. This is relevant because we cannot wrap errors with `%w` when calling `diag.Errorf` (like `fmt.Errorf`; context in https://github.com/golang/go/issues/47641).	2024-03-25 14:18:47 +00:00
Andrew Nester	ecf9c52f61	Support relative paths in artifact files source section and always upload all artifact files (#1247 ) Support relative paths in artifact files source section and always upload all artifact files Fixes #1156 ## Tests Added unit tests	2024-03-04 20:28:15 +00:00
Pieter Noordhuis	33c446dadd	Refactor library to artifact matching to not use pointers (#1172 ) ## Changes The approach to do this was: 1. Iterate over all libraries in all job tasks 2. Find references to local libraries 3. Store pointer to `compute.Library` in the matching artifact file to signal it should be uploaded This breaks down when introducing #1098 because we can no longer track unexported state across mutators. The approach in this PR performs the path matching twice; once in the matching mutator where we check if each referenced file has an artifacts section, and once during artifact upload to rewrite the library path from a local file reference to an absolute Databricks path. ## Tests Integration tests pass.	2024-02-05 15:29:45 +00:00
Lennart Kats (databricks)	875c9d2db1	Tune output of bundle deploy command (#1047 ) ## Changes Update the output of the `deploy` command to be more concise and consistent: ``` $ databricks bundle deploy Building my_project... Uploading my_project-0.0.1+20231207.205106-py3-none-any.whl... Uploading bundle files to /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files... Deploying resources... Updating deployment state... Deployment complete! ``` This does away with the intermediate success messages, makes consistent use of `...`, and only prints the success message at the very end after everything is completed. Below is the original output for comparison: ``` $ databricks bundle deploy Detecting Python wheel project... Found Python wheel project at /tmp/output/my_project Building my_project... Build succeeded Uploading my_project-0.0.1+20231207.205134-py3-none-any.whl... Upload succeeded Starting upload of bundle files Uploaded bundle files at /Users/lennart.kats@databricks.com/.bundle/my_project/dev/files! Starting resource deployment Resource deployment completed! ```	2023-12-21 08:00:37 +00:00
Andrew Nester	5431174302	Do not add wheel content hash in uploaded Python wheel path (#1015 ) ## Changes Removed hash from the upload path since it's not useful anyway. The main reason for that change was to make it work on all-purpose clusters. But in order to make it work, the wheel version needs to be increased anyway. So having only the hash in the path is useless. Note: using the --build-number (build tag) flag does not help with re-installing libraries on all-purpose clusters. The reason is that `pip` ignores the build tag when upgrading the library and only looks at the wheel version. The build tag is only used for sorting the versions, and the one with the higher build tag takes priority when installed. It only works if no library is installed. See `a15dd75d98/src/pip/_internal/index/package_finder.py (L522-L556)` https://github.com/pypa/pip/issues/4781 Thus, the only way to reinstall the library on an all-purpose cluster is to increase the wheel version manually or use automatic version generation, e.g. ``` setup( version=datetime.datetime.utcnow().strftime("%Y%m%d.%H%M%S"), ... ) ``` ## Tests Integration tests passed.	2023-11-29 10:40:12 +00:00
shreyas-goenka	0c837e5772	Make `file_path` and `artifact_path` fields consistent with json tag (#987 ) ## Changes This PR: 1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` -> `ArtifactPath` in the bundle and metadata configuration to make them consistent with the json tags. 2. Fixes development / production mode error messages to point to `file_path` and `artifact_path` ## Tests Existing unit tests. This is a straightforward renaming of the fields.	2023-11-15 13:37:26 +00:00
Andrew Nester	5273d0c51a	Support Python wheels larger than 10MB (#879 ) ## Changes Previously we only supported uploading Python wheels smaller than 10MB due to using the Workspace.Import API and its `content` field https://docs.databricks.com/api/workspace/workspace/import By switching to `WorkspaceFilesClient` we overcome the limit because it uses the POST body for the API instead. ## Tests `TestAccUploadArtifactFileToCorrectRemotePath` integration test passes ``` === RUN TestAccUploadArtifactFileToCorrectRemotePath artifacts_test.go:28: gcp 2023/10/17 15:24:04 INFO Using Google Credentials sdk=true helpers.go:356: Creating /Users/.../integration-test-wsfs-ekggbkcfdkid artifacts.Upload(test.whl): Uploading... 2023/10/17 15:24:06 INFO Using Google Credentials mutator=artifacts.Upload(test) sdk=true artifacts.Upload(test.whl): Upload succeeded helpers.go:362: Removing /Users/.../integration-test-wsfs-ekggbkcfdkid --- PASS: TestAccUploadArtifactFileToCorrectRemotePath (5.66s) PASS coverage: 14.9% of statements in ./... ok github.com/databricks/cli/internal 6.109s coverage: 14.9% of statements in ./... ```	2023-10-18 10:20:43 +00:00
Andrew Nester	86c30dd328	Fixed artifact file uploading on Windows and wheel execution on DBR 13.3 (#722 ) ## Changes Fixed artifact file uploading on Windows and wheel execution on DBR 13.3 Fixes #719, #720 ## Tests Added regression test for Windows	2023-08-31 14:10:32 +00:00
Andrew Nester	9a88fa602d	Added support for artifacts building for bundles (#583 ) ## Changes Added support for artifacts building for bundles. It is now possible to specify an `artifacts` block in bundle.yml and define a resource (at the moment Python wheel) to be built and uploaded during `bundle deploy`. The built artifact will be automatically attached to the corresponding job task or pipeline where it's used as a library. Follow-ups: 1. If artifact is used in job or pipeline, but not found in the config, try to infer and build it anyway 2. If build command is not provided for Python wheel artifact, infer it	2023-07-25 13:35:08 +02:00

15 Commits
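
Commit 5431174302 above notes that reinstalling a wheel on an all-purpose cluster requires a higher wheel version, and suggests generating the version from a timestamp. A minimal Python sketch of that idea follows. The helper names `timestamp_version` and `release_tuple` are hypothetical and not part of the CLI, and the sketch assumes a timezone-aware clock in place of `datetime.utcnow()`, which is deprecated in recent Python releases.

```python
from datetime import datetime, timezone
from typing import Optional


def timestamp_version(now: Optional[datetime] = None) -> str:
    """Return a version such as '20231129.104012' derived from a UTC timestamp.

    Hypothetical helper for illustration; not part of the Databricks CLI.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Same format string as in the commit message: date, dot, time of day.
    return now.astimezone(timezone.utc).strftime("%Y%m%d.%H%M%S")


def release_tuple(version: str) -> tuple:
    """Split a dotted numeric version into integers so segments compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Two builds one second apart: the later build gets the higher version.
older = timestamp_version(datetime(2023, 11, 29, 10, 40, 12, tzinfo=timezone.utc))
newer = timestamp_version(datetime(2023, 11, 29, 10, 40, 13, tzinfo=timezone.utc))
print(older, newer)
```

In a `setup.py`, the result could be passed as `version=timestamp_version()`, which gives each build a strictly higher version as long as builds are at least one second apart.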