Commit Graph

129 Commits

Author SHA1 Message Date
Andrew Nester 70fe0e36ef
Added `databricks bundle generate job` command (#1043)
## Changes
Now it's possible to generate bundle configuration for existing job.
For now it only supports jobs with notebook tasks.

It will download notebooks referenced in the job tasks and generate
bundle YAML config for this job which can be included in larger bundle.

## Tests
Running command manually

Example of generated config
```
resources:
  jobs:
    job_128737545467921:
      name: Notebook job
      format: MULTI_TASK
      tasks:
        - task_key: as_notebook
          existing_cluster_id: 0704-xxxxxx-yyyyyyy
          notebook_task:
            base_parameters:
              bundle_root: /Users/andrew.nester@databricks.com/.bundle/job_with_module_imports/development/files
            notebook_path: ./entry_notebook.py
            source: WORKSPACE
          run_if: ALL_SUCCESS
      max_concurrent_runs: 1
 ```

## Tests
Manual (on our last 100 jobs) + added end-to-end test

```
--- PASS: TestAccGenerateFromExistingJobAndDeploy (50.91s)
PASS
coverage: 61.5% of statements in ./...
ok github.com/databricks/cli/internal/bundle 51.209s coverage: 61.5% of
statements in ./...
```
2024-01-17 14:26:33 +00:00
Andrew Nester 4b01fff03d
Fixed instance pool resolving by name (#1102)
## Changes
Fixed instance pool resolving by name

## Tests
Added regression test
2024-01-05 10:50:53 +00:00
Andrew Nester 5fb40f9d07
Allow referencing bundle resources by name (#872)
## Changes
Now we can define variables with values which reference different
Databricks resources by name.
When references like this, DABs automatically looks up the resource by
this name and replaces the reference with ID of the resource referenced.
Thus when the variable is used in the configuration it will contain the
correct resolved ID of resource.

The resolvers are code generated and thus DABs support referencing all
resources which has `GetByName`-like methods in Go SDK.

### Example

```
variables:
  my_cluster_id:
    description: An existing cluster.
    lookup: 
      cluster: "12.2 shared"

resources:
  jobs:
    my_job:
      name: "My Job"
      tasks:
        - task_key: TestTask
          existing_cluster_id: ${var.my_cluster_id}

targets:
  dev:
    variables:
      my_cluster_id:
        lookup: 
           cluster: "dev-cluster"
```

## Tests
Added unit test + manual testing

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-01-04 21:04:42 +00:00
Lennart Kats (databricks) 167deec8c3
Change recommended production deployment path from /Shared to /Users (#1091)
## Changes

This PR changes the default and `mode: production` recommendation to
target `/Users` for deployment. Previously, we used `/Shared`, but
because of a lack of POSIX-like permissions in WorkspaceFS this meant
that files inside would be readable and writable by other users in the
workspace.

Detailed change:
* `default-python` no longer uses a path that starts with `/Shared`
* `mode: production` no longer requires a path that starts with
`/Shared`
 
## Related PRs

Docs: https://github.com/databricks/docs/pull/14585
Examples: https://github.com/databricks/bundle-examples/pull/17

## Tests

* Manual tests
* Template unit tests (with an extra check to avoid /Shared)
2024-01-02 19:58:24 +00:00
Andrew Nester ac37a592f1
Added exec.NewCommandExecutor to execute commands with correct interpreter (#1075)
## Changes
Instead of handling command chaining ourselves, we execute passed
commands as-is by storing them, in temp file and passing to correct
interpreter (bash or cmd) based on OS.

Fixes #1065 

## Tests
Added unit tests
2023-12-21 15:45:23 +00:00
shreyas-goenka 6002f49c87
Move bundle schema update to an internal module (#1012)
## Changes

This PR:
1. Move code to load bundle JSON Schema descriptions from the OpenAPI
spec to an internal Go module
2. Remove command line flags from the `bundle schema` command. These
flags were meant for internal processes and at no point were meant for
customer use.
3. Regenerate `bundle_descriptions.json`
4. Add support for `bundle: "deprecated"`. The `environments` field is
tagged as deprecated in this PR and consequently will no longer be a
part of the bundle schema.

## Tests
Tested by regenerating the CLI against its current OpenAPI spec (as
defined in `__openapi_sha`). The `bundle_descriptions.json` in this PR
was generated from the code generator.

Manually checked that the autocompletion / descriptions from the new
bundle schema are correct.
2023-12-06 10:45:18 +00:00
shreyas-goenka 677926b78b
Fix panic when bundle auth resolution fails (#1002)
## Changes
CLI would panic if an invalid bundle auth is setup when running CLI
commands. This PR removes the panic and shows the error message directly
instead.

## Tests
The CWD is a bundle with:
```
workspace:
  profile: DEFAULT
```

Before:
```
shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list
panic: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT

goroutine 1 [running]:
```

After:
```
shreyas.goenka@THW32HFW6T bundle-playground % cli clusters list
Error: cannot resolve bundle auth configuration: resolve: /Users/shreyas.goenka/.databrickscfg has no DEFAULT profile configured. Config: profile=DEFAULT
```

```
shreyas.goenka@THW32HFW6T bundle-playground % DATABRICKS_CONFIG_FILE=/dev/null cli bundle deploy
Error:  cannot resolve bundle auth configuration: resolve: /dev/null has no DEFAULT profile configured. Config: profile=DEFAULT, config_file=/dev/null. Env: DATABRICKS_CONFIG_FILE
```
2023-11-30 14:28:01 +00:00
Andrew Nester 4d8d825746
Fixed panic when job has trigger and in development mode (#1026)
## Changes
Fixed panic when job has trigger and in development mode
2023-11-29 16:32:42 +00:00
Andrew Nester 833746cbdd
Do not replace pipeline libraries if there are no matches for pattern (#1021)
## Changes
If there are no matches when doing Glob call for pipeline library
defined, leave the entry as is.
The next mutators in the chain will detect that file is missing and the
error will be more user friendly.


Before the change

```
Starting resource deployment
Error: terraform apply: exit status 1

Error: cannot create pipeline: libraries must contain at least one element
```

After

```
Error: notebook ./non-existent not found
```


## Tests
Added regression unit tests
2023-11-29 13:20:13 +00:00
Andrew Nester fa89db57e9
Enable `spark_jar_task` with local JAR libraries (#993)
## Changes
Previously local JAR paths were transformed to remote path during
initialisation and thus artifact building logic did not recognise such
libraries as local to be handled and uploaded.

Now it's possible to use spark_jar_tasks with local JAR libraries on
14.1+ DBR clusters

Example configuration
```
bundle:
  name: spark-jar

workspace:
  host: ***

artifacts:
  my_java_code:
    path: ./sample-java
    build: "javac PrintArgs.java && jar cvfm PrintArgs.jar META-INF/MANIFEST.MF PrintArgs.class"
    files:
      - source: "/Users/andrew.nester/dabs/wheel/sample-java/PrintArgs.jar"

resources:
  jobs:
    print_args:
      name: "Print Args"
      tasks:
        - task_key: Print
          new_cluster:
            num_workers: 0
            spark_version: 14.2.x-scala2.12
            node_type_id: i3.xlarge
            spark_conf:
              "spark.databricks.cluster.profile": "singleNode"
              "spark.master": "local[*]"
            custom_tags:
              ResourceClass: "SingleNode"
          spark_jar_task:
            main_class_name: PrintArgs
          libraries:
            - jar: ./sample-java/PrintArgs.jar
```

## Tests
Manually running `bundle deploy and bundle run`
2023-11-21 10:15:09 +00:00
Pieter Noordhuis 489d6fa1b8
Replace direct calls with `bundle.Apply` (#990)
## Changes

Some test call sites called directly into the mutator's `Apply` function
instead of `bundle.Apply`. Calling into `bundle.Apply` is preferred
because that's where we can run pre/post logic common across all
mutators.

## Tests

Pass.
2023-11-15 14:19:18 +00:00
Pieter Noordhuis d80c35f66a
Rename variable `bundle -> b` (#989)
## Changes

All calls to apply a mutator must go through `bundle.Apply`. This
conflicts with the existing use of the variable `bundle`. This change
un-aliases the variable from the package name by renaming all variables
to `b`.

## Tests

Pass.
2023-11-15 14:03:36 +00:00
shreyas-goenka 0c837e5772
Make `file_path` and `artifact_path` fields consistent with json tag (#987)
## Changes
This PR:
1. Renames `FilesPath` -> `FilePath` and `ArtifactsPath` ->
`ArtifactPath` in the bundle and metadata configuration to make them
consistant with the json tags.
2. Fixes development / production mode error messages to point to
`file_path` and `artifact_path`

## Tests
Existing unit tests. This is a strightforward renaming of the fields.
2023-11-15 13:37:26 +00:00
Lennart Kats (databricks) 0ab125c109
Allow jobs to be manually unpaused in development mode (#885)
Partly mitigates #859. It's still not clear to me if there is an actual
use case or if users are trying to use "development" mode jobs for
production, but making this overridable is reasonable.

Beyond this fix I think we could do something in the Jobs schedule UI,
but it would help to better understand the use case (or actual reason of
confusion). I expect we should hint customers to move away from dev mode
rather than unpause.
2023-11-13 19:50:39 +00:00
Andrew Nester f3db42e622
Added support for top-level permissions (#928)
## Changes
Now it's possible to define top level `permissions` section in bundle
configuration and permissions defined there will be applied to all
resources defined in the bundle.

Supported top-level permission levels: CAN_MANAGE, CAN_VIEW, CAN_RUN.

Permissions are applied to: Jobs, DLT Pipelines, ML Models, ML
Experiments and Model Service Endpoints

```
bundle:
  name: permissions

workspace:
  host: ***

permissions:
  - level: CAN_VIEW
    group_name: test-group
  - level: CAN_MANAGE
    user_name: user@company.com
  - level: CAN_RUN
    service_principal_name: 123456-abcdef
```

## Tests
Added corresponding unit tests + ran `bundle validate` and `bundle
deploy` manually
2023-11-13 11:29:40 +00:00
Pieter Noordhuis 7847388f95
Initialize variable definitions that are defined without properties (#966)
## Changes

We can debate whether or not variable definitions without properties are
valid, but in no case should this panic the CLI.

Fixes #934.

## Tests

Unit.
2023-11-08 11:01:14 +00:00
Michał Szafrański 10291b0e13
Bundle path rewrites for dbt and SQL file tasks (#962)
## Changes
Support path rewrites for Dbt and SQL file job taks.
<!-- Summary of your changes that are easy to understand -->

## Tests
* Added unit test
<!-- How is this tested? -->
2023-11-07 20:00:09 +00:00
shreyas-goenka 5a8cd0c5bc
Persist deployment metadata in WSFS (#845)
## Changes

This PR introduces a metadata struct that stores a subset of bundle
configuration that we wish to expose to other Databricks services that
wish to integrate with bundles.

This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.

Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note, the deployment might not exactly match
this commit version if there are changes that have not been committed to
git at deploy time,
* `file_path`: Path in workspace where we sync bundle files to. 
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata when the git root is one level above the bundle repo:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```


This unblocks integration to the jobs break glass UI for bundles.

## Tests
Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
Andrew Nester 6f22ae8696
Use UserName instead of Id to check if identity used is a service principal (#924)
## Changes
Use UserName instead of Id to check if identity used is a service
principal
2023-10-26 14:58:16 +00:00
Pieter Noordhuis 6e21ced54a
Consolidate bundle configuration loader function (#918)
## Changes

There were two functions related to loading a bundle configuration file;
one as a package function and one as a member function on the
configuration type. Loading the same configuration object twice doesn't
make sense and therefore we can consolidate to only using the package
function.

The package function would scan for known file names if the specified
path was a directory. This functionality was not in use because the
top-level bundle loader figures out the filename itself as of #580.

## Tests

Pass.
2023-10-25 12:55:56 +00:00
Pieter Noordhuis 486bf59627
Move bundle configuration filename code (#917)
## Changes

This is unrelated to the config root so belongs in a separate file (this
was added in #580).

## Tests

n/a
2023-10-25 09:54:39 +00:00
Pieter Noordhuis d4be40520c
Resolve configuration before performing verification (#890)
## Changes

If a bundle configuration specifies a workspace host, and the user
specifies a profile to use, we perform a check to confirm that the
workspace host in the bundle configuration and the workspace host from
the profile are identical. If they are not, we return an error. The
check was introduced in #571.

Previously, the code included an assumption that the client
configuration was already loaded from the environment prior to
performing the check. This was not the case, and as such if the user
intended to use a non-default path to `.databrickscfg`, this path was
not used when performing the check.

The fix does the following:
* Resolve the configuration prior to performing the check.
* Don't treat the configuration file not existing as an error.
* Add unit tests.

Fixes #884.

## Tests

Unit tests and manual confirmation.
2023-10-20 13:10:31 +00:00
Arpit Jasapara 24cc67563e
Support Unity Catalog Registered Models in bundles (#846)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add UC Registered Models support to Databricks Asset Bundles as new
resource `registered_model`. Also added UC Permission support via new
resource `grant`.

## Tests
<!-- How is this tested? -->
Tested via unit tests and manual testing with [example
PR](https://github.com/databricks/bundle-examples-internal/pull/80) and
[custom Terraform
provider](https://github.com/databricks/terraform-provider-databricks/pull/2771).
<img width="698" alt="Screenshot 2023-10-08 at 4 57 23 PM"
src="https://github.com/databricks/cli/assets/87999496/bcf605a9-7894-443b-865a-f7e240037815">
<img width="1109" alt="Screenshot 2023-10-08 at 4 56 47 PM"
src="https://github.com/databricks/cli/assets/87999496/e4d6e424-cd70-4809-8843-6939ed2e172f">
<img width="1091" alt="Screenshot 2023-10-08 at 4 56 57 PM"
src="https://github.com/databricks/cli/assets/87999496/88ebaabb-67db-4a11-88a5-df087e2e41c0">

---------

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
Co-authored-by: Andrew Nester <andrew.nester.dev@gmail.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-10-16 15:32:49 +00:00
Andrew Nester 30c4d2e8a7
Fixed merging task libraries from targets (#868)
## Changes
Previous we (erroneously) kept the reference and merged into the
original tasks and not the copies which we later used to replace
existing tasks. Thus the merging of slices and references was incorrect.

Fixes #864 

## Tests
Added a regression test
2023-10-16 08:48:32 +00:00
hectorcast-db 36f30c8b47
Update Go SDK to 0.23.0 and use custom marshaller (#772)
## Changes
Update Go SDK to 0.23.0 and use custom marshaller.
## Tests
* Run unit tests

* Run nightly

* Manual test:
```
./cli jobs create --json @myjob.json
```
with 
```
{
    "name": "my-job-marshal-test-go",
    "tasks": [{
        "task_key": "testgomarshaltask",
        "new_cluster": {
            "num_workers": 0,
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "Standard_DS3_v2"
        },
        "libraries": [
            {
                "jar": "dbfs:/max/jars/exampleJarTask.jar"
            }
        ],
        "spark_jar_task": {
            "main_class_name":  "com.databricks.quickstart.exampleTask"
        }
    }]
}
```
Main branch:
```
Error: Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size
```
This branch:
```
{
  "job_id":<jobid>
}
```

---------

Co-authored-by: Miles Yucht <miles@databricks.com>
2023-10-16 06:56:06 +00:00
Andrew Nester 943ea89728
Allow target overrides for sync section (#856)
## Changes
Allow target overrides for sync section

## Tests
Added tests
2023-10-10 15:18:18 +00:00
Andrew Nester 8d8de3f509
Fixed using repo files as pipeline libraries (#847)
## Changes
Fixed using repo files as pipeline libraries

## Tests
Added regression test
2023-10-09 10:10:28 +00:00
Andrew Nester aa54a8665a
Added support for glob patterns in pipeline libraries section (#833)
## Changes
Now it's possible to specify glob pattern in pipeline libraries section
and DAB will add all matched files as libraries

```
  pipelines:
    dummy:
      name: " DLT with Python files"
      target: "dlt_python_files"
      libraries:
        - file:
            path: ./*.py
```

## Tests
Added unit test
2023-10-04 13:23:13 +00:00
Andrew Nester 9b6a847178
Mark artifacts properties as optional (#834)
## Changes
Mark artifacts properties as optional

Fixes #816
2023-10-03 13:59:28 +00:00
Pieter Noordhuis f1b068cefe
Use normalized short name for tag value in development mode (#821)
## Changes

The jobs backend propagates job tags to the underlying cloud provider's
resources. As such, they need to match the constraints a cloud provider
places on tag values. The display name can contain anything. With this
change, we modify the tag value to equal the short name as used in the
name prefix.

Additionally, we leverage tag normalization as introduced in #819 to
make sure characters that aren't accepted are removed before using the
value as a tag value.

This is a new stab at #810 and should completely eliminate this class of
problems.

## Tests

Tests pass.
2023-10-02 06:58:51 +00:00
Pieter Noordhuis 30b4b8ce58
Allow digits in the generated short name (#820)
## Changes

Digits were previously replaced by `_`.

## Tests

Additional test cases with uncommon variations of email addresses.
2023-09-29 06:58:40 +00:00
Serge Smertin 7171874db0
Added `process.Background()` and `process.Forwarded()` (#804)
## Changes
This PR adds higher-level wrappers for calling subprocesses. One of the
steps to get https://github.com/databricks/cli/pull/637 in, as
previously discussed.

The reason to add `process.Forwarded()` is to proxy Python's `input()`
calls from a child process seamlessly. Another use-case is plugging in
`less` as a pager for the list results.

## Tests
`make test`
2023-09-27 09:04:44 +00:00
Andrew Nester 0daa0022af
Make a notebook wrapper for Python wheel tasks optional (#797)
## Changes
Instead of always using notebook wrapper for Python wheel tasks, let's
make this an opt-in option.

Now by default Python wheel tasks will be deployed as is to Databricks
platform.
If notebook wrapper required (DBR < 13.1 or other configuration
differences), users can provide a following experimental setting

```
experimental:
  python_wheel_wrapper: true
```

Fixes #783,
https://github.com/databricks/databricks-asset-bundles-dais2023/issues/8

## Tests
Added unit tests.

Integration tests passed for both cases

```
    helpers.go:163: [databricks stdout]: Hello from my func
    helpers.go:163: [databricks stdout]: Got arguments:
    helpers.go:163: [databricks stdout]: ['my_test_code', 'one', 'two']
    ...
Bundle remote directory is ***/.bundle/ac05d5e8-ed4b-4e34-b3f2-afa73f62b021
Deleted snapshot file at /var/folders/nt/xjv68qzs45319w4k36dhpylc0000gp/T/TestAccPythonWheelTaskDeployAndRunWithWrapper3733431114/001/.databricks/bundle/default/sync-snapshots/cac1e02f3941a97b.json
Successfully deleted files!
--- PASS: TestAccPythonWheelTaskDeployAndRunWithWrapper (214.18s)
PASS
coverage: 93.5% of statements in ./...
ok      github.com/databricks/cli/internal/bundle       214.495s        coverage: 93.5% of statements in ./...

```

```
    helpers.go:163: [databricks stdout]: Hello from my func
    helpers.go:163: [databricks stdout]: Got arguments:
    helpers.go:163: [databricks stdout]: ['my_test_code', 'one', 'two']
    ...
Bundle remote directory is ***/.bundle/0ef67aaf-5960-4049-bf1d-dc9e29157421
Deleted snapshot file at /var/folders/nt/xjv68qzs45319w4k36dhpylc0000gp/T/TestAccPythonWheelTaskDeployAndRunWithoutWrapper2340216760/001/.databricks/bundle/default/sync-snapshots/edf0b322cee93b13.json
Successfully deleted files!
--- PASS: TestAccPythonWheelTaskDeployAndRunWithoutWrapper (192.36s)
PASS
coverage: 93.5% of statements in ./...
ok      github.com/databricks/cli/internal/bundle       195.130s        coverage: 93.5% of statements in ./...

```
2023-09-26 14:32:20 +00:00
Pieter Noordhuis ee30277119
Enable target overrides for pipeline clusters (#792)
## Changes

This is a follow-up to #658 and #779 for jobs.

This change applies label normalization the same way the backend does.

## Tests

Unit and config loading tests.
2023-09-21 19:21:20 +00:00
Andrew Nester 43e2eefc27
Enable environment overrides for job tasks (#779)
## Changes
Follow up for https://github.com/databricks/cli/pull/658

When a job definition has multiple job tasks using the same key, it's
considered invalid. Instead we should combine those definitions with the
same key into one. This is consistent with environment overrides. This
way, the override ends up in the original job tasks, and we've got a
clear way to put them all together.

## Tests
Added unit tests
2023-09-18 14:13:50 +00:00
Andrew Nester 953dcb4972
Added support for experimental scripts section (#632)
## Changes
Added support for experimental scripts section

It allows execution of arbitrary bash commands during certain bundle
lifecycle steps.

## Tests
Example of configuration

```yaml
bundle:
  name: wheel-task


workspace:
  host: ***

experimental:
  scripts:
    prebuild: |
      echo 'Prebuild 1'
      echo 'Prebuild 2'
    postbuild: "echo 'Postbuild 1' && echo 'Postbuild 2'" 
    predeploy: |
      echo 'Checking go version...'
      go version
    postdeploy: |
      echo 'Checking python version...'
      python --version

resources:
  jobs:
    test_job:
      name: "[${bundle.environment}] My Wheel Job"
      tasks:
        - task_key: TestTask
          existing_cluster_id: "***"
          python_wheel_task:
            package_name: "my_test_code"
            entry_point: "run"
          libraries:
          - whl: ./dist/*.whl
```

Output
```bash
andrew.nester@HFW9Y94129 wheel % databricks bundle deploy
artifacts.whl.AutoDetect: Detecting Python wheel project...
artifacts.whl.AutoDetect: Found Python wheel project at /Users/andrew.nester/dabs/wheel
'Prebuild 1'
'Prebuild 2'

artifacts.whl.Build(my_test_code): Building...
artifacts.whl.Build(my_test_code): Build succeeded
'Postbuild 1'
'Postbuild 2'

'Checking go version...'
go version go1.19.9 darwin/arm64

Starting upload of bundle files
Uploaded bundle files at /Users/andrew.nester@databricks.com/.bundle/wheel-task/default/files!

artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Uploading...
artifacts.Upload(my_test_code-0.0.0a0-py3-none-any.whl): Upload succeeded
Starting resource deployment
Resource deployment completed!
'Checking python version...'
Python 2.7.18
```
2023-09-14 10:14:13 +00:00
shreyas-goenka 373f441eb2
Use clearer error message when no interpolation value is found. (#764)
## Changes
This PR makes the error message clearer for when interpolation fails. 

## Tests
Existing unit test and manually
2023-09-11 15:23:25 +00:00
Pieter Noordhuis 4ccc70aeac
Consolidate environment variable interaction (#747)
## Changes

There are a couple places throughout the code base where interaction
with environment variables takes place. Moreover, more than one of these
would try to read a value from more than one environment variable as
fallback (for backwards compatibility). This change consolidates those
accesses.

The majority of diffs in this change are mechanical (i.e. add an
argument or replace a call).

This change:
* Moves common environment variable lookups for bundles to
`bundles/env`.
* Adds a `libs/env` package that wraps `os.LookupEnv` and `os.Getenv`
and allows for overrides to take place in a `context.Context`. By
scoping overrides to a `context.Context` we can avoid `t.Setenv` in
testing and unlock parallel test execution for integration tests.
* Updates call sites to pass through a `context.Context` where needed.
* For bundles, introduces `DATABRICKS_BUNDLE_ROOT` as new primary
variable instead of `BUNDLE_ROOT`. This was the last environment
variable that did not use the `DATABRICKS_` prefix.

## Tests

Unit tests pass.
2023-09-11 08:18:43 +00:00
shreyas-goenka 9a51f72f0b
Make bundle and sync fields optional (#757)
## Changes
This PR:
1. Makes the bundle and sync properties optional in the generated
schema.
2. Fixes schema generation that was broken due to a rogue "description"
field in the bundle docs.

## Tests
Tested manually. The generated schema no longer has "bundle" and "sync"
marked as required.
2023-09-11 08:16:22 +00:00
Andrew Nester b5d033d154
List available targets when incorrect target passed (#756)
## Changes
List available targets when incorrect target passed

## Tests
```
andrew.nester@HFW9Y94129 wheel % databricks bundle validate -t incorrect
Error: incorrect: no such target. Available targets: prod, development
```
2023-09-08 15:37:55 +00:00
Andrew Nester 18a5b05d82
Apply Python wheel trampoline if workspace library is used (#755)
## Changes
Workspace library will be detected by trampoline in 2 cases:
- User defined to use local wheel file
- User defined to use remote wheel file from Workspace file system

In both of these cases we should correctly apply Python trampoline

## Tests
Added a regression test (also covered by Python e2e test)
2023-09-08 13:45:21 +00:00
Andrew Nester e64463ba47
Fixed marking libraries from DBFS as remote (#750)
## Changes
Fixed marking libraries from DBFS as remote

## Tests
Updated unit tests to catch the regression
2023-09-08 09:53:57 +00:00
Arpit Jasapara 50eaf16307
Support Model Serving Endpoints in bundles (#682)
## Changes
<!-- Summary of your changes that are easy to understand -->
Add Model Serving Endpoints to Databricks Bundles

## Tests
<!-- How is this tested? -->
Unit tests and manual testing via
https://github.com/databricks/bundle-examples-internal/pull/76
<img width="1570" alt="Screenshot 2023-08-28 at 7 46 23 PM"
src="https://github.com/databricks/cli/assets/87999496/7030ebd8-b0e2-4ad1-a9e3-5ff8454f1175">
<img width="747" alt="Screenshot 2023-08-28 at 7 47 01 PM"
src="https://github.com/databricks/cli/assets/87999496/fb9b54d7-54e2-43ce-9148-68fb620c809a">

Signed-off-by: Arpit Jasapara <arpit.jasapara@databricks.com>
2023-09-07 21:54:31 +00:00
Lennart Kats (databricks) f9e521b43e
databricks bundle init template v2: optional stubs, DLT support (#700)
## Changes

This follows up on https://github.com/databricks/cli/pull/686. This PR
makes our stubs optional + it adds DLT stubs:

```
$ databricks bundle init
Template to use [default-python]: default-python
Unique name for this project [my_project]: my_project
Include a stub (sample) notebook in 'my_project/src' [yes]: yes
Include a stub (sample) DLT pipeline in 'my_project/src' [yes]: yes
Include a stub (sample) Python package 'my_project/src' [yes]: yes
 Successfully initialized template
```

## Tests
Manual testing, matrix tests.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-06 09:52:31 +00:00
Lennart Kats (databricks) 947d5b1e5c
Fix IsServicePrincipal() only working for workspace admins (#732)
## Changes

The latest rendition of isServicePrincipal no longer worked for
non-admin users as it used the "principals get" API.

This new version relies on the property that service principals always
have a UUID as their userName. This was tested with the eng-jaws
principal (8b948b2e-d2b5-4b9e-8274-11b596f3b652).
2023-09-05 11:20:55 +00:00
Andrew Nester 83443bae8d
Make resource and artifact paths in bundle config relative to config folder (#708)
# Warning: breaking change

## Changes
Instead of having paths in bundle config files be relative to bundle
root even if the config file is nested, this PR makes such paths
relative to the folder where the config is located.

When bundle is initialised, these paths will be transformed to relative
paths based on bundle root. For example,
we have file structure like this
```
- mybundle
| - bundle.yml
| - subfolder
| -- resource.yml
| -- my.whl
```

Previously, we had to reference `my.whl` in resource.yml like this,
which was confusing because resource.yml is in the same subfolder
```
sync:
  include:
    - ./subfolder/*.whl
...
tasks:
  - task_key: name
    libraries:
      - whl: ./subfolder/my.whl
...
```

After the change we can reference it like this (which is in line with
the current behaviour for notebooks)

```
sync:
  include:
    - ./*.whl
...
tasks:
  - task_key: name
    libraries:
      - whl: ./my.whl
...
```

## Tests
Existing `translate_path_tests` successfully passed after refactoring.

Added a couple of uses cases for `Libraries` paths.

Added a bundle config tests with include config and sync section

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-04 09:55:01 +00:00
Lennart Kats (databricks) e22fd73b7d
Cleanup after previous PR comments (#724)
## Changes

@pietern this addresses a comment from you on a recently merged PR. It
also updates settings.json based on the settings VS Code adds as soon as
you edit a notebook.
2023-09-04 07:07:17 +00:00
Lennart Kats (databricks) 707fd6f617
Cleanup after "Add a foundation for built-in templates" (#707)
## Changes
Add some cleanup based on @pietern's comments on
https://github.com/databricks/cli/pull/685
2023-08-30 14:01:08 +00:00
Andrew Nester 12368e3382
Added transformation mutator for Python wheel task for them to work on DBR <13.1 (#635)
## Changes
***Note: this PR relies on sync.include functionality from here:
https://github.com/databricks/cli/pull/671***

Added transformation mutator for Python wheel task for them to work on
DBR <13.1

Using wheels upload to Workspace file system as cluster libraries is not
supported in DBR < 13.1

In order to make Python wheel work correctly on DBR < 13.1 we do the
following:
1. Build and upload python wheel as usual
2. Transform python wheel task into special notebook task which does the
following
a. Installs all necessary wheels with %pip magic
b. Executes defined entry point with all provided parameters
3. Upload this notebook file to workspace file system
4. Deploy transformed job task

This is also beneficial for executing on existing clusters because this
notebook always reinstall wheels so if there are any changes to the
wheel package, they are correctly picked up

## Tests
bundle.yml

```yaml
bundle:
  name: wheel-task

workspace:
  host: ****

resources:
  jobs:
    test_job:
      name: "[${bundle.environment}] My Wheel Job"
      tasks:
        - task_key: TestTask
          existing_cluster_id: "***"
          python_wheel_task:
            package_name: "my_test_code"
            entry_point: "run"
            parameters: ["first argument","first value","second argument","second value"]
          libraries:
          - whl: ./dist/*.whl
```

Output
```
andrew.nester@HFW9Y94129 wheel % databricks bundle run test_job
Run URL: ***

2023-08-03 15:58:04 "[default] My Wheel Job" TERMINATED SUCCESS 
Output:
=======
Task TestTask:
Hello from my func
Got arguments v1:
['python', 'first argument', 'first value', 'second argument', 'second value']

```
2023-08-30 12:21:39 +00:00
Lennart Kats (databricks) 861f33d376
Support cluster overrides with cluster_key and compute_key (#696)
## Changes

Support `databricks bundle deploy --compute-id my_all_purpose_cluster`
in two missing cases.
2023-08-28 07:51:35 +00:00