Commit Graph

18 Commits

Author SHA1 Message Date
shreyas-goenka 28b39cd3f7
Make bundle JSON schema modular with `$defs` (#1700)
## Changes
This PR makes sweeping changes to the way we generate and test the
bundle JSON schema. The main benefits are:

1. More modular JSON schema. Every definition in the schema now is one
level deep and points to references instead of inlining the entire
schema for a field. This unblocks PyDABs from taking a dependency on the
JSON schema.

2. Generate the JSON schema during CLI code generation. Directly stream
it instead of computing it at runtime whenever a user calls `databricks
bundle schema`. This is nice because we no longer need to embed a
partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()`
method to every struct in the Databricks Go SDK and remove the
dependency on the OpenAPI spec altogether. It'll become more important
once we decouple Go SDK structs and methods from the underlying APIs.

3. Add enum values for Go SDK fields in the JSON schema. Better
autocompletion and validation for these fields. As a follow-up, we can
add enum values for non-Go SDK enums as well (created internal ticket to
track).

4. Use "packageName.structName" as a key to read JSON schemas from the
OpenAPI spec for Go SDK structs. Before, we would use an unrolled
presentation of the JSON schema (stored in `bundle_descriptions.json`),
which was complex to parse and include in the final JSON schema output.
This also means loading values from the OpenAPI spec for `target` schema
works automatically and no longer needs custom code.
5. Support recursive types (eg: `for_each_task`). With us now using
$refs everywhere it's trivial to support.
6. Using complex variables would be invalid according to the schema
generated before this PR. Now that bug is fixed. In the future adding
more custom rules will be easier as well due to the single level nature
of the JSON schema.


Since this is a complete change of approach in how we generate the JSON
schema, there are a few (very minor) regressions worth calling out.
1. We'll lose a few custom descriptions for non Go SDK structs that were
a part of `bundle_descriptions.json`. Support for those can be added in
the future as a followup.
2. Since now the final JSON schema is a static artefact, we lose some
lead time for the signal that JSON schema integration tests are failing.
It's okay though since we have a lot of coverage via the existing unit
tests.

## Tests
Unit tests. End to end tests are being added in this PR:
https://github.com/databricks/cli/pull/1726

Previous unit tests were all deleted because they were bloated. Effort
was made to make the new unit tests provide (almost) equivalent
coverage.
2024-09-10 13:55:18 +00:00
shreyas-goenka d949f2b4f2
Fix bundle schema for variables (#1396)
## Changes
This PR fixes the variable schema to:
1. Allow non-string values in the "default" value of a variable.
2. Allow non-string overrides in a target for a variable. 

## Tests
Manually. There are no longer squiggly lines. 

Before:
<img width="329" alt="Screenshot 2024-04-24 at 3 26 43 PM"
src="https://github.com/databricks/cli/assets/88374338/43be02c2-80a4-4f80-bd79-0f3e1e93ee17">


After:
<img width="361" alt="Screenshot 2024-04-24 at 3 26 10 PM"
src="https://github.com/databricks/cli/assets/88374338/2c1fb892-a2a2-478b-8d2e-9bda6d844b54">
2024-04-25 11:23:50 +00:00
Arpit Jasapara ce8cfef19d
Add support for `anyOf` to `skip_prompt_if` (#1133)
## Changes
This PR:
Introduces `anyOf` to `skip_prompt_if`. This allows you to make OR
conditionals for skipping prompts during template initialization.

## Tests
Added unit test and confirmed existing ones still work. Also tested
manually.

---------

Co-authored-by: Shreyas Goenka <shreyas.goenka@databricks.com>
2024-01-25 10:09:42 +00:00
shreyas-goenka f2408eda62
Add support for reprompts if user input does not match template schema (#946)
## Changes
This PR adds retry logic to user input prompts, prompting users again if
the value does not match the requirements specified in the bundle
template schema.

## Tests
Manually. Here's an example UX. The first prompt expects an integer and
the second one a string made only from the letters "defg"

```
shreyas.goenka@THW32HFW6T cli % cli bundle init ~/mlops-stack

Please enter an integer [123]: abc
Validation failed: "abc" is not a integer

Please enter an integer [123]: 123

Please enter a string [dddd]: apple
Validation failed: invalid value for input_root_dir: "apple". Only characters the 'd', 'e', 'f', 'g' are allowed
```
2023-12-22 15:43:08 +00:00
shreyas-goenka bdef0f7b23
Add support for conditional prompting in bundle init (#971)
## Changes
This PR introduces the `skip_prompt_if` extension to the jsonschema
library. If the inputs provided by the user match the JSON schema then
the prompt for that property is skipped.

Right now only constant checks are supported, but if in the future more
complicated conditionals are required, this can be extended to support
`allOf`, `oneOf`, `anyOf` etc allowing template authors to specify
conditionals of arbitary complexity.

## Tests
Unit tests and manually.
2023-11-30 16:07:45 +00:00
shreyas-goenka 1f1ed6db53
Add versioning for bundle templates (#972)
## Changes
This PR adds versioning for bundle templates. Right now there's only
logic for the maximum version of templates supported. At some point in
the future if we make a breaking template change we can also include a
minimum version of template supported by the CLI.

## Tests
Unit tests.
2023-11-30 14:28:51 +00:00
shreyas-goenka 283f24179d
Remove validation for default value against pattern (#959)
## Changes
This PR removes validation for default value against the regex pattern
specified in a JSON schema at schema load time. This is required because
https://github.com/databricks/cli/pull/795 introduces parameterising the
default value as a Go text template impling that the default value now
does not necessarily have to match the pattern at schema load time.

This will also unblock:
https://github.com/databricks/mlops-stacks/pull/108

Note, this does not remove runtime validation for input parameters right
before template initialization, which happens here:
fb32e78c9b/libs/template/materialize.go (L76)

## Tests
Changes to existing test.
2023-11-07 12:35:59 +00:00
shreyas-goenka fb32e78c9b
Make to/from string methods private to the jsonschema package (#942)
## Changes
This PR makes a few methods private, exposing cleaner interfaces to get
the string representations for enums and default values of a JSON
Schema.

## Tests
Manually, template initialization for the `default-python` template
still works as expected.
2023-11-06 15:05:17 +00:00
shreyas-goenka a5815a0b47
Add welcome message to bundle templates (#907)
## Changes
Adds a welcome_message field to templates and the default python
template.

## Tests
Manually.

Here's the output logs during template init now:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init                              
Template to use [default-python]: 

Welcome to the sample Databricks Asset Bundle template! Please enter the following information to initialize your sample DAB.

Unique name for this project [my_project]: abcde
Include a stub (sample) notebook in 'abcde/src': no
Include a stub (sample) Delta Live Tables pipeline in 'abcde/src': yes
Include a stub (sample) Python package in 'abcde/src': no

 Your new project has been created in the 'abcde' directory!

Please refer to the README.md of your project for further instructions on getting started.
Or read the documentation on Databricks Asset Bundles at https://docs.databricks.com/dev-tools/bundles/index.html.
```
2023-10-25 12:27:25 +00:00
shreyas-goenka f8d7e31118
Fix pattern validation for input properties (#912)
## Changes
Fixes bug where input validation would only be done on the first input
parameter in the template schema.

## Tests
Unit test.
2023-10-24 15:56:54 +00:00
shreyas-goenka 3700785dfa
Add support for validating CLI version when loading a jsonschema object (#883)
## Changes
Updates to bundle templates can require updated versions of the CLI.
This PR extends the JSON schema representation to allow template authors
to set a min CLI version they require for their templates.

This is required to make improvements/additions to the mlops-stacks repo

## Tests
Tested using unit tests and manually. 

For manualy testing, I created a custom build of the CLI using go
releaser and then tested it against a local instance of mlops-stack
When mlops-stack schema has:
```
  "min_databricks_cli_version": "v5000.1.1",
```

output (error as expected)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Error: minimum CLI version "v5000.1.1" is greater than current CLI version "v0.207.2-dev+1b992c0". Please upgrade your current Databricks CLI
```

When the mlops-stack schema has:
```
  "min_databricks_cli_version": "v0.1.1",
```

output (validation passes)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Welcome to MLOps Stack. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stack/blob/main/README.md.

Project Name [my-mlops-project]: ^C
```
2023-10-19 14:01:48 +00:00
Lennart Kats (databricks) a2ee8bb45b
Improve the output of the `databricks bundle init` command (#795)
Improve the output of help, prompts, and so on for `databricks bundle
init` and the default template.

Among other things, this PR adds support for a new `welcome_message`
property that lets a template print a custom message on success:

```
$ databricks bundle init
Template to use [default-python]:
Unique name for this project [my_project]: lennart_project
Include a stub (sample) notebook in 'lennart_project/src': yes
Include a stub (sample) Delta Live Tables pipeline in 'lennart_project/src': yes
Include a stub (sample) Python package in 'lennart_project/src': yes

 Your new project has been created in the 'lennart_project' directory!

Please refer to the README.md of your project for further instructions on getting started.
Or read the documentation on Databricks Asset Bundles at https://docs.databricks.com/dev-tools/bundles/index.html.
```

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-10-19 07:08:36 +00:00
shreyas-goenka 757d5efe8d
Add support for regex patterns in template schema (#768)
## Changes
This PR introduces support for regex pattern validation in our custom
jsonschema validator. This allows us to fail early if a user enters an
invalid value for a field.

For example, now this is what initializing the default template looks
like with an invalid project name:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init
Template to use [default-python]: 
Unique name for this project [my_project]: (_*_)
Error: invalid value for project_name: (_*_). Must consist of letter and underscores only.
```

## Tests
New unit tests and manually.
2023-09-25 09:53:38 +00:00
shreyas-goenka 7c96270db8
Add enum support for bundle templates (#668)
## Changes
This PR includes:
1. Adding enum field to the json schema struct
2. Adding prompting logic for enum values. See demo for how it looks
3. Validation rules, validating the default value and config values when
an enum list is specified

This will now enable template authors to use enums for input parameters.

## Tests
Manually and new unit tests
2023-09-08 12:07:22 +00:00
shreyas-goenka 1a7bf4e4f1
Add schema and config validation to jsonschema package (#740)
## Changes

At a high level this PR adds new schema validation and moves
functionality that should be present in the jsonschema package, but
resides in the template package today, to the jsonschema package. This
includes for example schema validation, schema instance validation, to /
from string conversion methods etc.

The list below outlines all the pieces that have been moved over, and
the new validation bits added.

This PR:
1. Adds casting default value of schema properties to integers to the
jsonschema.Load method.
2. Adds validation for default value types for schema properties,
checking they are consistant with the type defined.
3. Introduces the LoadInstance and ValidateInstance methods to the json
schema package. These methods can be used to read and validate JSON
documents against the schema.
4. Replaces validation done for template inputs to use the newly defined
JSON schema validation functions.
5. Moves to/from string and isInteger utility methods to the json schema
package.

## Tests
Existing and new unit tests.
2023-09-07 14:36:06 +00:00
shreyas-goenka bbbeabf98c
Add support for ordering of input prompts (#662)
## Changes

JSON schema properties are a map and thus unordered.

This PR introduces a JSON schema extension field called `order` to allow
template authors to define the order in which template variables should
be resolved/prompted.

## Tests

Unit tests.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-05 11:08:25 +00:00
shreyas-goenka 878bb6deae
Return better error messages for invalid JSON schema types in templates (#661)
## Changes
Adds a function to validate json schema types added by the author. The
default json unmarshaller does not validate that the parsed type matches
the enum defined in `jsonschema.Type`

Includes some other improvements to provide better error messages.

This PR was prompted by usability difficulties reported by @mingyu89
during mlops stack migration.

## Tests
Unit tests
2023-08-15 14:28:04 +00:00
shreyas-goenka 5df8935de4
Add JSON schema validation for input template parameters (#598)
## Changes

This PR:
1. Adds code for reading template configs and validating them against a
JSON schema.
2. Moves the json schema struct in `bundle/schema` to a separate library
package. This struct is now reused for validating template configs.

## Tests
Unit tests
2023-08-01 14:09:27 +00:00