Commit Graph

10 Commits

Author SHA1 Message Date
Pieter Noordhuis 4fea0219fd
Use `fs.FS` interface to read template (#1910)
## Changes

While working on the v2 of #1744, I found that:
* Template initialization first copies built-in templates to a temporary
directory before initializing them
* Reading a template's contents goes through a `filer.Filer` but is
hardcoded to a local one

This change updates the interface for reading templates to be `fs.FS`.
This is compatible with the `embed.FS` type for the built-in templates,
so they no longer have to be copied to a temporary directory before
being used.

The alternative is to use a `filer.Filer` throughout, but this would
have required even more plumbing, and we don't need to _read_ templates,
including notebooks, from the workspace filesystem (yet?).

As part of making `template.Materialize` take an `fs.FS` argument, the
logic to match a given argument to a particular built-in template in the
`init` command has moved to sit next to its implementation.

## Tests

Existing tests pass.
2024-11-20 09:28:35 +00:00
shreyas-goenka 28b39cd3f7
Make bundle JSON schema modular with `$defs` (#1700)
## Changes
This PR makes sweeping changes to the way we generate and test the
bundle JSON schema. The main benefits are:

1. More modular JSON schema. Every definition in the schema now is one
level deep and points to references instead of inlining the entire
schema for a field. This unblocks PyDABs from taking a dependency on the
JSON schema.

2. Generate the JSON schema during CLI code generation. Directly stream
it instead of computing it at runtime whenever a user calls `databricks
bundle schema`. This is nice because we no longer need to embed a
partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()`
method to every struct in the Databricks Go SDK and remove the
dependency on the OpenAPI spec altogether. It'll become more important
once we decouple Go SDK structs and methods from the underlying APIs.

3. Add enum values for Go SDK fields in the JSON schema. Better
autocompletion and validation for these fields. As a follow-up, we can
add enum values for non-Go SDK enums as well (created internal ticket to
track).

4. Use "packageName.structName" as a key to read JSON schemas from the
OpenAPI spec for Go SDK structs. Before, we would use an unrolled
presentation of the JSON schema (stored in `bundle_descriptions.json`),
which was complex to parse and include in the final JSON schema output.
This also means loading values from the OpenAPI spec for `target` schema
works automatically and no longer needs custom code.
5. Support recursive types (eg: `for_each_task`). With us now using
$refs everywhere it's trivial to support.
6. Using complex variables would be invalid according to the schema
generated before this PR. Now that bug is fixed. In the future adding
more custom rules will be easier as well due to the single level nature
of the JSON schema.


Since this is a complete change of approach in how we generate the JSON
schema, there are a few (very minor) regressions worth calling out.
1. We'll lose a few custom descriptions for non Go SDK structs that were
a part of `bundle_descriptions.json`. Support for those can be added in
the future as a followup.
2. Since now the final JSON schema is a static artefact, we lose some
lead time for the signal that JSON schema integration tests are failing.
It's okay though since we have a lot of coverage via the existing unit
tests.

## Tests
Unit tests. End to end tests are being added in this PR:
https://github.com/databricks/cli/pull/1726

Previous unit tests were all deleted because they were bloated. Effort
was made to make the new unit tests provide (almost) equivalent
coverage.
2024-09-10 13:55:18 +00:00
shreyas-goenka d949f2b4f2
Fix bundle schema for variables (#1396)
## Changes
This PR fixes the variable schema to:
1. Allow non-string values in the "default" value of a variable.
2. Allow non-string overrides in a target for a variable. 

## Tests
Manually. There are no longer squiggly lines. 

Before:
<img width="329" alt="Screenshot 2024-04-24 at 3 26 43 PM"
src="https://github.com/databricks/cli/assets/88374338/43be02c2-80a4-4f80-bd79-0f3e1e93ee17">


After:
<img width="361" alt="Screenshot 2024-04-24 at 3 26 10 PM"
src="https://github.com/databricks/cli/assets/88374338/2c1fb892-a2a2-478b-8d2e-9bda6d844b54">
2024-04-25 11:23:50 +00:00
shreyas-goenka bdef0f7b23
Add support for conditional prompting in bundle init (#971)
## Changes
This PR introduces the `skip_prompt_if` extension to the jsonschema
library. If the inputs provided by the user match the JSON schema then
the prompt for that property is skipped.

Right now only constant checks are supported, but if in the future more
complicated conditionals are required, this can be extended to support
`allOf`, `oneOf`, `anyOf` etc allowing template authors to specify
conditionals of arbitary complexity.

## Tests
Unit tests and manually.
2023-11-30 16:07:45 +00:00
shreyas-goenka 283f24179d
Remove validation for default value against pattern (#959)
## Changes
This PR removes validation for default value against the regex pattern
specified in a JSON schema at schema load time. This is required because
https://github.com/databricks/cli/pull/795 introduces parameterising the
default value as a Go text template impling that the default value now
does not necessarily have to match the pattern at schema load time.

This will also unblock:
https://github.com/databricks/mlops-stacks/pull/108

Note, this does not remove runtime validation for input parameters right
before template initialization, which happens here:
fb32e78c9b/libs/template/materialize.go (L76)

## Tests
Changes to existing test.
2023-11-07 12:35:59 +00:00
shreyas-goenka 3700785dfa
Add support for validating CLI version when loading a jsonschema object (#883)
## Changes
Updates to bundle templates can require updated versions of the CLI.
This PR extends the JSON schema representation to allow template authors
to set a min CLI version they require for their templates.

This is required to make improvements/additions to the mlops-stacks repo

## Tests
Tested using unit tests and manually. 

For manualy testing, I created a custom build of the CLI using go
releaser and then tested it against a local instance of mlops-stack
When mlops-stack schema has:
```
  "min_databricks_cli_version": "v5000.1.1",
```

output (error as expected)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Error: minimum CLI version "v5000.1.1" is greater than current CLI version "v0.207.2-dev+1b992c0". Please upgrade your current Databricks CLI
```

When the mlops-stack schema has:
```
  "min_databricks_cli_version": "v0.1.1",
```

output (validation passes)
```
shreyas.goenka@THW32HFW6T bricks % ./dist/cli_darwin_arm64/databricks bundle init  ~/mlops-stack
Welcome to MLOps Stack. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stack/blob/main/README.md.

Project Name [my-mlops-project]: ^C
```
2023-10-19 14:01:48 +00:00
shreyas-goenka 757d5efe8d
Add support for regex patterns in template schema (#768)
## Changes
This PR introduces support for regex pattern validation in our custom
jsonschema validator. This allows us to fail early if a user enters an
invalid value for a field.

For example, now this is what initializing the default template looks
like with an invalid project name:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init
Template to use [default-python]: 
Unique name for this project [my_project]: (_*_)
Error: invalid value for project_name: (_*_). Must consist of letter and underscores only.
```

## Tests
New unit tests and manually.
2023-09-25 09:53:38 +00:00
shreyas-goenka 7c96270db8
Add enum support for bundle templates (#668)
## Changes
This PR includes:
1. Adding enum field to the json schema struct
2. Adding prompting logic for enum values. See demo for how it looks
3. Validation rules, validating the default value and config values when
an enum list is specified

This will now enable template authors to use enums for input parameters.

## Tests
Manually and new unit tests
2023-09-08 12:07:22 +00:00
shreyas-goenka 1a7bf4e4f1
Add schema and config validation to jsonschema package (#740)
## Changes

At a high level this PR adds new schema validation and moves
functionality that should be present in the jsonschema package, but
resides in the template package today, to the jsonschema package. This
includes for example schema validation, schema instance validation, to /
from string conversion methods etc.

The list below outlines all the pieces that have been moved over, and
the new validation bits added.

This PR:
1. Adds casting default value of schema properties to integers to the
jsonschema.Load method.
2. Adds validation for default value types for schema properties,
checking they are consistant with the type defined.
3. Introduces the LoadInstance and ValidateInstance methods to the json
schema package. These methods can be used to read and validate JSON
documents against the schema.
4. Replaces validation done for template inputs to use the newly defined
JSON schema validation functions.
5. Moves to/from string and isInteger utility methods to the json schema
package.

## Tests
Existing and new unit tests.
2023-09-07 14:36:06 +00:00
shreyas-goenka 878bb6deae
Return better error messages for invalid JSON schema types in templates (#661)
## Changes
Adds a function to validate json schema types added by the author. The
default json unmarshaller does not validate that the parsed type matches
the enum defined in `jsonschema.Type`

Includes some other improvements to provide better error messages.

This PR was prompted by usability difficulties reported by @mingyu89
during mlops stack migration.

## Tests
Unit tests
2023-08-15 14:28:04 +00:00