Commit Graph

30 Commits

Author SHA1 Message Date
Lennart Kats 480d308a80
Merge remote-tracking branch 'databricks/main' into presets-catalog-schema 2024-11-02 09:25:15 +01:00
Lennart Kats (databricks) 60c153c0e7
Fix pipeline in default-python template not working for certain workspaces (#1854)
Change the default-python template to not set the `catalog` field for
the pipeline for workspaces that set `hive_metastore` as the default
catalog. The Pipelines service currently returns an error when that
value is used for the `catalog` field.

This is the most simple fix for this issue, which was reported by a
customer. As a followup, we should look at whether we want to prompt for
a catalog instead, possibly just for this specific scenario.
2024-10-22 15:52:46 +00:00
Lennart Kats 6a7d31fb27
Add proof of concept for catalog/schema presets 2024-10-20 08:37:23 +02:00
Andrew Nester a8cff48c0b
Always prepend bundle remote paths with /Workspace (#1724)
## Changes
Due to platform changes, all libraries, notebooks and etc. paths used in
Databricks must be started with either /Workspace or /Volumes prefix.

This PR makes sure that all bundle paths are correctly prefixed.

Note: this change is a breaking change if user previously configured and
used `/Workspace/Workspace` folder in their workspace file system or
having `/Workspace/${workspace.root_path}...` pattern configured
anywhere in their bundle config

Fixes: #1751

AI:
- [x] Scan DABs config and error out on
`/Workspace/${workspace.root_path}...` pattern usage

## Tests
Added unit tests

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-10-02 15:34:00 +00:00
shreyas-goenka a4ba0bbe9f
Add sub-extension to resource files in built-in templates (#1777)
## Changes
We want to encourage a pattern of only specifying a single resource in a
YAML file when an `.<resource-type>.yml` (like `.job.yml`) is used. This
convention could allow us to bijectively map a resource YAML file to
it's corresponding resource in the Databricks workspace.

This PR simply makes the built-in templates compliant to this format.

## Tests
Existing tests.
2024-09-25 12:58:14 +00:00
Lennart Kats (databricks) 7665c639bd
Use Unity Catalog for pipelines in the default-python template (#1766)
## Summary

Enables Unity Catalog for pipelines in the default template. Pipelines
will default to non-Unity Catalog pipelines if a catalog is not
specified.

*Small caveat*: there are cases where admins lock down the default
catalog of a workspace and don't allow the creation of a new schema
there. If that happens, the pipeline would fail at runtime with a clear
error indicating what happened. ("PERMISSION_DENIED: User does not have
CREATE SCHEMA on Catalog 'main'."). I've seen this with an internal
Databricks workspace, where creating new non-UC schemas wasn't locked
down, but creation in the `main` was.

## Testing

- Validated on a non-UC + UC workspace. The catalog selection logic here
is the same as applied for the SQL templates.
2024-09-23 09:52:04 +00:00
Lennart Kats (databricks) f2dee890b8
Use periodic triggers in all templates (#1739)
## Summary

Simplifies template by using the periodic trigger syntax instead of the
cron schedule syntax. Periodic triggers are simpler to configure,
simpler to read, and make sure that workloads are spread out through the
day. We only recommend cron syntax for advanced cases or when more
control is needed.

## Testing

* Templates validation via unit tests
* Manual validation that the new triggers work as expected in dev/prod
2024-09-12 08:33:00 +00:00
Lennart Kats (databricks) 072fa812e2
Include a permissions section in all templates (#1713)
## Changes

This updates the templates to include a `permissions` section. Having a
permissions section is a best practice, is helpful to understand the
notion of permissions, and helps diagnose permission errors
(https://github.com/databricks/cli/pull/1386).

This is a cherry-pick from https://github.com/databricks/cli/pull/1387.

This change was verified to work both in dev and prod. Existing unit
tests validate the validity of the templates in these modes.
2024-09-03 07:51:54 +00:00
Lennart Kats (databricks) ed4a4585c0
Update templates to latest LTS DBR (#1715)
## Changes

This updates the templates to use the latest DBR 15 LTS version.

- [x] DB Connect 15.4 must be released + validated before this can go
out
2024-08-30 07:32:10 +00:00
Fabian Jakobs e61f0e1eb9
Fix DBConnect support in VS Code (#1253)
## Changes

With the current template, we can't execute the Python file and the jobs
notebook using DBConnect from VSCode because we import `from pyspark.sql
import SparkSession`, which doesn't support Databricks unified auth.
This PR fixes this by passing spark into the library code and by
explicitly instantiating a spark session where the spark global is not
available.

Other changes:

* add auto-reload to notebooks
* add DLT typings for code completion
2024-03-05 14:31:27 +00:00
Lennart Kats (databricks) 162b115e19
Add an experimental default-sql template (#1051)
## Changes

This adds a `default-sql` template! 

In this latest revision, I've hidden the new template from the list so
we can merge it, iterate over it, and properly release the template at
the right time.

- [x] WorkspaceFS support for .sql files is in prod
- [x] SQL extension is preconfigured based on extension settings (if
possible)
- [ ] Streaming tables support is either ungated or the template
provides instructions about signup
- _Mitigation for now: this template is hidden from the list of
templates._
- [x] Support non-UC workspaces

## Tests
- [x] Unit tests
- [x] Manual testing
- [x] More manual testing
- [x] Reviewer testing

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
2024-02-19 12:01:11 +00:00
Lennart Kats (databricks) 1c680121c8
Add an experimental dbt-sql template (#1059)
## Changes

This adds a new dbt-sql template. This work requires the new WorkspaceFS
support for dbt tasks.

In this latest revision, I've hidden the new template from the list so
we can merge it, iterate over it, and propertly release the template at
the right time.

Blockers:
- [x] WorkspaceFS support for dbt projects is in prod
- [x] Move dbt files into a subdirectory
- [ ] Wait until the next (>1.7.4) release of the dbt plugin which will
have major improvements!
- _Rather than wait, this template is hidden from the list of
templates._
- [x] SQL extension is preconfigured based on extension settings (if
possible)
- MV / streaming tables:
  - [x] Add to template
- [x] Fix https://github.com/databricks/dbt-databricks/issues/535 (to be
released with in 1.7.4)
- [x] Merge https://github.com/databricks/dbt-databricks/pull/338 (to be
released with in 1.7.4)
- [ ] Fix "too many 503 errors" issue
(https://github.com/databricks/dbt-databricks/issues/570, internal
tracker: ES-1009215, ES-1014138)
  - [x] Support ANSI mode in the template
- [ ] Streaming tables support is either ungated or the template
provides instructions about signup
- _Mitigation for now: this template is hidden from the list of
templates._
- [x] Support non-workspace-admin deployment
- [x] Make sure `data_security_mode: SINGLE_USER` works on non-UC
workspaces (it's required to be explicitly specified on UC workspaces
with single-node clusters)
- [x] Support non-UC workspaces

## Tests

- [x] Unit tests
- [x] Manual testing
- [x] More manual testing
- [ ] Reviewer manual testing
  - _I'd like to do a small bug bash post-merging._
- [x] Unit tests
2024-02-19 09:15:17 +00:00
Lennart Kats (databricks) 167deec8c3
Change recommended production deployment path from /Shared to /Users (#1091)
## Changes

This PR changes the default and `mode: production` recommendation to
target `/Users` for deployment. Previously, we used `/Shared`, but
because of a lack of POSIX-like permissions in WorkspaceFS this meant
that files inside would be readable and writable by other users in the
workspace.

Detailed change:
* `default-python` no longer uses a path that starts with `/Shared`
* `mode: production` no longer requires a path that starts with
`/Shared`
 
## Related PRs

Docs: https://github.com/databricks/docs/pull/14585
Examples: https://github.com/databricks/bundle-examples/pull/17

## Tests

* Manual tests
* Template unit tests (with an extra check to avoid /Shared)
2024-01-02 19:58:24 +00:00
Lennart Kats (databricks) 8b9930a49a
Improve default template (#1046)
## Changes
- Tweak strings, documentation in template
- Extend requirements-dev.txt with setuptools/wheel for building whl
files
- Clarify what the "_job.yml" file is for for users who are only
interested in DLT pipelines (answering a question that came up recently)

## Tests
Existing tests exercise this template
2023-12-11 19:13:14 +00:00
Andrew Nester cdf29da27b
Change default_python template to auto-update version on each wheel build (#1034)
## Changes
Change default_python template to auto-update version on each wheel
build
2023-12-01 13:24:55 +00:00
Lennart Kats (databricks) 92539d4b9b
Work around DLT issue with `$PYTHONPATH` not being set correctly (#999)
## Changes

DLT currently doesn't always set `$PYTHONPATH` correctly (ES-947370).
This restores the original workaround to make new pipelines work while
that issue is being addressed. The workaround was removed in #832.

Manually tested.
2023-11-20 19:25:43 +00:00
shreyas-goenka a5815a0b47
Add welcome message to bundle templates (#907)
## Changes
Adds a welcome_message field to templates and the default python
template.

## Tests
Manually.

Here's the output logs during template init now:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init                              
Template to use [default-python]: 

Welcome to the sample Databricks Asset Bundle template! Please enter the following information to initialize your sample DAB.

Unique name for this project [my_project]: abcde
Include a stub (sample) notebook in 'abcde/src': no
Include a stub (sample) Delta Live Tables pipeline in 'abcde/src': yes
Include a stub (sample) Python package in 'abcde/src': no

 Your new project has been created in the 'abcde' directory!

Please refer to the README.md of your project for further instructions on getting started.
Or read the documentation on Databricks Asset Bundles at https://docs.databricks.com/dev-tools/bundles/index.html.
```
2023-10-25 12:27:25 +00:00
Lennart Kats (databricks) a2ee8bb45b
Improve the output of the `databricks bundle init` command (#795)
Improve the output of help, prompts, and so on for `databricks bundle
init` and the default template.

Among other things, this PR adds support for a new `welcome_message`
property that lets a template print a custom message on success:

```
$ databricks bundle init
Template to use [default-python]:
Unique name for this project [my_project]: lennart_project
Include a stub (sample) notebook in 'lennart_project/src': yes
Include a stub (sample) Delta Live Tables pipeline in 'lennart_project/src': yes
Include a stub (sample) Python package in 'lennart_project/src': yes

 Your new project has been created in the 'lennart_project' directory!

Please refer to the README.md of your project for further instructions on getting started.
Or read the documentation on Databricks Asset Bundles at https://docs.databricks.com/dev-tools/bundles/index.html.
```

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2023-10-19 07:08:36 +00:00
Lennart Kats (databricks) 0894584132
Minor template tweaks (#832)
## Changes

Minor tweaks to the template.
2023-10-04 15:27:09 +00:00
Lennart Kats (databricks) 0c1516c4ba
Make the default `databricks bundle init` template more self-explanatory (#796)
This makes the default-python template more self-explanatory and adds a
few other tweaks for a better out-of-the-box experience.
2023-09-26 09:12:34 +00:00
shreyas-goenka 757d5efe8d
Add support for regex patterns in template schema (#768)
## Changes
This PR introduces support for regex pattern validation in our custom
jsonschema validator. This allows us to fail early if a user enters an
invalid value for a field.

For example, now this is what initializing the default template looks
like with an invalid project name:
```
shreyas.goenka@THW32HFW6T bricks % cli bundle init
Template to use [default-python]: 
Unique name for this project [my_project]: (_*_)
Error: invalid value for project_name: (_*_). Must consist of letter and underscores only.
```

## Tests
New unit tests and manually.
2023-09-25 09:53:38 +00:00
shreyas-goenka be55310cc9
Use enums for default python template (#765)
## Changes
This PR changes schema to use the enum type for the default template
yes/no questions.

## Tests
Manually
2023-09-13 17:57:31 +00:00
Lennart Kats (databricks) a4e94e1b36
Fix author in setup.py (#761)
Fix author in setup.py showing <no value>
2023-09-11 08:59:48 +00:00
Lennart Kats (databricks) 9e56bed593
Minor default template tweaks (#758)
Minor template tweaks, mostly making the imports section for DLT
notebooks a bit more elegant.

Tested with DAB deployment + in-workspace UI.
2023-09-11 07:36:44 +00:00
shreyas-goenka d9a276b17d
Fix minor typos in default-python template (#754)
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-09 21:55:43 +00:00
Lennart Kats (databricks) 50b2c0b83b
Fix notebook showing up in template when not selected (#743)
## Changes
This fixes a typo that caused the notebook.ipynb file to show up even if
the user answered "no" to the question about including a notebook.

## Tests
We have matrix validation tests for all the yes/no combinations and
whether the build + validate. There is no current test for the absence
of files.
2023-09-07 08:26:43 +00:00
Lennart Kats (databricks) 3c79181148
Remove unused file (#742)
defaults.json was originally used in tests. It's no longer used and
should be removed.
2023-09-06 18:18:15 +00:00
Lennart Kats (databricks) f9e521b43e
databricks bundle init template v2: optional stubs, DLT support (#700)
## Changes

This follows up on https://github.com/databricks/cli/pull/686. This PR
makes our stubs optional + it adds DLT stubs:

```
$ databricks bundle init
Template to use [default-python]: default-python
Unique name for this project [my_project]: my_project
Include a stub (sample) notebook in 'my_project/src' [yes]: yes
Include a stub (sample) DLT pipeline in 'my_project/src' [yes]: yes
Include a stub (sample) Python package 'my_project/src' [yes]: yes
 Successfully initialized template
```

## Tests
Manual testing, matrix tests.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-06 09:52:31 +00:00
Lennart Kats (databricks) 8c2cc07f7b
databricks bundle init template v1 (#686)
## Changes

This adds a built-in "default-python" template to the CLI. This is based
on the new default-template support of
https://github.com/databricks/cli/pull/685.

The goal here is to offer an experience where customers can simply type
`databricks bundle init` to get a default template:

```
$ databricks bundle init
Template to use [default-python]: default-python
Unique name for this project [my_project]: my_project
 Successfully initialized template
```

The present template:
- [x] Works well with VS Code
- [x] Works well with the workspace
- [x] Works well with DB Connect
- [x] Uses minimal stubs rather than boiler-plate-heavy examples

I'll have a followup with tests + DLT support.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
Co-authored-by: PaulCornellDB <paul.cornell@databricks.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2023-09-05 11:58:34 +00:00
Lennart Kats (databricks) a5b86093ec
Add a foundation for built-in templates (#685)
## Changes

This pull request extends the templating support in preparation of a
new, default template (WIP, https://github.com/databricks/cli/pull/686):
* builtin templates that can be initialized using e.g. `databricks
bundle init default-python`
* builtin templates are embedded into the executable using go's `embed`
functionality, making sure they're co-versioned with the CLI
* new helpers to get the workspace name, current user name, etc. help
craft a complete template
* (not enabled yet) when the user types `databricks bundle init` they
can interactively select the `default-python` template

And makes two tangentially related changes:
* IsServicePrincipal now uses the "users" API rather than the
"principals" API, since the latter is too slow for our purposes.
* mode: prod no longer requires the 'target.prod.git' setting. It's hard
to set that from a template. (Pieter is planning an overhaul of warnings
support; this would be one of the first warnings we show.)

The actual `default-python` template is maintained in a separate PR:
https://github.com/databricks/cli/pull/686

## Tests
Unit tests, manual testing
2023-08-25 09:03:42 +00:00