README.md.tmpl

# {{.project_name}}

The '{{.project_name}}' project was generated using the dbt template for
Databricks Asset Bundles. It follows the standard dbt project structure
and adds a `resources` directory that defines Databricks resources, such as jobs
that run dbt models.

* Learn more about dbt and its standard project structure here: https://docs.getdbt.com/docs/build/projects.
* Learn more about Databricks Asset Bundles here: https://docs.databricks.com/en/dev-tools/bundles/index.html

The remainder of this file includes instructions for local development (using dbt)
and deployment to production (using Databricks Asset Bundles).
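
For orientation, a job resource under `resources/` that runs dbt models might look
roughly like the sketch below. This is an abbreviated illustration, not the generated
file: names are placeholders and compute settings are omitted.

```
# Abbreviated sketch of a job resource that runs dbt commands.
# Names are placeholders; the generated resources/<project>.job.yml differs
# and also configures compute, scheduling, and environment details.
resources:
  jobs:
    my_project_dbt_job:
      name: my_project_dbt_job
      tasks:
        - task_key: dbt
          dbt_task:
            project_directory: ../
            commands:
              - "dbt deps"
              - "dbt seed"
              - "dbt run"
```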

## Development setup

1. Install the Databricks CLI from https://docs.databricks.com/dev-tools/cli/databricks-cli.html

2. Authenticate to your Databricks workspace, if you have not done so already:
    ```
    $ databricks configure
    ```

3. Install dbt

   To install dbt, you need a recent version of Python. For the instructions below,
   we assume `python3` refers to the Python version you want to use. On some systems,
   you may need to refer to a different Python version, e.g. `python` or `/usr/bin/python`.

   Run these instructions from the `{{.project_name}}` directory. We recommend making
   use of a Python virtual environment and installing dbt as follows:

   ```
   $ python3 -m venv .venv
   $ . .venv/bin/activate
   $ pip install -r requirements-dev.txt
   ```

4. Initialize your dbt profile

   Use `dbt init` to initialize your profile.

   ```
   $ dbt init
   ```

   Note that dbt authentication uses personal access tokens by default
   (see https://docs.databricks.com/dev-tools/auth/pat.html).
   You can use OAuth as an alternative, but this currently requires manual configuration.
   See https://github.com/databricks/dbt-databricks/blob/main/docs/oauth.md
   for general instructions, or https://community.databricks.com/t5/technical-blog/using-dbt-core-with-oauth-on-azure-databricks/ba-p/46605
   for advice on setting up OAuth for Azure Databricks.

   To set up additional profiles, such as a 'prod' profile,
   see https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles.
   A sketch of a typical profile entry is shown right after this list.

5. Activate the virtual environment so the `dbt` command is available from the terminal

   ```
   $ . .venv/bin/activate
   ```
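
For reference, a profile entry for the `dbt-databricks` adapter typically has roughly
the following shape. This is a sketch only; every value is a placeholder, and
`dbt init` generates the real entry for you:

```
# Sketch of a profiles.yml entry for the dbt-databricks adapter.
# Every value below is a placeholder; `dbt init` fills in the real ones.
my_project:
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: main
      schema: my_dev_schema
      host: my-workspace.cloud.databricks.com
      http_path: /sql/1.0/warehouses/0123456789abcdef
      token: dapiXXXXXXXXXXXXXXXX   # or use OAuth as described above
      threads: 4
```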

## Local development with dbt

Use `dbt` to [run this project locally using a SQL warehouse](https://docs.databricks.com/partners/prep/dbt.html):

```
$ dbt seed
$ dbt run
```

(Did you get an error that the dbt command could not be found? You may need
to repeat the last step of the development setup above to re-activate
your Python virtual environment!)


To evaluate just a single model, such as one defined in a file called `orders.sql`, use:

```
$ dbt run --select orders
```

Use `dbt test` to run the tests defined in YAML files such as `models/schema.yml`
and any SQL tests from `tests/`:

```
$ dbt test
```
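
As an illustration, such YAML-defined tests are declared next to the model they
describe. A minimal sketch (model and column names are placeholders) could look like this:

```
# Minimal sketch of models/schema.yml with generic dbt tests.
# Model and column names are placeholders for your own models.
version: 2

models:
  - name: orders
    description: "One record per order."
    columns:
      - name: order_id
        tests:
          - not_null
          - unique
```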

## Production setup

Your production dbt profiles are defined in dbt_profiles/profiles.yml.
These profiles define the default catalog, schema, and any other
target-specific settings. Read more about dbt profiles on Databricks at
https://docs.databricks.com/en/workflows/jobs/how-to/use-dbt-in-workflows.html#advanced-run-dbt-with-a-custom-profile.

The target workspaces for dev and prod are defined in `databricks.yml`.
You can deploy manually based on these configurations (see below),
or use CI/CD to automate deployment. See
https://docs.databricks.com/dev-tools/bundles/ci-cd.html for documentation
on CI/CD setup.
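
For orientation, the targets section of `databricks.yml` generally follows this shape.
This is a sketch with placeholder workspace URLs, not the exact generated file:

```
# Sketch of the targets section in databricks.yml.
# Workspace URLs are placeholders; the generated file differs in detail.
targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://my-dev-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://my-prod-workspace.cloud.databricks.com
```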

## Manually deploying to Databricks with Databricks Asset Bundles

Databricks Asset Bundles can be used to deploy to Databricks and to execute
dbt commands as a job using Databricks Workflows. See
https://docs.databricks.com/dev-tools/bundles/index.html to learn more.

Use the Databricks CLI to deploy a development copy of this project to a workspace:

```
$ databricks bundle deploy --target dev
```

(Note that "dev" is the default target, so the `--target` parameter
is optional here.)

This deploys everything that's defined for this project.
For example, the default template would deploy a job called
`[dev yourname] {{.project_name}}_job` to your workspace.
You can find that job by opening your workspace and clicking on **Workflows**.

You can also deploy to your production target directly from the command line.
The warehouse, catalog, and schema for that target are configured in `databricks.yml`.
When deploying to this target, note that the default job at `resources/{{.project_name}}.job.yml`
has a schedule that runs every day. The schedule is paused when deploying in development mode
(see https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).

To deploy a production copy, type:

```
$ databricks bundle deploy --target prod
```

## IDE support

Optionally, install developer tools such as the Databricks extension for Visual Studio Code from
https://docs.databricks.com/dev-tools/vscode-ext.html. Third-party extensions
related to dbt may further enhance your dbt development experience!