databricks-cli/integration
shreyas-goenka 41a21af556
Refactor `bundle init` (#2074)
## Summary of changes
This PR introduces three new abstractions (sketched below):
1. `Resolver`: Resolves which reader and writer to use for a template.
2. `Writer`: Writes a template project to disk. Prompts the user if
necessary.
3. `Reader`: Reads a template specification from disk, built into the
CLI or from GitHub.
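
A minimal Go sketch of how these pieces could fit together; the names and signatures here are illustrative, not the CLI's actual API:

```go
// A sketch of the three abstractions; actual names and signatures
// in the CLI may differ.
package template

import (
	"context"
	"io/fs"
)

// Reader reads a template specification: built into the CLI binary,
// read from the local filesystem, or cloned from GitHub.
type Reader interface {
	// FS returns the template's file tree.
	FS(ctx context.Context) (fs.FS, error)
	// Cleanup releases temporary state, e.g. a cloned repository.
	Cleanup(ctx context.Context)
}

// Writer materializes a template project on disk, prompting the user
// for input values where necessary.
type Writer interface {
	Materialize(ctx context.Context, r Reader) error
}

// Resolve picks the Reader and Writer to use for the requested
// template, dispatching on built-in name, local path, or Git URL.
func Resolve(ctx context.Context, templatePathOrURL string) (Reader, Writer, error) {
	// ... resolution logic elided ...
	return nil, nil, nil
}
```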

Introducing these abstractions decouples reading a template from
writing it. When I tried adding telemetry for the `bundle init`
command, I noticed that the code in `cmd/init.go` had become convoluted
and hard to test. A future change could have accidentally logged PII
when a user initialized a custom template.

Hedging against that risk matters here because the backend logs
telemetry for `databricks bundle init` as a generic, untyped
`map<string, string>`. Otherwise, we risk accidentally breaking
compliance with our centralization requirements.
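
To make the risk concrete, the payload is roughly a flat string map; a hypothetical sketch (the field names are invented for illustration):

```go
// initEvent builds a hypothetical telemetry payload for `bundle init`.
// Every value is a plain string, so nothing in the type system stops a
// sensitive value (e.g. a custom template URL) from being logged by
// mistake; the guard has to live in the code that fills the map.
func initEvent(templateName, templateArgs string) map[string]string {
	return map[string]string{
		"template_name": templateName,
		"template_args": templateArgs,
	}
}
```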

### Details

After this PR there are two classes of templates that can be
initialized:
1. A `databricks` template: either a built-in template or a template
hosted outside the CLI (like mlops-stacks) that is still owned and
managed by Databricks. For these templates we log the template name and
arguments.
2. A `custom` template: a template created and managed by the end user.
For these templates we do not log the template name or arguments;
instead, the generic placeholder string "custom" is logged in our
telemetry system, as sketched below.
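
A hedged sketch of how that distinction could work; the helper and the exact template set are hypothetical:

```go
// telemetryTemplateName returns the value logged for a template.
// Databricks-owned templates (built-in, or external ones like
// mlops-stacks) are logged by name; everything else collapses to the
// placeholder "custom" so no user-provided data is recorded.
func telemetryTemplateName(name string) string {
	databricksOwned := map[string]bool{
		"default-python": true,
		"default-sql":    true,
		"dbt-sql":        true,
		"mlops-stacks":   true,
	}
	if databricksOwned[name] {
		return name
	}
	return "custom"
}
```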

NOTE: The functionality of the `databricks bundle init` command remains
the same after this PR; only the internal abstractions change.

## Tests
New unit tests, plus the existing golden and unit tests. Also a fair
bit of manual testing.
2025-01-20 12:09:28 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| assumptions | Clean up TestMain from integration tests to fix caching (#2090) | 2025-01-08 11:59:22 +00:00 |
| bundle | Refactor `bundle init` (#2074) | 2025-01-20 12:09:28 +00:00 |
| cmd | Enable linter 'copyloopvar' and fix the issues (#2160) | 2025-01-16 11:20:50 +00:00 |
| internal/acc | Clean up TestMain from integration tests to fix caching (#2090) | 2025-01-08 11:59:22 +00:00 |
| libs | Enable linter 'copyloopvar' and fix the issues (#2160) | 2025-01-16 11:20:50 +00:00 |
| python | Clean up TestMain from integration tests to fix caching (#2090) | 2025-01-08 11:59:22 +00:00 |
| README.md | Move integration tests to `integration` package (#2009) | 2024-12-13 15:38:58 +01:00 |

README.md

Integration tests

This directory contains integration tests for the project.

The tree structure generally mirrors the source code tree structure.

Requirements for new files in this directory:

  • Every package must be named after its directory with _test appended
    • Requiring a different package name for integration tests avoids aliasing with the main package.
  • Every integration test package must include a main_test.go file.

These requirements are enforced by a unit test in this directory.
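
For illustration, a conforming package in a hypothetical integration/foo directory would contain:

// integration/foo/main_test.go
package foo_test // directory name with _test appended

import (
	"os"
	"testing"
)

// Every integration test package needs a main_test.go with a TestMain
// entrypoint; this minimal version just runs the tests.
func TestMain(m *testing.M) {
	os.Exit(m.Run())
}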

Running integration tests

Integration tests require the following environment variables:

  • CLOUD_ENV - set to the cloud environment to use (e.g. aws, azure, gcp)
  • DATABRICKS_HOST - set to the Databricks workspace to use
  • DATABRICKS_TOKEN - set to the Databricks token to use
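
For example, a full run might look like this (the host and token values are placeholders):

CLOUD_ENV=aws DATABRICKS_HOST=https://<workspace-url> DATABRICKS_TOKEN=<token> go test ./integration/...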

Optional environment variables:

  • TEST_DEFAULT_WAREHOUSE_ID - set to the default warehouse ID to use
  • TEST_METASTORE_ID - set to the metastore ID to use
  • TEST_INSTANCE_POOL_ID - set to the instance pool ID to use
  • TEST_BRICKS_CLUSTER_ID - set to the cluster ID to use

To run all integration tests, use the following command:

go test ./integration/...

Alternatively:

make integration
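
To run a single package or test, the standard go test flags apply (the test name below is a placeholder):

go test ./integration/bundle/... -run TestSomething -v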