databricks-cli

Commit Graph

Author	SHA1	Message	Date
Ilya Kuznetsov	042c8d88c6	Custom annotations for bundle-specific JSON schema fields (#1957 ) ## Changes Adds annotations to json-schema for fields which are not covered by OpenAPI spec. Custom descriptions were copy-pasted from documentation PR which is still WIP so descriptions for some fields are missing Further improvements: * documentation autogen based on json-schema * fix missing descriptions ## Tests This script is not part of CLI package so I didn't test all corner cases. Few high-level tests were added to be sure that schema annotations is in sync with actual config --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>	2024-12-18 10:19:14 +00:00
Pieter Noordhuis	14fe03dcb9	Breakout variable lookup into separate files and tests (#1921 ) ## Changes While looking into adding variable lookups for notification destinations ([API][API]), I found the codegen approach for different classes of variable lookups a bit complex. The template had a custom field override (for service principals), the package had an override for the cluster lookup, and it didn't produce tests. The notification destinations API uses a default page size of 20 for listing. I want to use a larger page size to limit the number of API calls, so that would imply another customization on the template or a manual override. This code being rather mechanical, I used copilot to produce all instances of the resolvers and their tests (after writing one of them manually). [api]: https://docs.databricks.com/api/workspace/notificationdestinations ## Tests * Unit tests pass * Manual confirmation that lookups of warehouses still work	2024-11-21 11:28:50 +01:00
shreyas-goenka	ab20624206	Assert SDK version is consistent in the CLI generation process (#1814 ) ## Changes Followup from https://github.com/databricks/cli/pull/1809#discussion_r1790045086. User will see the following error if the SDK version does not match during CLI code generation. ``` --- FAIL: TestConsistentDatabricksSdkVersion (1.34s) info_test.go:53: Error Trace: /Users/shreyas.goenka/cli/internal/build/info_test.go:53 Error: Not equal: expected: "0c86ea6dbd9a730c24ff0d4e509603e476955ac5" actual : "6f6b1371e640f2dfeba72d365ac566368656f6b6" Diff: --- Expected +++ Actual @@ -1 +1 @@ -0c86ea6dbd9a730c24ff0d4e509603e476955ac5 +6f6b1371e640f2dfeba72d365ac566368656f6b6 Test: TestConsistentDatabricksSdkVersion Messages: please update the SDK version before generating the CLI ``` ## Tests Manually asserted that: 1. Generating the CLI without the SDK bump causes an error. 2. Generating the CLI after the SDK bump works as expected. 3. The test works even when the SDK is pinned to a specific commit like `v0.47.1-0.20241002195128-6cecc224cbf7`	2024-10-14 16:19:48 +00:00
shreyas-goenka	28b39cd3f7	Make bundle JSON schema modular with `$defs` (#1700 ) ## Changes This PR makes sweeping changes to the way we generate and test the bundle JSON schema. The main benefits are: 1. More modular JSON schema. Every definition in the schema now is one level deep and points to references instead of inlining the entire schema for a field. This unblocks PyDABs from taking a dependency on the JSON schema. 2. Generate the JSON schema during CLI code generation. Directly stream it instead of computing it at runtime whenever a user calls `databricks bundle schema`. This is nice because we no longer need to embed a partial OpenAPI spec in the CLI. Down the line, we can add a `Schema()` method to every struct in the Databricks Go SDK and remove the dependency on the OpenAPI spec altogether. It'll become more important once we decouple Go SDK structs and methods from the underlying APIs. 3. Add enum values for Go SDK fields in the JSON schema. Better autocompletion and validation for these fields. As a follow-up, we can add enum values for non-Go SDK enums as well (created internal ticket to track). 4. Use "packageName.structName" as a key to read JSON schemas from the OpenAPI spec for Go SDK structs. Before, we would use an unrolled presentation of the JSON schema (stored in `bundle_descriptions.json`), which was complex to parse and include in the final JSON schema output. This also means loading values from the OpenAPI spec for `target` schema works automatically and no longer needs custom code. 5. Support recursive types (eg: `for_each_task`). With us now using $refs everywhere it's trivial to support. 6. Using complex variables would be invalid according to the schema generated before this PR. Now that bug is fixed. In the future adding more custom rules will be easier as well due to the single level nature of the JSON schema. Since this is a complete change of approach in how we generate the JSON schema, there are a few (very minor) regressions worth calling out. 1. We'll lose a few custom descriptions for non Go SDK structs that were a part of `bundle_descriptions.json`. Support for those can be added in the future as a followup. 2. Since now the final JSON schema is a static artefact, we lose some lead time for the signal that JSON schema integration tests are failing. It's okay though since we have a lot of coverage via the existing unit tests. ## Tests Unit tests. End to end tests are being added in this PR: https://github.com/databricks/cli/pull/1726 Previous unit tests were all deleted because they were bloated. Effort was made to make the new unit tests provide (almost) equivalent coverage.	2024-09-10 13:55:18 +00:00
Andrew Nester	5fb40f9d07	Allow referencing bundle resources by name (#872 ) ## Changes Now we can define variables with values which reference different Databricks resources by name. When references like this, DABs automatically looks up the resource by this name and replaces the reference with ID of the resource referenced. Thus when the variable is used in the configuration it will contain the correct resolved ID of resource. The resolvers are code generated and thus DABs support referencing all resources which has `GetByName`-like methods in Go SDK. ### Example ``` variables: my_cluster_id: description: An existing cluster. lookup: cluster: "12.2 shared" resources: jobs: my_job: name: "My Job" tasks: - task_key: TestTask existing_cluster_id: ${var.my_cluster_id} targets: dev: variables: my_cluster_id: lookup: cluster: "dev-cluster" ``` ## Tests Added unit test + manual testing --------- Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>	2024-01-04 21:04:42 +00:00
shreyas-goenka	6002f49c87	Move bundle schema update to an internal module (#1012 ) ## Changes This PR: 1. Move code to load bundle JSON Schema descriptions from the OpenAPI spec to an internal Go module 2. Remove command line flags from the `bundle schema` command. These flags were meant for internal processes and at no point were meant for customer use. 3. Regenerate `bundle_descriptions.json` 4. Add support for `bundle: "deprecated"`. The `environments` field is tagged as deprecated in this PR and consequently will no longer be a part of the bundle schema. ## Tests Tested by regenerating the CLI against its current OpenAPI spec (as defined in `__openapi_sha`). The `bundle_descriptions.json` in this PR was generated from the code generator. Manually checked that the autocompletion / descriptions from the new bundle schema are correct.	2023-12-06 10:45:18 +00:00
shreyas-goenka	5f88af54fd	Revert automation for bundle schema documentation generation (#1018 ) ## Changes Introduced in #1007 but doesn't work well yet. This will be automated again as part of #1012.	2023-11-29 09:53:07 +00:00
shreyas-goenka	96e9545cf0	Automate the generation of bundle schema descriptions (#1007 ) ## Changes This PR makes changes required to automatically update the bundle docs during the CLI release process. We rely on `post_generate` scripts that are executed after code generation with CWD as the CLI repo root. The new `output-file` flag is introduced because stdout redirect does not work here and would otherwise require changes to our release automation CLI (deco CLI) ## Tests Manually. Regenerated the CLI and the descriptions were indeed generated for the CLI from the provided openapi spec.	2023-11-27 10:42:39 +00:00
Serge Smertin	ff98096208	Integrate with auto-release infra (#581 ) ## Changes - added changelog template - added `toolchain` to `.codegen.json` ## Tests none	2023-07-18 17:48:35 +02:00
Serge Smertin	4c4a293015	Added OpenAPI command coverage (#357 ) This PR adds the following command groups: ## Workspace-level command groups * `bricks alerts` - The alerts API can be used to perform CRUD operations on alerts. * `bricks catalogs` - A catalog is the first layer of Unity Catalog’s three-level namespace. * `bricks cluster-policies` - Cluster policy limits the ability to configure clusters based on a set of rules. * `bricks clusters` - The Clusters API allows you to create, start, edit, list, terminate, and delete clusters. * `bricks current-user` - This API allows retrieving information about currently authenticated user or service principal. * `bricks dashboards` - In general, there is little need to modify dashboards using the API. * `bricks data-sources` - This API is provided to assist you in making new query objects. * `bricks experiments` - MLflow Experiment tracking. * `bricks external-locations` - An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path. * `bricks functions` - Functions implement User-Defined Functions (UDFs) in Unity Catalog. * `bricks git-credentials` - Registers personal access token for Databricks to do operations on behalf of the user. * `bricks global-init-scripts` - The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace. * `bricks grants` - In Unity Catalog, data is secure by default. * `bricks groups` - Groups simplify identity management, making it easier to assign access to Databricks Workspace, data, and other securable objects. * `bricks instance-pools` - Instance Pools API are used to create, edit, delete and list instance pools by using ready-to-use cloud instances which reduces a cluster start and auto-scaling times. * `bricks instance-profiles` - The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with. * `bricks ip-access-lists` - IP Access List enables admins to configure IP access lists. * `bricks jobs` - The Jobs API allows you to create, edit, and delete jobs. * `bricks libraries` - The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster. * `bricks metastores` - A metastore is the top-level container of objects in Unity Catalog. * `bricks model-registry` - MLflow Model Registry commands. * `bricks permissions` - Permissions API are used to create read, write, edit, update and manage access for various users on different objects and endpoints. * `bricks pipelines` - The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. * `bricks policy-families` - View available policy families. * `bricks providers` - Databricks Providers REST API. * `bricks queries` - These endpoints are used for CRUD operations on query definitions. * `bricks query-history` - Access the history of queries through SQL warehouses. * `bricks recipient-activation` - Databricks Recipient Activation REST API. * `bricks recipients` - Databricks Recipients REST API. * `bricks repos` - The Repos API allows users to manage their git repos. * `bricks schemas` - A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace. * `bricks secrets` - The Secrets API allows you to manage secrets, secret scopes, and access permissions. * `bricks service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks serving-endpoints` - The Serving Endpoints API allows you to create, update, and delete model serving endpoints. * `bricks shares` - Databricks Shares REST API. * `bricks storage-credentials` - A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant. * `bricks table-constraints` - Primary key and foreign key constraints encode relationships between fields in tables. * `bricks tables` - A table resides in the third layer of Unity Catalog’s three-level namespace. * `bricks token-management` - Enables administrators to get all tokens and delete tokens for other users. * `bricks tokens` - The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs. * `bricks users` - User identities recognized by Databricks and represented by email addresses. * `bricks volumes` - Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files. * `bricks warehouses` - A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. * `bricks workspace` - The Workspace API allows you to list, import, export, and delete notebooks and folders. * `bricks workspace-conf` - This API allows updating known workspace settings for advanced users. ## Account-level command groups * `bricks account billable-usage` - This API allows you to download billable usage logs for the specified account and date range. * `bricks account budgets` - These APIs manage budget configuration including notifications for exceeding a budget for a period. * `bricks account credentials` - These APIs manage credential configurations for this workspace. * `bricks account custom-app-integration` - These APIs enable administrators to manage custom oauth app integrations, which is required for adding/using Custom OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account encryption-keys` - These APIs manage encryption key configurations for this workspace (optional). * `bricks account groups` - Groups simplify identity management, making it easier to assign access to Databricks Account, data, and other securable objects. * `bricks account ip-access-lists` - The Accounts IP Access List API enables account admins to configure IP access lists for access to the account console. * `bricks account log-delivery` - These APIs manage log delivery configurations for this account. * `bricks account metastore-assignments` - These APIs manage metastore assignments to a workspace. * `bricks account metastores` - These APIs manage Unity Catalog metastores for an account. * `bricks account networks` - These APIs manage network configurations for customer-managed VPCs (optional). * `bricks account o-auth-enrollment` - These APIs enable administrators to enroll OAuth for their accounts, which is required for adding/using any OAuth published/custom application integration. * `bricks account private-access` - These APIs manage private access settings for this account. * `bricks account published-app-integration` - These APIs enable administrators to manage published oauth app integrations, which is required for adding/using Published OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks account storage` - These APIs manage storage configurations for this workspace. * `bricks account storage-credentials` - These APIs manage storage credentials for a particular metastore. * `bricks account users` - User identities recognized by Databricks and represented by email addresses. * `bricks account vpc-endpoints` - These APIs manage VPC endpoint configurations for this account. * `bricks account workspace-assignment` - The Workspace Permission Assignment API allows you to manage workspace permissions for principals in your account. * `bricks account workspaces` - These APIs manage workspaces for this account.	2023-04-26 13:06:16 +02:00

10 Commits