databricks-cli/cmd/auth/profiles.go

136 lines
3.1 KiB
Go
Raw Normal View History

package auth
import (
"context"
"errors"
"fmt"
"io/fs"
"sync"
"time"
"github.com/databricks/cli/libs/cmdio"
Improve token refresh flow (#1434) ## Changes Currently, there are a number of issues with the non-happy-path flows for token refresh in the CLI. If the token refresh fails, the raw error message is presented to the user, as seen below. This message is very difficult for users to interpret and doesn't give any clear direction on how to resolve this issue. ``` Error: token refresh: Post "https://adb-<WSID>.azuredatabricks.net/oidc/v1/token": http 400: {"error":"invalid_request","error_description":"Refresh token is invalid"} ``` When logging in again, I've noticed that the timeout for logging in is very short, only 45 seconds. If a user is using a password manager and needs to login to that first, or needs to do MFA, 45 seconds may not be enough time. to an account-level profile, it is quite frustrating for users to need to re-enter account ID information when that information is already stored in the user's `.databrickscfg` file. This PR tackles these two issues. First, the presentation of error messages from `databricks auth token` is improved substantially by converting the `error` into a human-readable message. When the refresh token is invalid, it will present a command for the user to run to reauthenticate. If the token fetching failed for some other reason, that reason will be presented in a nice way, providing front-line debugging steps and ultimately redirecting users to file a ticket at this repo if they can't resolve the issue themselves. After this PR, the new error message is: ``` Error: a new access token could not be retrieved because the refresh token is invalid. To reauthenticate, run `.databricks/databricks auth login --host https://adb-<WSID>.azuredatabricks.net` ``` To improve the login flow, this PR modifies `databricks auth login` to auto-complete the account ID from the profile when present. Additionally, it increases the login timeout from 45 seconds to 1 hour to give the user sufficient time to login as needed. To test this change, I needed to refactor some components of the CLI around profile management, the token cache, and the API client used to fetch OAuth tokens. These are now settable in the context, and a demonstration of how they can be set and used is found in `auth_test.go`. Separately, this also demonstrates a sort-of integration test of the CLI by executing the Cobra command for `databricks auth token` from tests, which may be useful for testing other end-to-end functionality in the CLI. In particular, I believe this is necessary in order to set flag values (like the `--profile` flag in this case) for use in testing. ## Tests Unit tests cover the unhappy and happy paths using the mocked API client, token cache, and profiler. Manually tested --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-05-16 10:22:09 +00:00
"github.com/databricks/cli/libs/databrickscfg/profile"
"github.com/databricks/cli/libs/log"
"github.com/databricks/databricks-sdk-go"
"github.com/databricks/databricks-sdk-go/config"
"github.com/spf13/cobra"
"gopkg.in/ini.v1"
)
type profileMetadata struct {
Name string `json:"name"`
Host string `json:"host,omitempty"`
AccountID string `json:"account_id,omitempty"`
Cloud string `json:"cloud"`
AuthType string `json:"auth_type"`
Valid bool `json:"valid"`
}
func (c *profileMetadata) IsEmpty() bool {
return c.Host == "" && c.AccountID == ""
}
func (c *profileMetadata) Load(ctx context.Context, configFilePath string, skipValidate bool) {
cfg := &config.Config{
Loaders: []config.Loader{config.ConfigFile},
ConfigFile: configFilePath,
Profile: c.Name,
}
_ = cfg.EnsureResolved()
if cfg.IsAws() {
c.Cloud = "aws"
} else if cfg.IsAzure() {
c.Cloud = "azure"
} else if cfg.IsGcp() {
c.Cloud = "gcp"
}
if skipValidate {
c.Host = cfg.CanonicalHostName()
c.AuthType = cfg.AuthType
return
}
if cfg.IsAccountClient() {
a, err := databricks.NewAccountClient((*databricks.Config)(cfg))
if err != nil {
return
}
_, err = a.Workspaces.List(ctx)
c.Host = cfg.Host
c.AuthType = cfg.AuthType
if err != nil {
return
}
c.Valid = true
} else {
w, err := databricks.NewWorkspaceClient((*databricks.Config)(cfg))
if err != nil {
return
}
_, err = w.CurrentUser.Me(ctx)
c.Host = cfg.Host
c.AuthType = cfg.AuthType
if err != nil {
return
}
c.Valid = true
}
}
func newProfilesCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "profiles",
Short: "Lists profiles from ~/.databrickscfg",
Annotations: map[string]string{
"template": cmdio.Heredoc(`
{{header "Name"}} {{header "Host"}} {{header "Valid"}}
{{range .Profiles}}{{.Name | green}} {{.Host|cyan}} {{bool .Valid}}
{{end}}`),
},
}
var skipValidate bool
cmd.Flags().BoolVar(&skipValidate, "skip-validate", false, "Whether to skip validating the profiles")
cmd.RunE = func(cmd *cobra.Command, args []string) error {
var profiles []*profileMetadata
Improve token refresh flow (#1434) ## Changes Currently, there are a number of issues with the non-happy-path flows for token refresh in the CLI. If the token refresh fails, the raw error message is presented to the user, as seen below. This message is very difficult for users to interpret and doesn't give any clear direction on how to resolve this issue. ``` Error: token refresh: Post "https://adb-<WSID>.azuredatabricks.net/oidc/v1/token": http 400: {"error":"invalid_request","error_description":"Refresh token is invalid"} ``` When logging in again, I've noticed that the timeout for logging in is very short, only 45 seconds. If a user is using a password manager and needs to login to that first, or needs to do MFA, 45 seconds may not be enough time. to an account-level profile, it is quite frustrating for users to need to re-enter account ID information when that information is already stored in the user's `.databrickscfg` file. This PR tackles these two issues. First, the presentation of error messages from `databricks auth token` is improved substantially by converting the `error` into a human-readable message. When the refresh token is invalid, it will present a command for the user to run to reauthenticate. If the token fetching failed for some other reason, that reason will be presented in a nice way, providing front-line debugging steps and ultimately redirecting users to file a ticket at this repo if they can't resolve the issue themselves. After this PR, the new error message is: ``` Error: a new access token could not be retrieved because the refresh token is invalid. To reauthenticate, run `.databricks/databricks auth login --host https://adb-<WSID>.azuredatabricks.net` ``` To improve the login flow, this PR modifies `databricks auth login` to auto-complete the account ID from the profile when present. Additionally, it increases the login timeout from 45 seconds to 1 hour to give the user sufficient time to login as needed. To test this change, I needed to refactor some components of the CLI around profile management, the token cache, and the API client used to fetch OAuth tokens. These are now settable in the context, and a demonstration of how they can be set and used is found in `auth_test.go`. Separately, this also demonstrates a sort-of integration test of the CLI by executing the Cobra command for `databricks auth token` from tests, which may be useful for testing other end-to-end functionality in the CLI. In particular, I believe this is necessary in order to set flag values (like the `--profile` flag in this case) for use in testing. ## Tests Unit tests cover the unhappy and happy paths using the mocked API client, token cache, and profiler. Manually tested --------- Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-05-16 10:22:09 +00:00
iniFile, err := profile.DefaultProfiler.Get(cmd.Context())
if errors.Is(err, fs.ErrNotExist) {
// return empty list for non-configured machines
iniFile = &config.File{
File: &ini.File{},
}
} else if err != nil {
return fmt.Errorf("cannot parse config file: %w", err)
}
var wg sync.WaitGroup
for _, v := range iniFile.Sections() {
hash := v.KeysHash()
profile := &profileMetadata{
Name: v.Name(),
Host: hash["host"],
AccountID: hash["account_id"],
}
if profile.IsEmpty() {
continue
}
wg.Add(1)
go func() {
ctx := cmd.Context()
t := time.Now()
profile.Load(ctx, iniFile.Path(), skipValidate)
log.Debugf(ctx, "Profile %q took %s to load", profile.Name, time.Since(t))
wg.Done()
}()
profiles = append(profiles, profile)
}
wg.Wait()
Added OpenAPI command coverage (#357) This PR adds the following command groups: ## Workspace-level command groups * `bricks alerts` - The alerts API can be used to perform CRUD operations on alerts. * `bricks catalogs` - A catalog is the first layer of Unity Catalog’s three-level namespace. * `bricks cluster-policies` - Cluster policy limits the ability to configure clusters based on a set of rules. * `bricks clusters` - The Clusters API allows you to create, start, edit, list, terminate, and delete clusters. * `bricks current-user` - This API allows retrieving information about currently authenticated user or service principal. * `bricks dashboards` - In general, there is little need to modify dashboards using the API. * `bricks data-sources` - This API is provided to assist you in making new query objects. * `bricks experiments` - MLflow Experiment tracking. * `bricks external-locations` - An external location is an object that combines a cloud storage path with a storage credential that authorizes access to the cloud storage path. * `bricks functions` - Functions implement User-Defined Functions (UDFs) in Unity Catalog. * `bricks git-credentials` - Registers personal access token for Databricks to do operations on behalf of the user. * `bricks global-init-scripts` - The Global Init Scripts API enables Workspace administrators to configure global initialization scripts for their workspace. * `bricks grants` - In Unity Catalog, data is secure by default. * `bricks groups` - Groups simplify identity management, making it easier to assign access to Databricks Workspace, data, and other securable objects. * `bricks instance-pools` - Instance Pools API are used to create, edit, delete and list instance pools by using ready-to-use cloud instances which reduces a cluster start and auto-scaling times. * `bricks instance-profiles` - The Instance Profiles API allows admins to add, list, and remove instance profiles that users can launch clusters with. * `bricks ip-access-lists` - IP Access List enables admins to configure IP access lists. * `bricks jobs` - The Jobs API allows you to create, edit, and delete jobs. * `bricks libraries` - The Libraries API allows you to install and uninstall libraries and get the status of libraries on a cluster. * `bricks metastores` - A metastore is the top-level container of objects in Unity Catalog. * `bricks model-registry` - MLflow Model Registry commands. * `bricks permissions` - Permissions API are used to create read, write, edit, update and manage access for various users on different objects and endpoints. * `bricks pipelines` - The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. * `bricks policy-families` - View available policy families. * `bricks providers` - Databricks Providers REST API. * `bricks queries` - These endpoints are used for CRUD operations on query definitions. * `bricks query-history` - Access the history of queries through SQL warehouses. * `bricks recipient-activation` - Databricks Recipient Activation REST API. * `bricks recipients` - Databricks Recipients REST API. * `bricks repos` - The Repos API allows users to manage their git repos. * `bricks schemas` - A schema (also called a database) is the second layer of Unity Catalog’s three-level namespace. * `bricks secrets` - The Secrets API allows you to manage secrets, secret scopes, and access permissions. * `bricks service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks serving-endpoints` - The Serving Endpoints API allows you to create, update, and delete model serving endpoints. * `bricks shares` - Databricks Shares REST API. * `bricks storage-credentials` - A storage credential represents an authentication and authorization mechanism for accessing data stored on your cloud tenant. * `bricks table-constraints` - Primary key and foreign key constraints encode relationships between fields in tables. * `bricks tables` - A table resides in the third layer of Unity Catalog’s three-level namespace. * `bricks token-management` - Enables administrators to get all tokens and delete tokens for other users. * `bricks tokens` - The Token API allows you to create, list, and revoke tokens that can be used to authenticate and access Databricks REST APIs. * `bricks users` - User identities recognized by Databricks and represented by email addresses. * `bricks volumes` - Volumes are a Unity Catalog (UC) capability for accessing, storing, governing, organizing and processing files. * `bricks warehouses` - A SQL warehouse is a compute resource that lets you run SQL commands on data objects within Databricks SQL. * `bricks workspace` - The Workspace API allows you to list, import, export, and delete notebooks and folders. * `bricks workspace-conf` - This API allows updating known workspace settings for advanced users. ## Account-level command groups * `bricks account billable-usage` - This API allows you to download billable usage logs for the specified account and date range. * `bricks account budgets` - These APIs manage budget configuration including notifications for exceeding a budget for a period. * `bricks account credentials` - These APIs manage credential configurations for this workspace. * `bricks account custom-app-integration` - These APIs enable administrators to manage custom oauth app integrations, which is required for adding/using Custom OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account encryption-keys` - These APIs manage encryption key configurations for this workspace (optional). * `bricks account groups` - Groups simplify identity management, making it easier to assign access to Databricks Account, data, and other securable objects. * `bricks account ip-access-lists` - The Accounts IP Access List API enables account admins to configure IP access lists for access to the account console. * `bricks account log-delivery` - These APIs manage log delivery configurations for this account. * `bricks account metastore-assignments` - These APIs manage metastore assignments to a workspace. * `bricks account metastores` - These APIs manage Unity Catalog metastores for an account. * `bricks account networks` - These APIs manage network configurations for customer-managed VPCs (optional). * `bricks account o-auth-enrollment` - These APIs enable administrators to enroll OAuth for their accounts, which is required for adding/using any OAuth published/custom application integration. * `bricks account private-access` - These APIs manage private access settings for this account. * `bricks account published-app-integration` - These APIs enable administrators to manage published oauth app integrations, which is required for adding/using Published OAuth App Integration like Tableau Cloud for Databricks in AWS cloud. * `bricks account service-principals` - Identities for use with jobs, automated tools, and systems such as scripts, apps, and CI/CD platforms. * `bricks account storage` - These APIs manage storage configurations for this workspace. * `bricks account storage-credentials` - These APIs manage storage credentials for a particular metastore. * `bricks account users` - User identities recognized by Databricks and represented by email addresses. * `bricks account vpc-endpoints` - These APIs manage VPC endpoint configurations for this account. * `bricks account workspace-assignment` - The Workspace Permission Assignment API allows you to manage workspace permissions for principals in your account. * `bricks account workspaces` - These APIs manage workspaces for this account.
2023-04-26 11:06:16 +00:00
return cmdio.Render(cmd.Context(), struct {
Profiles []*profileMetadata `json:"profiles"`
}{profiles})
}
return cmd
}