databricks-cli/libs/cmdio/render_test.go


Use Go SDK Iterators when listing resources with the CLI (#1202)

## Changes

Currently, when the CLI runs a list API call (like list jobs), it uses the `List*All` methods from the SDK, which list all resources in the collection. This is very slow for large collections: if you need to list all jobs from a workspace that has 10,000+ jobs, you'll be waiting for at least 100 RPCs to complete before seeing any output.

As an alternative to the `List*All()` methods, the SDK recently added an iterator data structure that allows traversing the collection without needing to completely list it first. New pages are fetched lazily if the next requested item belongs to the next page. Using the `List()` methods that return these iterators, the CLI can print out part of the response before the complete collection has been fetched.

This involves a pretty major rewrite of the rendering logic in `cmdio`. The idea there is to define custom rendering logic based on the type of the provided resource.

There are three renderer interfaces:

1. textRenderer: supports printing something in a textual format (i.e. not JSON, and not templated).
2. jsonRenderer: supports printing something in a pretty-printed JSON format.
3. templateRenderer: supports printing something using a text template.

There are also three renderer implementations:

1. readerRenderer: supports printing a reader. This only implements the textRenderer interface.
2. iteratorRenderer: supports printing a `listing.Iterator` from the Go SDK. This implements jsonRenderer and templateRenderer, buffering 20 resources at a time before writing them to the output.
3. defaultRenderer: supports printing arbitrary resources (the previous implementation).

Callers use `cmdio.Render()` to render an individual resource or an `io.Reader`, and `cmdio.RenderIterator()` to render an iterator. This separate method is needed to safely match on the type of the iterator, since Go does not allow runtime type matches on generic types with an existential type parameter.

One other change that needs to happen is to split the templates used for the text representation of list resources into a header template and a row template. The template is now executed multiple times for List API calls, but the header should only be printed once. To support this, I have added `headerTemplate` to `cmdIO`, and I have also changed `RenderWithTemplate` to include a `headerTemplate` parameter everywhere.

## Tests

- [x] Unit tests for text rendering logic
- [x] Unit test for reflection-based iterator construction.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
2024-02-21 14:16:36 +00:00
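The batching behaviour described in the commit message (print the header once, then flush rows in fixed-size batches while the iterator is drained) can be sketched roughly as follows. This is an illustration only, not the `cmdio` implementation: `Iterator`, `sliceIterator`, `renderBatched`, and `renderInts` are hypothetical names, and the iterator interface is simplified to omit the `context.Context` argument that the SDK's `listing.Iterator` takes.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// Iterator is a simplified, hypothetical stand-in for the SDK's
// listing.Iterator (the real interface also takes a context.Context).
type Iterator[T any] interface {
	HasNext() bool
	Next() (T, error)
}

// sliceIterator is a hypothetical in-memory iterator over a slice.
type sliceIterator[T any] struct{ items []T }

func (s *sliceIterator[T]) HasNext() bool { return len(s.items) > 0 }

func (s *sliceIterator[T]) Next() (T, error) {
	var zero T
	if !s.HasNext() {
		return zero, errors.New("no more items")
	}
	item := s.items[0]
	s.items = s.items[1:]
	return item, nil
}

// renderBatched writes the header once, then drains the iterator and calls
// renderRows each time batchSize items have been buffered, plus once more
// for any remainder. batchSize is assumed to be positive.
func renderBatched[T any](w io.Writer, it Iterator[T], header string, batchSize int, renderRows func(io.Writer, []T) error) error {
	if header != "" {
		if _, err := fmt.Fprintln(w, header); err != nil {
			return err
		}
	}
	buf := make([]T, 0, batchSize)
	for it.HasNext() {
		item, err := it.Next()
		if err != nil {
			return err
		}
		buf = append(buf, item)
		if len(buf) == batchSize {
			if err := renderRows(w, buf); err != nil {
				return err
			}
			buf = buf[:0] // reuse the buffer for the next batch
		}
	}
	if len(buf) > 0 {
		return renderRows(w, buf)
	}
	return nil
}

// renderInts is a small demonstration helper: it renders the integers under
// an "n" header and reports the output and the number of batches flushed.
func renderInts(items []int, batchSize int) (string, int, error) {
	var sb strings.Builder
	batches := 0
	flush := func(w io.Writer, rows []int) error {
		batches++
		for _, r := range rows {
			if _, err := fmt.Fprintf(w, "%d\n", r); err != nil {
				return err
			}
		}
		return nil
	}
	err := renderBatched[int](&sb, &sliceIterator[int]{items: items}, "n", batchSize, flush)
	return sb.String(), batches, err
}
```

Calling `renderInts([]int{1, 2, 3, 4, 5}, 2)` flushes three batches (two full, one partial) and emits the header exactly once, which mirrors why the row template and the header template have to be kept separate.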
package cmdio

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"testing"

	"github.com/databricks/cli/libs/flags"
	"github.com/databricks/databricks-sdk-go/listing"
	"github.com/databricks/databricks-sdk-go/service/provisioning"
	"github.com/stretchr/testify/assert"
)

// testCase describes one rendering scenario: the value to render, the output
// format and templates to use, and either the expected output or the expected
// error message.
type testCase struct {
	name           string
	v              any
	outputFormat   flags.Output
	headerTemplate string
	template       string
	expected       string
	errMessage     string
}

var dummyWorkspace1 = provisioning.Workspace{
	WorkspaceId:   123,
	WorkspaceName: "abc",
}

var dummyWorkspace2 = provisioning.Workspace{
	WorkspaceId:   456,
	WorkspaceName: "def",
}

// dummyIterator is an in-memory listing.Iterator over a fixed slice of workspaces.
type dummyIterator struct {
	items []*provisioning.Workspace
}

func (d *dummyIterator) HasNext(_ context.Context) bool {
	return len(d.items) > 0
}

func (d *dummyIterator) Next(ctx context.Context) (*provisioning.Workspace, error) {
	if !d.HasNext(ctx) {
		return nil, errors.New("no more items")
	}
	item := d.items[0]
	d.items = d.items[1:]
	return item, nil
}

// makeWorkspaces returns count workspaces, alternating between the two dummy workspaces.
func makeWorkspaces(count int) []*provisioning.Workspace {
	res := make([]*provisioning.Workspace, 0, count)
	next := []*provisioning.Workspace{&dummyWorkspace1, &dummyWorkspace2}
	for range count {
		// Rotate the two-element queue so consecutive items alternate.
		n := next[0]
		next = append(next[1:], n)
		res = append(res, n)
	}
	return res
}

// makeIterator returns an iterator over count alternating dummy workspaces.
func makeIterator(count int) listing.Iterator[*provisioning.Workspace] {
	items := make([]*provisioning.Workspace, 0, count)
	items = append(items, makeWorkspaces(count)...)
	return &dummyIterator{
		items: items,
	}
}

// makeBigOutput builds the expected templated text output for count workspaces.
func makeBigOutput(count int) string {
	res := bytes.Buffer{}
	for _, ws := range makeWorkspaces(count) {
		fmt.Fprintf(&res, "%d %s\n", ws.WorkspaceId, ws.WorkspaceName)
	}
	return res.String()
}

// must returns a, panicking if e is non-nil.
func must[T any](a T, e error) T {
	if e != nil {
		panic(e)
	}
	return a
}

var testCases = []testCase{
	{
		name:           "Workspace with header and template",
		v:              dummyWorkspace1,
		outputFormat:   flags.OutputText,
		headerTemplate: "id\tname",
		template:       "{{.WorkspaceId}}\t{{.WorkspaceName}}",
		expected: `id name
123 abc`,
	},
	{
		name:         "Workspace with no header and template",
		v:            dummyWorkspace1,
		outputFormat: flags.OutputText,
		template:     "{{.WorkspaceId}}\t{{.WorkspaceName}}",
		expected:     `123 abc`,
	},
	{
		name:         "Workspace with no header and no template",
		v:            dummyWorkspace1,
		outputFormat: flags.OutputText,
		expected: `{
"workspace_id":123,
"workspace_name":"abc"
}
`,
	},
	{
		name:           "Workspace Iterator with header and template",
		v:              makeIterator(2),
		outputFormat:   flags.OutputText,
		headerTemplate: "id\tname",
		template:       "{{range .}}{{.WorkspaceId}}\t{{.WorkspaceName}}\n{{end}}",
		expected: `id name
123 abc
456 def
`,
	},
	{
		name:         "Workspace Iterator with no header and template",
		v:            makeIterator(2),
		outputFormat: flags.OutputText,
		template:     "{{range .}}{{.WorkspaceId}}\t{{.WorkspaceName}}\n{{end}}",
		expected: `123 abc
456 def
`,
	},
	{
		name:         "Workspace Iterator with no header and no template",
		v:            makeIterator(2),
		outputFormat: flags.OutputText,
		expected:     string(must(json.MarshalIndent(makeWorkspaces(2), "", " "))) + "\n",
	},
	{
		name:           "Big Workspace Iterator with template",
		v:              makeIterator(234),
		outputFormat:   flags.OutputText,
		headerTemplate: "id\tname",
		template:       "{{range .}}{{.WorkspaceId}}\t{{.WorkspaceName}}\n{{end}}",
		expected:       "id name\n" + makeBigOutput(234),
	},
	{
		name:         "Big Workspace Iterator with no template",
		v:            makeIterator(234),
		outputFormat: flags.OutputText,
		expected:     string(must(json.MarshalIndent(makeWorkspaces(234), "", " "))) + "\n",
	},
	{
		name:         "io.Reader",
		v:            strings.NewReader("a test"),
		outputFormat: flags.OutputText,
		expected:     "a test",
	},
	{
		name:         "io.Reader",
		v:            strings.NewReader("a test"),
		outputFormat: flags.OutputJSON,
		errMessage:   "json output not supported",
	},
}

func TestRender(t *testing.T) {
	for _, c := range testCases {
		t.Run(c.name, func(t *testing.T) {
			output := &bytes.Buffer{}
			ctx := context.Background()
			cmdIO := NewIO(ctx, c.outputFormat, nil, output, output, c.headerTemplate, c.template)
			ctx = InContext(ctx, cmdIO)
			// Iterators go through RenderIterator; everything else goes through Render.
			var err error
			if vv, ok := c.v.(listing.Iterator[*provisioning.Workspace]); ok {
				err = RenderIterator(ctx, vv)
			} else {
				err = Render(ctx, c.v)
			}
			if c.errMessage != "" {
				assert.ErrorContains(t, err, c.errMessage)
			} else {
				assert.NoError(t, err)
				assert.Equal(t, c.expected, output.String())
			}
		})
	}
}
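
The type assertion in `TestRender` works because it names a fully instantiated type, `listing.Iterator[*provisioning.Workspace]`. The following minimal sketch illustrates why a single generic entry point cannot match "an iterator of any element type" at runtime. All names here (`Iterator`, `sliceIter`, `describe`) are hypothetical, and the interface is a simplified stand-in for the SDK's iterator without the context argument.

```go
package main

import "errors"

// Iterator is a hypothetical, simplified generic iterator interface.
type Iterator[T any] interface {
	HasNext() bool
	Next() (T, error)
}

// sliceIter is a hypothetical in-memory implementation of Iterator[T].
type sliceIter[T any] struct{ items []T }

func (s *sliceIter[T]) HasNext() bool { return len(s.items) > 0 }

func (s *sliceIter[T]) Next() (T, error) {
	var zero T
	if len(s.items) == 0 {
		return zero, errors.New("no more items")
	}
	item := s.items[0]
	s.items = s.items[1:]
	return item, nil
}

// describe matches on the dynamic type of v. Each case must name a fully
// instantiated type: Go has no syntax for "Iterator[T] for some unknown T",
// so element types that are not listed fall through to the default case.
func describe(v any) string {
	switch v.(type) {
	case Iterator[int]:
		return "iterator of int"
	case Iterator[string]:
		return "iterator of string"
	default:
		return "unknown"
	}
}
```

With these definitions, `describe(&sliceIter[int]{})` matches the first case, while `describe(&sliceIter[float64]{})` falls through to the default even though the value is an iterator. A generic function such as `RenderIterator` sidesteps this because the element type is fixed at compile time at the call site.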