Compare commits

...

8 Commits

Author SHA1 Message Date
shreyas-goenka dab88497be
Merge be2d802d13 into 984c38e03e 2024-11-20 22:12:16 +05:30
shreyas-goenka 984c38e03e
Add unique ID to `root_path` for bundle integration test fixtures (#1917)
## Changes
Integration tests using these fixtures could be flaky when run in
parallel under the same user's identity. They could also pick up state
left over from previous runs.

This PR adds a UUID to the root_path to force independent bundle
deployments for every test run.

I have checked that all bundles in `internal/bundle/bundles` have
`root_path` namespaced to a UUID.

## Tests
Self testing.
2024-11-20 16:30:10 +00:00
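
A minimal sketch of the namespacing idea, separate from the test helpers themselves: every run derives its own `root_path` from a fresh UUID, mirroring the `~/.bundle/{{.unique_id}}` pattern used by the fixtures in this compare (the path layout here is illustrative).

```go
package main

import (
	"fmt"

	"github.com/google/uuid"
)

func main() {
	// Each test run deploys into its own bundle root, so parallel runs under
	// the same user identity cannot clobber or reuse each other's state.
	uniqueId := uuid.New().String()
	rootPath := fmt.Sprintf("~/.bundle/%s", uniqueId)
	fmt.Println(rootPath) // e.g. ~/.bundle/9b7a5d2e-...
}
```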
Pieter Noordhuis ade95d9649
[Release] Release v0.235.0 (#1918)
**Note:** the `bundle generate` command now uses the `.<resource-type>.yml`
sub-extension for the configuration files it writes. Existing configuration
files that do not use this sub-extension are renamed to include it.

Bundles:
* Make `TableName` field part of quality monitor schema
([#1903](https://github.com/databricks/cli/pull/1903)).
* Do not prepend paths starting with ~ or variable reference
([#1905](https://github.com/databricks/cli/pull/1905)).
* Fix workspace extensions filer accidentally reading notebooks
([#1891](https://github.com/databricks/cli/pull/1891)).
* Fix template initialization when running on Databricks
([#1912](https://github.com/databricks/cli/pull/1912)).
* Source-linked deployments for bundles in the workspace
([#1884](https://github.com/databricks/cli/pull/1884)).
* Added integration test to deploy bundle to /Shared root path
([#1914](https://github.com/databricks/cli/pull/1914)).
* Update filenames used by bundle generate to use `.<resource-type>.yml`
([#1901](https://github.com/databricks/cli/pull/1901)).

Internal:
* Extract functionality to detect if the CLI is running on DBR
([#1889](https://github.com/databricks/cli/pull/1889)).
* Consolidate test helpers for `io/fs`
([#1906](https://github.com/databricks/cli/pull/1906)).
* Use `fs.FS` interface to read template
([#1910](https://github.com/databricks/cli/pull/1910)).
* Use `filer.Filer` to write template instantiation
([#1911](https://github.com/databricks/cli/pull/1911)).
2024-11-20 14:48:18 +00:00
Andrew Nester 592e1111b7
Update filenames used by bundle generate to use `.<resource-type>.yml` (#1901)
## Changes
Update filenames used by bundle generate to use `.<resource-type>.yml`.

Similar to [Add sub-extension to resource files in built-in templates by
shreyas-goenka · Pull Request #1777 ·
databricks/cli](https://github.com/databricks/cli/pull/1777)

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-11-20 13:53:25 +01:00
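
The migration behavior this introduces can be sketched as a small stand-alone helper; `renameGenerated` below is hypothetical and only distills the rename-then-write logic shown in the `job.go` and `pipeline.go` diffs further down (it ignores the case where no old file exists).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// renameGenerated moves an existing "<key>.yml" to the new
// "<key>.<resource-type>.yml" name so that repeatedly running
// `bundle generate` does not leave duplicated resource files behind.
func renameGenerated(configDir, key, resourceType string) (string, error) {
	oldName := filepath.Join(configDir, fmt.Sprintf("%s.yml", key))
	newName := filepath.Join(configDir, fmt.Sprintf("%s.%s.yml", key, resourceType))

	err := os.Rename(oldName, newName)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return "", err
	}
	return newName, nil
}

func main() {
	name, err := renameGenerated("resources", "test_job", "job")
	fmt.Println(name, err) // resources/test_job.job.yml <nil>
}
```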
Andrew Nester fab3e8f168
Added integration test to deploy bundle to /Shared root path (#1914)
## Changes
Added integration test to deploy bundle to /Shared root path

## Tests
```
--- PASS: TestAccDeployBasicToSharedWorkspace (24.58s)
PASS
coverage: 31.2% of statements in ./...
ok  	github.com/databricks/cli/internal/bundle	25.572s	coverage: 31.2% of statements in ./...
```

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-11-20 12:20:39 +00:00
Ilya Kuznetsov 756e55fabc
Source-linked deployments for bundles in the workspace (#1884)
## Changes

This change adds a preset for source-linked deployments. It is enabled
by default for targets in `development` mode **if** the Databricks CLI
is running from the `/Workspace` directory on DBR. It does not have an
effect when running the CLI anywhere else.

Key highlights:
1. Files in this mode are not uploaded to the workspace.
2. Created resources reference the source files instead of their workspace
copies.

## Tests
1. An apply-presets unit test covering the conditional logic.
2. A high-level process-target-mode unit test covering the integration
between mutators.

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-11-20 13:22:27 +01:00
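
A self-contained sketch of the gating condition described above, without the CLI's internal packages: `runsOnRuntime` stands in for the real `dbr.RunsOnRuntime(ctx)` check and approximates it with the `DATABRICKS_RUNTIME_VERSION` environment variable, an assumption made only for this illustration.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// sourceLinkedDefault mirrors the gating in this change: the preset only
// takes effect when the CLI runs on the Databricks Runtime and the bundle's
// sync root already lives under /Workspace/.
func sourceLinkedDefault(syncRootPath string) bool {
	runsOnRuntime := os.Getenv("DATABRICKS_RUNTIME_VERSION") != ""
	return runsOnRuntime && strings.HasPrefix(syncRootPath, "/Workspace/")
}

func main() {
	fmt.Println(sourceLinkedDefault("/Workspace/Users/someone@example.com/my_bundle"))
	fmt.Println(sourceLinkedDefault("/home/someone/my_bundle"))
}
```

Outside those conditions the preset is forced off with a warning, as the `apply_presets.go` diff below shows.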
Pieter Noordhuis 886e14910c
Fix template initialization when running on Databricks (#1912)
## Changes

When running the CLI on Databricks Runtime (DBR), use the
extension-aware filer to write an instantiated template if the instance
path is located in the workspace filesystem.

Notebooks cannot be written through the workspace filesystem's FUSE
mount. As a result, this is the only method for initializing templates
that contain notebooks when running the CLI on DBR and writing to the
workspace filesystem.

Depends on #1910 and #1911.

Supersedes #1744.

## Tests

* Manually confirmed I can initialize a template with notebooks when
running the CLI from the web terminal.
2024-11-20 11:42:23 +00:00
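
The routing decision can be sketched without the CLI's internals; `chooseOutputFiler` is a hypothetical stand-in for the `constructOutputFiler` function added in the `init.go` diff below, returning a label instead of a real `filer.Filer`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// chooseOutputFiler picks how the instantiated template should be written:
// on DBR, paths under /Workspace/ go through the extension-aware workspace
// filer (the FUSE mount cannot create notebooks); everything else uses the
// plain local filer. onDBR stands in for dbr.RunsOnRuntime(ctx).
func chooseOutputFiler(outputDir string, onDBR bool) (string, error) {
	abs, err := filepath.Abs(outputDir)
	if err != nil {
		return "", err
	}
	if onDBR && strings.HasPrefix(abs, "/Workspace/") {
		return "workspace-files-extensions", nil
	}
	return "local", nil
}

func main() {
	f, _ := chooseOutputFiler("/Workspace/Users/someone@example.com/project", true)
	fmt.Println(f) // workspace-files-extensions
	f, _ = chooseOutputFiler("./project", false)
	fmt.Println(f) // local
}
```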
Shreyas Goenka be2d802d13
Skip prefixes for schema names when catalog is already namespaced to current user 2024-10-14 16:43:58 +02:00
24 changed files with 640 additions and 24 deletions

View File

@@ -1,5 +1,28 @@
 # Version changelog
 
+## [Release] Release v0.235.0
+
+**Note:** the `bundle generate` command now uses the `.<resource-type>.yml`
+sub-extension for the configuration files it writes. Existing configuration
+files that do not use this sub-extension are renamed to include it.
+
+Bundles:
+ * Make `TableName` field part of quality monitor schema ([#1903](https://github.com/databricks/cli/pull/1903)).
+ * Do not prepend paths starting with ~ or variable reference ([#1905](https://github.com/databricks/cli/pull/1905)).
+ * Fix workspace extensions filer accidentally reading notebooks ([#1891](https://github.com/databricks/cli/pull/1891)).
+ * Fix template initialization when running on Databricks ([#1912](https://github.com/databricks/cli/pull/1912)).
+ * Source-linked deployments for bundles in the workspace ([#1884](https://github.com/databricks/cli/pull/1884)).
+ * Added integration test to deploy bundle to /Shared root path ([#1914](https://github.com/databricks/cli/pull/1914)).
+ * Update filenames used by bundle generate to use `.<resource-type>.yml` ([#1901](https://github.com/databricks/cli/pull/1901)).
+
+Internal:
+ * Extract functionality to detect if the CLI is running on DBR ([#1889](https://github.com/databricks/cli/pull/1889)).
+ * Consolidate test helpers for `io/fs` ([#1906](https://github.com/databricks/cli/pull/1906)).
+ * Use `fs.FS` interface to read template ([#1910](https://github.com/databricks/cli/pull/1910)).
+ * Use `filer.Filer` to write template instantiation ([#1911](https://github.com/databricks/cli/pull/1911)).
+
 ## [Release] Release v0.234.0
 
 Bundles:

View File

@@ -9,8 +9,10 @@ import (
     "github.com/databricks/cli/bundle"
     "github.com/databricks/cli/bundle/config"
+    "github.com/databricks/cli/libs/dbr"
     "github.com/databricks/cli/libs/diag"
     "github.com/databricks/cli/libs/dyn"
+    "github.com/databricks/cli/libs/log"
     "github.com/databricks/cli/libs/textutil"
     "github.com/databricks/databricks-sdk-go/service/catalog"
     "github.com/databricks/databricks-sdk-go/service/jobs"
@@ -188,6 +190,14 @@ func (m *applyPresets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnos
             diags = diags.Extend(diag.Errorf("schema %s is not defined", key))
             continue
         }
+        // If the catalog is already namespaced to the current user, we don't need
+        // to prefix the schema name since it already falls under the user's namespace.
+        if containsUserIdentity(s.CatalogName, b.Config.Workspace.CurrentUser) {
+            log.Debugf(ctx, "Skipping schema %s since catalog %s already contains the user's identity", s.Name, s.CatalogName)
+            continue
+        }
         s.Name = normalizePrefix(prefix) + s.Name
         // HTTP API for schemas doesn't yet support tags. It's only supported in
         // the Databricks UI and via the SQL API.
@@ -221,6 +231,15 @@ func (m *applyPresets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnos
         dashboard.DisplayName = prefix + dashboard.DisplayName
     }
 
+    if config.IsExplicitlyEnabled((b.Config.Presets.SourceLinkedDeployment)) {
+        isDatabricksWorkspace := dbr.RunsOnRuntime(ctx) && strings.HasPrefix(b.SyncRootPath, "/Workspace/")
+        if !isDatabricksWorkspace {
+            disabled := false
+            b.Config.Presets.SourceLinkedDeployment = &disabled
+            diags = diags.Extend(diag.Warningf("source-linked deployment is available only in the Databricks Workspace"))
+        }
+    }
+
     return diags
 }

View File

@@ -2,13 +2,16 @@ package mutator_test
 import (
     "context"
+    "runtime"
     "testing"
 
     "github.com/databricks/cli/bundle"
     "github.com/databricks/cli/bundle/config"
     "github.com/databricks/cli/bundle/config/mutator"
     "github.com/databricks/cli/bundle/config/resources"
+    "github.com/databricks/cli/libs/dbr"
     "github.com/databricks/databricks-sdk-go/service/catalog"
+    "github.com/databricks/databricks-sdk-go/service/iam"
     "github.com/databricks/databricks-sdk-go/service/jobs"
     "github.com/stretchr/testify/require"
 )
@@ -96,12 +99,53 @@ func TestApplyPresetsPrefixForUcSchema(t *testing.T) {
             },
             want: "schema1",
         },
+        {
+            name:   "skip prefix because catalog contains short name",
+            prefix: "[prefix]",
+            schema: &resources.Schema{
+                CreateSchema: &catalog.CreateSchema{
+                    Name:        "schema1",
+                    CatalogName: "dev_john_smith_test_catalog",
+                },
+            },
+            want: "schema1",
+        },
+        {
+            name:   "skip prefix because catalog contains email",
+            prefix: "[prefix]",
+            schema: &resources.Schema{
+                CreateSchema: &catalog.CreateSchema{
+                    Name:        "schema1",
+                    CatalogName: "dev_john.smith@databricks.com_test_catalog",
+                },
+            },
+            want: "schema1",
+        },
+        {
+            name:   "add prefix because catalog is not namespaced to user",
+            prefix: "[prefix]",
+            schema: &resources.Schema{
+                CreateSchema: &catalog.CreateSchema{
+                    Name:        "schema1",
+                    CatalogName: "test_catalog",
+                },
+            },
+            want: "prefix_schema1",
+        },
     }
 
     for _, tt := range tests {
         t.Run(tt.name, func(t *testing.T) {
             b := &bundle.Bundle{
                 Config: config.Root{
+                    Workspace: config.Workspace{
+                        CurrentUser: &config.User{
+                            ShortName: "john_smith",
+                            User: &iam.User{
+                                UserName: "john.smith@databricks.com",
+                            },
+                        },
+                    },
                     Resources: config.Resources{
                         Schemas: map[string]*resources.Schema{
                             "schema1": tt.schema,
@@ -364,3 +408,86 @@ func TestApplyPresetsResourceNotDefined(t *testing.T) {
         })
     }
 }
+
+func TestApplyPresetsSourceLinkedDeployment(t *testing.T) {
+    if runtime.GOOS == "windows" {
+        t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
+    }
+
+    testContext := context.Background()
+    enabled := true
+    disabled := false
+    workspacePath := "/Workspace/user.name@company.com"
+
+    tests := []struct {
+        bundlePath      string
+        ctx             context.Context
+        name            string
+        initialValue    *bool
+        expectedValue   *bool
+        expectedWarning string
+    }{
+        {
+            name:          "preset enabled, bundle in Workspace, databricks runtime",
+            bundlePath:    workspacePath,
+            ctx:           dbr.MockRuntime(testContext, true),
+            initialValue:  &enabled,
+            expectedValue: &enabled,
+        },
+        {
+            name:            "preset enabled, bundle not in Workspace, databricks runtime",
+            bundlePath:      "/Users/user.name@company.com",
+            ctx:             dbr.MockRuntime(testContext, true),
+            initialValue:    &enabled,
+            expectedValue:   &disabled,
+            expectedWarning: "source-linked deployment is available only in the Databricks Workspace",
+        },
+        {
+            name:            "preset enabled, bundle in Workspace, not databricks runtime",
+            bundlePath:      workspacePath,
+            ctx:             dbr.MockRuntime(testContext, false),
+            initialValue:    &enabled,
+            expectedValue:   &disabled,
+            expectedWarning: "source-linked deployment is available only in the Databricks Workspace",
+        },
+        {
+            name:          "preset disabled, bundle in Workspace, databricks runtime",
+            bundlePath:    workspacePath,
+            ctx:           dbr.MockRuntime(testContext, true),
+            initialValue:  &disabled,
+            expectedValue: &disabled,
+        },
+        {
+            name:          "preset nil, bundle in Workspace, databricks runtime",
+            bundlePath:    workspacePath,
+            ctx:           dbr.MockRuntime(testContext, true),
+            initialValue:  nil,
+            expectedValue: nil,
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            b := &bundle.Bundle{
+                SyncRootPath: tt.bundlePath,
+                Config: config.Root{
+                    Presets: config.Presets{
+                        SourceLinkedDeployment: tt.initialValue,
+                    },
+                },
+            }
+
+            diags := bundle.Apply(tt.ctx, b, mutator.ApplyPresets())
+            if diags.HasError() {
+                t.Fatalf("unexpected error: %v", diags)
+            }
+
+            if tt.expectedWarning != "" {
+                require.Equal(t, tt.expectedWarning, diags[0].Summary)
+            }
+
+            require.Equal(t, tt.expectedValue, b.Config.Presets.SourceLinkedDeployment)
+        })
+    }
+}

View File

@@ -6,6 +6,7 @@ import (
     "github.com/databricks/cli/bundle"
     "github.com/databricks/cli/bundle/config"
+    "github.com/databricks/cli/libs/dbr"
     "github.com/databricks/cli/libs/diag"
     "github.com/databricks/cli/libs/dyn"
     "github.com/databricks/cli/libs/iamutil"
@@ -57,12 +58,24 @@ func transformDevelopmentMode(ctx context.Context, b *bundle.Bundle) {
         t.TriggerPauseStatus = config.Paused
     }
 
+    if !config.IsExplicitlyDisabled(t.SourceLinkedDeployment) {
+        isInWorkspace := strings.HasPrefix(b.SyncRootPath, "/Workspace/")
+        if isInWorkspace && dbr.RunsOnRuntime(ctx) {
+            enabled := true
+            t.SourceLinkedDeployment = &enabled
+        }
+    }
+
     if !config.IsExplicitlyDisabled(t.PipelinesDevelopment) {
         enabled := true
         t.PipelinesDevelopment = &enabled
     }
 }
 
+func containsUserIdentity(s string, u *config.User) bool {
+    return strings.Contains(s, u.ShortName) || strings.Contains(s, u.UserName)
+}
+
 func validateDevelopmentMode(b *bundle.Bundle) diag.Diagnostics {
     var diags diag.Diagnostics
     p := b.Config.Presets
@@ -92,7 +105,7 @@ func validateDevelopmentMode(b *bundle.Bundle) diag.Diagnostics {
             diags = diags.Extend(diag.Errorf("%s must start with '~/' or contain the current username to ensure uniqueness when using 'mode: development'", path))
         }
     }
-    if p.NamePrefix != "" && !strings.Contains(p.NamePrefix, u.ShortName) && !strings.Contains(p.NamePrefix, u.UserName) {
+    if p.NamePrefix != "" && !containsUserIdentity(p.NamePrefix, u) {
         // Resources such as pipelines require a unique name, e.g. '[dev steve] my_pipeline'.
         // For this reason we require the name prefix to contain the current username;
         // it's a pitfall for users if they don't include it and later find out that

View File

@@ -3,14 +3,17 @@ package mutator
 import (
     "context"
     "reflect"
+    "runtime"
     "strings"
     "testing"
 
     "github.com/databricks/cli/bundle"
     "github.com/databricks/cli/bundle/config"
     "github.com/databricks/cli/bundle/config/resources"
+    "github.com/databricks/cli/libs/dbr"
     "github.com/databricks/cli/libs/diag"
     "github.com/databricks/cli/libs/tags"
+    "github.com/databricks/cli/libs/vfs"
     sdkconfig "github.com/databricks/databricks-sdk-go/config"
     "github.com/databricks/databricks-sdk-go/service/catalog"
     "github.com/databricks/databricks-sdk-go/service/compute"
@@ -140,6 +143,7 @@ func mockBundle(mode config.Mode) *bundle.Bundle {
                 },
             },
         },
+        SyncRoot: vfs.MustNew("/Users/lennart.kats@databricks.com"),
         // Use AWS implementation for testing.
         Tagging: tags.ForCloud(&sdkconfig.Config{
             Host: "https://company.cloud.databricks.com",
@@ -522,3 +526,32 @@ func TestPipelinesDevelopmentDisabled(t *testing.T) {
     assert.False(t, b.Config.Resources.Pipelines["pipeline1"].PipelineSpec.Development)
 }
+
+func TestSourceLinkedDeploymentEnabled(t *testing.T) {
+    b, diags := processSourceLinkedBundle(t, true)
+    require.NoError(t, diags.Error())
+    assert.True(t, *b.Config.Presets.SourceLinkedDeployment)
+}
+
+func TestSourceLinkedDeploymentDisabled(t *testing.T) {
+    b, diags := processSourceLinkedBundle(t, false)
+    require.NoError(t, diags.Error())
+    assert.False(t, *b.Config.Presets.SourceLinkedDeployment)
+}
+
+func processSourceLinkedBundle(t *testing.T, presetEnabled bool) (*bundle.Bundle, diag.Diagnostics) {
+    if runtime.GOOS == "windows" {
+        t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
+    }
+
+    b := mockBundle(config.Development)
+
+    workspacePath := "/Workspace/lennart@company.com/"
+    b.SyncRootPath = workspacePath
+    b.Config.Presets.SourceLinkedDeployment = &presetEnabled
+
+    ctx := dbr.MockRuntime(context.Background(), true)
+    m := bundle.Seq(ProcessTargetMode(), ApplyPresets())
+    diags := bundle.Apply(ctx, b, m)
+    return b, diags
+}

View File

@@ -11,6 +11,7 @@ import (
     "strings"
 
     "github.com/databricks/cli/bundle"
+    "github.com/databricks/cli/bundle/config"
     "github.com/databricks/cli/libs/diag"
     "github.com/databricks/cli/libs/dyn"
     "github.com/databricks/cli/libs/notebook"
@@ -103,8 +104,13 @@ func (t *translateContext) rewritePath(
         return fmt.Errorf("path %s is not contained in sync root path", localPath)
     }
 
-    // Prefix remote path with its remote root path.
-    remotePath := path.Join(t.b.Config.Workspace.FilePath, filepath.ToSlash(localRelPath))
+    var workspacePath string
+    if config.IsExplicitlyEnabled(t.b.Config.Presets.SourceLinkedDeployment) {
+        workspacePath = t.b.SyncRootPath
+    } else {
+        workspacePath = t.b.Config.Workspace.FilePath
+    }
+    remotePath := path.Join(workspacePath, filepath.ToSlash(localRelPath))
 
     // Convert local path into workspace path via specified function.
     interp, err := fn(*p, localPath, localRelPath, remotePath)

View File

@@ -4,6 +4,7 @@ import (
     "context"
     "os"
     "path/filepath"
+    "runtime"
     "strings"
     "testing"
 
@@ -787,3 +788,163 @@ func TestTranslatePathWithComplexVariables(t *testing.T) {
         b.Config.Resources.Jobs["job"].Tasks[0].Libraries[0].Whl,
     )
 }
+
+func TestTranslatePathsWithSourceLinkedDeployment(t *testing.T) {
+    if runtime.GOOS == "windows" {
+        t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
+    }
+
+    dir := t.TempDir()
+    touchNotebookFile(t, filepath.Join(dir, "my_job_notebook.py"))
+    touchNotebookFile(t, filepath.Join(dir, "my_pipeline_notebook.py"))
+    touchEmptyFile(t, filepath.Join(dir, "my_python_file.py"))
+    touchEmptyFile(t, filepath.Join(dir, "dist", "task.jar"))
+    touchEmptyFile(t, filepath.Join(dir, "requirements.txt"))
+
+    enabled := true
+    b := &bundle.Bundle{
+        SyncRootPath: dir,
+        SyncRoot:     vfs.MustNew(dir),
+        Config: config.Root{
+            Workspace: config.Workspace{
+                FilePath: "/bundle",
+            },
+            Resources: config.Resources{
+                Jobs: map[string]*resources.Job{
+                    "job": {
+                        JobSettings: &jobs.JobSettings{
+                            Tasks: []jobs.Task{
+                                {
+                                    NotebookTask: &jobs.NotebookTask{
+                                        NotebookPath: "my_job_notebook.py",
+                                    },
+                                    Libraries: []compute.Library{
+                                        {Whl: "./dist/task.whl"},
+                                    },
+                                },
+                                {
+                                    NotebookTask: &jobs.NotebookTask{
+                                        NotebookPath: "/Users/jane.doe@databricks.com/absolute_remote.py",
+                                    },
+                                },
+                                {
+                                    NotebookTask: &jobs.NotebookTask{
+                                        NotebookPath: "my_job_notebook.py",
+                                    },
+                                    Libraries: []compute.Library{
+                                        {Requirements: "requirements.txt"},
+                                    },
+                                },
+                                {
+                                    SparkPythonTask: &jobs.SparkPythonTask{
+                                        PythonFile: "my_python_file.py",
+                                    },
+                                },
+                                {
+                                    SparkJarTask: &jobs.SparkJarTask{
+                                        MainClassName: "HelloWorld",
+                                    },
+                                    Libraries: []compute.Library{
+                                        {Jar: "./dist/task.jar"},
+                                    },
+                                },
+                                {
+                                    SparkJarTask: &jobs.SparkJarTask{
+                                        MainClassName: "HelloWorldRemote",
+                                    },
+                                    Libraries: []compute.Library{
+                                        {Jar: "dbfs:/bundle/dist/task_remote.jar"},
+                                    },
+                                },
+                            },
+                        },
+                    },
+                },
+                Pipelines: map[string]*resources.Pipeline{
+                    "pipeline": {
+                        PipelineSpec: &pipelines.PipelineSpec{
+                            Libraries: []pipelines.PipelineLibrary{
+                                {
+                                    Notebook: &pipelines.NotebookLibrary{
+                                        Path: "my_pipeline_notebook.py",
+                                    },
+                                },
+                                {
+                                    Notebook: &pipelines.NotebookLibrary{
+                                        Path: "/Users/jane.doe@databricks.com/absolute_remote.py",
+                                    },
+                                },
+                                {
+                                    File: &pipelines.FileLibrary{
+                                        Path: "my_python_file.py",
+                                    },
+                                },
+                            },
+                        },
+                    },
+                },
+            },
+            Presets: config.Presets{
+                SourceLinkedDeployment: &enabled,
+            },
+        },
+    }
+
+    bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
+    diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
+    require.NoError(t, diags.Error())
+
+    // updated to source path
+    assert.Equal(
+        t,
+        filepath.Join(dir, "my_job_notebook"),
+        b.Config.Resources.Jobs["job"].Tasks[0].NotebookTask.NotebookPath,
+    )
+    assert.Equal(
+        t,
+        filepath.Join(dir, "requirements.txt"),
+        b.Config.Resources.Jobs["job"].Tasks[2].Libraries[0].Requirements,
+    )
+    assert.Equal(
+        t,
+        filepath.Join(dir, "my_python_file.py"),
+        b.Config.Resources.Jobs["job"].Tasks[3].SparkPythonTask.PythonFile,
+    )
+    assert.Equal(
+        t,
+        filepath.Join(dir, "my_pipeline_notebook"),
+        b.Config.Resources.Pipelines["pipeline"].Libraries[0].Notebook.Path,
+    )
+    assert.Equal(
+        t,
+        filepath.Join(dir, "my_python_file.py"),
+        b.Config.Resources.Pipelines["pipeline"].Libraries[2].File.Path,
+    )
+
+    // left as is
+    assert.Equal(
+        t,
+        filepath.Join("dist", "task.whl"),
+        b.Config.Resources.Jobs["job"].Tasks[0].Libraries[0].Whl,
+    )
+    assert.Equal(
+        t,
+        "/Users/jane.doe@databricks.com/absolute_remote.py",
+        b.Config.Resources.Jobs["job"].Tasks[1].NotebookTask.NotebookPath,
+    )
+    assert.Equal(
+        t,
+        filepath.Join("dist", "task.jar"),
+        b.Config.Resources.Jobs["job"].Tasks[4].Libraries[0].Jar,
+    )
+    assert.Equal(
+        t,
+        "dbfs:/bundle/dist/task_remote.jar",
+        b.Config.Resources.Jobs["job"].Tasks[5].Libraries[0].Jar,
+    )
+    assert.Equal(
+        t,
+        "/Users/jane.doe@databricks.com/absolute_remote.py",
+        b.Config.Resources.Pipelines["pipeline"].Libraries[1].Notebook.Path,
+    )
+}

View File

@@ -17,6 +17,11 @@ type Presets struct {
     // JobsMaxConcurrentRuns is the default value for the max concurrent runs of jobs.
     JobsMaxConcurrentRuns int `json:"jobs_max_concurrent_runs,omitempty"`
 
+    // SourceLinkedDeployment indicates whether source-linked deployment is enabled. Works only in Databricks Workspace.
+    // When set to true, resources created during deployment will point to source files in the workspace instead of their workspace copies.
+    // File synchronization to ${workspace.file_path} is skipped.
+    SourceLinkedDeployment *bool `json:"source_linked_deployment,omitempty"`
+
     // Tags to add to all resources.
     Tags map[string]string `json:"tags,omitempty"`
 }
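
To show how the new field surfaces in bundle configuration, here is a small stand-alone sketch; the `presets` struct below is a stand-in for the real `config.Presets`, and only the struct tag is taken from the diff above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// presets carries only the new field, to show its wire name.
type presets struct {
	SourceLinkedDeployment *bool `json:"source_linked_deployment,omitempty"`
}

func main() {
	enabled := true
	out, _ := json.Marshal(presets{SourceLinkedDeployment: &enabled})
	fmt.Println(string(out)) // {"source_linked_deployment":true}
}
```

In bundle YAML this corresponds to the `presets.source_linked_deployment` key, the same spelling referenced by the wheel-wrapper warning further down.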

View File

@@ -7,6 +7,7 @@ import (
     "io/fs"
 
     "github.com/databricks/cli/bundle"
+    "github.com/databricks/cli/bundle/config"
     "github.com/databricks/cli/bundle/permissions"
     "github.com/databricks/cli/libs/cmdio"
     "github.com/databricks/cli/libs/diag"
@@ -23,6 +24,11 @@ func (m *upload) Name() string {
 }
 
 func (m *upload) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
+    if config.IsExplicitlyEnabled(b.Config.Presets.SourceLinkedDeployment) {
+        cmdio.LogString(ctx, "Source-linked deployment is enabled. Deployed resources reference the source files in your working tree instead of separate copies.")
+        return nil
+    }
+
     cmdio.LogString(ctx, fmt.Sprintf("Uploading bundle files to %s...", b.Config.Workspace.FilePath))
     opts, err := GetSyncOptions(ctx, bundle.ReadOnly(b))
     if err != nil {

View File

@@ -6,6 +6,7 @@ import (
     "strings"
 
     "github.com/databricks/cli/bundle"
+    "github.com/databricks/cli/bundle/config"
     "github.com/databricks/cli/bundle/libraries"
     "github.com/databricks/cli/libs/diag"
     "github.com/databricks/cli/libs/log"
@@ -22,6 +23,9 @@ func WrapperWarning() bundle.Mutator {
 func (m *wrapperWarning) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
     if isPythonWheelWrapperOn(b) {
+        if config.IsExplicitlyEnabled(b.Config.Presets.SourceLinkedDeployment) {
+            return diag.Warningf("Python wheel notebook wrapper is not available when using source-linked deployment mode. You can disable this mode by setting 'presets.source_linked_deployment: false'")
+        }
         return nil
     }

View File

@@ -3,8 +3,10 @@ package generate
 import (
     "bytes"
     "context"
+    "errors"
     "fmt"
     "io"
+    "io/fs"
     "os"
     "path/filepath"
     "testing"
@@ -90,7 +92,7 @@ func TestGeneratePipelineCommand(t *testing.T) {
     err := cmd.RunE(cmd, []string{})
     require.NoError(t, err)
 
-    data, err := os.ReadFile(filepath.Join(configDir, "test_pipeline.yml"))
+    data, err := os.ReadFile(filepath.Join(configDir, "test_pipeline.pipeline.yml"))
     require.NoError(t, err)
     require.Equal(t, fmt.Sprintf(`resources:
   pipelines:
@@ -186,7 +188,123 @@ func TestGenerateJobCommand(t *testing.T) {
     err := cmd.RunE(cmd, []string{})
     require.NoError(t, err)
 
-    data, err := os.ReadFile(filepath.Join(configDir, "test_job.yml"))
+    data, err := os.ReadFile(filepath.Join(configDir, "test_job.job.yml"))
+    require.NoError(t, err)
+    require.Equal(t, fmt.Sprintf(`resources:
+  jobs:
+    test_job:
+      name: test-job
+      job_clusters:
+        - new_cluster:
+            custom_tags:
+              "Tag1": "24X7-1234"
+        - new_cluster:
+            spark_conf:
+              "spark.databricks.delta.preview.enabled": "true"
+      tasks:
+        - task_key: notebook_task
+          notebook_task:
+            notebook_path: %s
+      parameters:
+        - name: empty
+          default: ""
+`, filepath.Join("..", "src", "notebook.py")), string(data))
+
+    data, err = os.ReadFile(filepath.Join(srcDir, "notebook.py"))
+    require.NoError(t, err)
+    require.Equal(t, "# Databricks notebook source\nNotebook content", string(data))
+}
+
+func touchEmptyFile(t *testing.T, path string) {
+    err := os.MkdirAll(filepath.Dir(path), 0700)
+    require.NoError(t, err)
+    f, err := os.Create(path)
+    require.NoError(t, err)
+    f.Close()
+}
+
+func TestGenerateJobCommandOldFileRename(t *testing.T) {
+    cmd := NewGenerateJobCommand()
+
+    root := t.TempDir()
+    b := &bundle.Bundle{
+        BundleRootPath: root,
+    }
+
+    m := mocks.NewMockWorkspaceClient(t)
+    b.SetWorkpaceClient(m.WorkspaceClient)
+
+    jobsApi := m.GetMockJobsAPI()
+    jobsApi.EXPECT().Get(mock.Anything, jobs.GetJobRequest{JobId: 1234}).Return(&jobs.Job{
+        Settings: &jobs.JobSettings{
+            Name: "test-job",
+            JobClusters: []jobs.JobCluster{
+                {NewCluster: compute.ClusterSpec{
+                    CustomTags: map[string]string{
+                        "Tag1": "24X7-1234",
+                    },
+                }},
+                {NewCluster: compute.ClusterSpec{
+                    SparkConf: map[string]string{
+                        "spark.databricks.delta.preview.enabled": "true",
+                    },
+                }},
+            },
+            Tasks: []jobs.Task{
+                {
+                    TaskKey: "notebook_task",
+                    NotebookTask: &jobs.NotebookTask{
+                        NotebookPath: "/test/notebook",
+                    },
+                },
+            },
+            Parameters: []jobs.JobParameterDefinition{
+                {
+                    Name:    "empty",
+                    Default: "",
+                },
+            },
+        },
+    }, nil)
+
+    workspaceApi := m.GetMockWorkspaceAPI()
+    workspaceApi.EXPECT().GetStatusByPath(mock.Anything, "/test/notebook").Return(&workspace.ObjectInfo{
+        ObjectType: workspace.ObjectTypeNotebook,
+        Language:   workspace.LanguagePython,
+        Path:       "/test/notebook",
+    }, nil)
+
+    notebookContent := io.NopCloser(bytes.NewBufferString("# Databricks notebook source\nNotebook content"))
+    workspaceApi.EXPECT().Download(mock.Anything, "/test/notebook", mock.Anything).Return(notebookContent, nil)
+
+    cmd.SetContext(bundle.Context(context.Background(), b))
+    cmd.Flag("existing-job-id").Value.Set("1234")
+
+    configDir := filepath.Join(root, "resources")
+    cmd.Flag("config-dir").Value.Set(configDir)
+
+    srcDir := filepath.Join(root, "src")
+    cmd.Flag("source-dir").Value.Set(srcDir)
+
+    var key string
+    cmd.Flags().StringVar(&key, "key", "test_job", "")
+
+    // Create an old generated file first
+    oldFilename := filepath.Join(configDir, "test_job.yml")
+    touchEmptyFile(t, oldFilename)
+
+    // Having an existing file requires the --force flag to regenerate it
+    cmd.Flag("force").Value.Set("true")
+
+    err := cmd.RunE(cmd, []string{})
+    require.NoError(t, err)
+
+    // Make sure the old file does not exist after the run
+    _, err = os.Stat(oldFilename)
+    require.True(t, errors.Is(err, fs.ErrNotExist))
+
+    data, err := os.ReadFile(filepath.Join(configDir, "test_job.job.yml"))
     require.NoError(t, err)
     require.Equal(t, fmt.Sprintf(`resources:

View File

@@ -1,7 +1,9 @@
 package generate
 
 import (
+    "errors"
     "fmt"
+    "io/fs"
     "os"
     "path/filepath"
 
@@ -83,7 +85,17 @@ func NewGenerateJobCommand() *cobra.Command {
             return err
         }
 
-        filename := filepath.Join(configDir, fmt.Sprintf("%s.yml", jobKey))
+        oldFilename := filepath.Join(configDir, fmt.Sprintf("%s.yml", jobKey))
+        filename := filepath.Join(configDir, fmt.Sprintf("%s.job.yml", jobKey))
+
+        // A user might continuously run the generate command to update their bundle jobs with any changes made in the Databricks UI.
+        // Due to the change in the generated file names, we need to first rename the existing resource file to the new name.
+        // Otherwise users can end up with duplicated resources.
+        err = os.Rename(oldFilename, filename)
+        if err != nil && !errors.Is(err, fs.ErrNotExist) {
+            return fmt.Errorf("failed to rename file %s. DABs uses the resource type as a sub-extension for generated content, please rename it to %s, err: %w", oldFilename, filename, err)
+        }
+
         saver := yamlsaver.NewSaverWithStyle(map[string]yaml.Style{
             // Including all JobSettings and nested fields which are map[string]string type
             "spark_conf": yaml.DoubleQuotedStyle,

View File

@@ -1,7 +1,9 @@
 package generate
 
 import (
+    "errors"
     "fmt"
+    "io/fs"
     "os"
     "path/filepath"
 
@@ -83,7 +85,17 @@ func NewGeneratePipelineCommand() *cobra.Command {
             return err
         }
 
-        filename := filepath.Join(configDir, fmt.Sprintf("%s.yml", pipelineKey))
+        oldFilename := filepath.Join(configDir, fmt.Sprintf("%s.yml", pipelineKey))
+        filename := filepath.Join(configDir, fmt.Sprintf("%s.pipeline.yml", pipelineKey))
+
+        // A user might continuously run the generate command to update their bundle pipelines with any changes made in the Databricks UI.
+        // Due to the change in the generated file names, we need to first rename the existing resource file to the new name.
+        // Otherwise users can end up with duplicated resources.
+        err = os.Rename(oldFilename, filename)
+        if err != nil && !errors.Is(err, fs.ErrNotExist) {
+            return fmt.Errorf("failed to rename file %s. DABs uses the resource type as a sub-extension for generated content, please rename it to %s, err: %w", oldFilename, filename, err)
+        }
+
        saver := yamlsaver.NewSaverWithStyle(
            // Including all PipelineSpec and nested fields which are map[string]string type
            map[string]yaml.Style{

View File

@@ -1,6 +1,7 @@
 package bundle
 
 import (
+    "context"
     "errors"
     "fmt"
     "io/fs"
@@ -11,6 +12,8 @@ import (
     "github.com/databricks/cli/cmd/root"
     "github.com/databricks/cli/libs/cmdio"
+    "github.com/databricks/cli/libs/dbr"
+    "github.com/databricks/cli/libs/filer"
     "github.com/databricks/cli/libs/git"
     "github.com/databricks/cli/libs/template"
     "github.com/spf13/cobra"
@@ -147,6 +150,26 @@ func repoName(url string) string {
     return parts[len(parts)-1]
 }
 
+func constructOutputFiler(ctx context.Context, outputDir string) (filer.Filer, error) {
+    outputDir, err := filepath.Abs(outputDir)
+    if err != nil {
+        return nil, err
+    }
+
+    // If the CLI is running on DBR and we're writing to the workspace file system,
+    // use the extension-aware workspace filesystem filer to instantiate the template.
+    //
+    // It is not possible to write notebooks through the workspace filesystem's FUSE mount.
+    // Therefore this is the only way we can initialize templates that contain notebooks
+    // when running the CLI on DBR and initializing a template to the workspace.
+    //
+    if strings.HasPrefix(outputDir, "/Workspace/") && dbr.RunsOnRuntime(ctx) {
+        return filer.NewWorkspaceFilesExtensionsClient(root.WorkspaceClient(ctx), outputDir)
+    }
+
+    return filer.NewLocalClient(outputDir)
+}
+
 func newInitCommand() *cobra.Command {
     cmd := &cobra.Command{
         Use: "init [TEMPLATE_PATH]",
@@ -201,6 +224,11 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
             templatePath = getNativeTemplateByDescription(description)
         }
 
+        outputFiler, err := constructOutputFiler(ctx, outputDir)
+        if err != nil {
+            return err
+        }
+
         if templatePath == customTemplate {
             cmdio.LogString(ctx, "Please specify a path or Git repository to use a custom template.")
             cmdio.LogString(ctx, "See https://docs.databricks.com/en/dev-tools/bundles/templates.html to learn more about custom templates.")
@@ -230,7 +258,7 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
             // skip downloading the repo because input arg is not a URL. We assume
             // it's a path on the local file system in that case
-            return template.Materialize(ctx, configFile, templateFS, outputDir)
+            return template.Materialize(ctx, configFile, templateFS, outputFiler)
         }
 
         // Create a temporary directory with the name of the repository. The '*'
@@ -255,7 +283,7 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
         // Clean up downloaded repository once the template is materialized.
         defer os.RemoveAll(repoDir)
         templateFS := os.DirFS(filepath.Join(repoDir, templateDir))
-        return template.Materialize(ctx, configFile, templateFS, outputDir)
+        return template.Materialize(ctx, configFile, templateFS, outputFiler)
     }
 
     return cmd
 }

View File

@@ -166,7 +166,7 @@ func TestAccGenerateAndBind(t *testing.T) {
     _, err = os.Stat(filepath.Join(bundleRoot, "src", "test.py"))
     require.NoError(t, err)
 
-    matches, err := filepath.Glob(filepath.Join(bundleRoot, "resources", "test_job_key.yml"))
+    matches, err := filepath.Glob(filepath.Join(bundleRoot, "resources", "test_job_key.job.yml"))
     require.NoError(t, err)
     require.Len(t, matches, 1)

View File

@@ -11,6 +11,11 @@
     "node_type_id": {
       "type": "string",
       "description": "Node type id for job cluster"
+    },
+    "root_path": {
+      "type": "string",
+      "description": "Root path to deploy bundle to",
+      "default": ""
     }
   }
 }

View File

@@ -2,7 +2,11 @@ bundle:
   name: basic
 
 workspace:
+{{ if .root_path }}
+  root_path: "{{.root_path}}/.bundle/{{.unique_id}}"
+{{ else }}
   root_path: "~/.bundle/{{.unique_id}}"
+{{ end }}
 
 resources:
   jobs:
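
To see what the conditional above produces, here is a plain `text/template` approximation; the real fixture is rendered by the CLI's own templating flow, and the values below are purely illustrative.

```go
package main

import (
	"os"
	"text/template"
)

const tmpl = `workspace:
{{ if .root_path }}
  root_path: "{{.root_path}}/.bundle/{{.unique_id}}"
{{ else }}
  root_path: "~/.bundle/{{.unique_id}}"
{{ end }}
`

func main() {
	t := template.Must(template.New("basic").Parse(tmpl))

	// With root_path set, the bundle deploys under that prefix.
	_ = t.Execute(os.Stdout, map[string]string{
		"unique_id": "3f1c9a2e-0000-4000-8000-000000000000",
		"root_path": "/Shared/someone@example.com",
	})

	// Without it, the template falls back to the user's home folder.
	_ = t.Execute(os.Stdout, map[string]string{
		"unique_id": "3f1c9a2e-0000-4000-8000-000000000000",
	})
}
```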

View File

@@ -1,2 +0,0 @@
-bundle:
-  name: abc

View File

@@ -1,5 +1,8 @@
 bundle:
-  name: "bundle-playground"
+  name: recreate-pipeline
+
+workspace:
+  root_path: "~/.bundle/{{.unique_id}}"
 
 variables:
   catalog:

View File

@@ -1,5 +1,8 @@
 bundle:
-  name: "bundle-playground"
+  name: uc-schema
+
+workspace:
+  root_path: "~/.bundle/{{.unique_id}}"
 
 resources:
   pipelines:

View File

@@ -0,0 +1,38 @@
+package bundle
+
+import (
+    "fmt"
+    "testing"
+
+    "github.com/databricks/cli/internal"
+    "github.com/databricks/cli/internal/acc"
+    "github.com/databricks/cli/libs/env"
+    "github.com/google/uuid"
+    "github.com/stretchr/testify/require"
+)
+
+func TestAccDeployBasicToSharedWorkspacePath(t *testing.T) {
+    ctx, wt := acc.WorkspaceTest(t)
+
+    nodeTypeId := internal.GetNodeTypeId(env.Get(ctx, "CLOUD_ENV"))
+    uniqueId := uuid.New().String()
+
+    currentUser, err := wt.W.CurrentUser.Me(ctx)
+    require.NoError(t, err)
+
+    bundleRoot, err := initTestTemplate(t, ctx, "basic", map[string]any{
+        "unique_id":     uniqueId,
+        "node_type_id":  nodeTypeId,
+        "spark_version": defaultSparkVersion,
+        "root_path":     fmt.Sprintf("/Shared/%s", currentUser.UserName),
+    })
+    require.NoError(t, err)
+
+    t.Cleanup(func() {
+        err = destroyBundle(wt.T, ctx, bundleRoot)
+        require.NoError(wt.T, err)
+    })
+
+    err = deployBundle(wt.T, ctx, bundleRoot)
+    require.NoError(wt.T, err)
+}

View File

@@ -16,6 +16,7 @@ import (
     "github.com/databricks/cli/internal"
     "github.com/databricks/cli/libs/cmdio"
     "github.com/databricks/cli/libs/env"
+    "github.com/databricks/cli/libs/filer"
     "github.com/databricks/cli/libs/flags"
     "github.com/databricks/cli/libs/template"
     "github.com/databricks/cli/libs/vfs"
@@ -42,7 +43,9 @@ func initTestTemplateWithBundleRoot(t *testing.T, ctx context.Context, templateN
     cmd := cmdio.NewIO(flags.OutputJSON, strings.NewReader(""), os.Stdout, os.Stderr, "", "bundles")
     ctx = cmdio.InContext(ctx, cmd)
 
-    err = template.Materialize(ctx, configFilePath, os.DirFS(templateRoot), bundleRoot)
+    out, err := filer.NewLocalClient(bundleRoot)
+    require.NoError(t, err)
+    err = template.Materialize(ctx, configFilePath, os.DirFS(templateRoot), out)
     return bundleRoot, err
 }

View File

@@ -21,8 +21,8 @@ const schemaFileName = "databricks_template_schema.json"
 // ctx: context containing a cmdio object. This is used to prompt the user
 // configFilePath: file path containing user defined config values
 // templateFS: root of the template definition
-// outputDir: root of directory where to initialize the template
-func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, outputDir string) error {
+// outputFiler: filer to use for writing the initialized template
+func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, outputFiler filer.Filer) error {
     if _, err := fs.Stat(templateFS, schemaFileName); errors.Is(err, fs.ErrNotExist) {
         return fmt.Errorf("not a bundle template: expected to find a template schema file at %s", schemaFileName)
     }
@@ -73,12 +73,7 @@ func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, o
         return err
     }
 
-    out, err := filer.NewLocalClient(outputDir)
-    if err != nil {
-        return err
-    }
-
-    err = r.persistToDisk(ctx, out)
+    err = r.persistToDisk(ctx, outputFiler)
     if err != nil {
         return err
     }

View File

@@ -19,6 +19,6 @@ func TestMaterializeForNonTemplateDirectory(t *testing.T) {
     ctx := root.SetWorkspaceClient(context.Background(), w)
 
     // Try to materialize a non-template directory.
-    err = Materialize(ctx, "", os.DirFS(tmpDir), "")
+    err = Materialize(ctx, "", os.DirFS(tmpDir), nil)
     assert.EqualError(t, err, fmt.Sprintf("not a bundle template: expected to find a template schema file at %s", schemaFileName))
 }
} }