Compare commits

...

12 Commits

Author SHA1 Message Date
Gleb Kanterov 3ae9ce952b
Merge 96a6cef0d6 into 984c38e03e 2024-11-20 17:40:31 +00:00
shreyas-goenka 984c38e03e
Add unique ID to `root_path` for bundle integration test fixtures (#1917)
## Changes
Integration tests using these fixtures could be flaky when run in parallel
using the same user's identity. They could also pick up state left over from
previous runs.

This PR adds a UUID to the root_path to force independent bundle
deployments for every test run.

I have checked that all bundles in `internal/bundle/bundles` have
`root_path` namespaced to a UUID.
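For illustration, a namespaced fixture ends up looking like the sketch below
(the `unique_id` template value is what now carries the UUID; this mirrors the
`basic` fixture touched in this compare):

```
bundle:
  name: basic

workspace:
  # unique_id is generated per test run (uuid.New().String() in the test
  # helper), so every run deploys to an isolated root path.
  root_path: "~/.bundle/{{.unique_id}}"
```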

## Tests
Self testing.
2024-11-20 16:30:10 +00:00
Pieter Noordhuis ade95d9649
[Release] Release v0.235.0 (#1918)
**Note:** the `bundle generate` command now uses the `.<resource-type>.yml`
sub-extension for the configuration files it writes. Existing configuration
files that do not use this sub-extension are renamed to include it.

Bundles:
* Make `TableName` field part of quality monitor schema
([#1903](https://github.com/databricks/cli/pull/1903)).
* Do not prepend paths starting with ~ or variable reference
([#1905](https://github.com/databricks/cli/pull/1905)).
* Fix workspace extensions filer accidentally reading notebooks
([#1891](https://github.com/databricks/cli/pull/1891)).
* Fix template initialization when running on Databricks
([#1912](https://github.com/databricks/cli/pull/1912)).
* Source-linked deployments for bundles in the workspace
([#1884](https://github.com/databricks/cli/pull/1884)).
* Added integration test to deploy bundle to /Shared root path
([#1914](https://github.com/databricks/cli/pull/1914)).
* Update filenames used by bundle generate to use `.<resource-type>.yml`
([#1901](https://github.com/databricks/cli/pull/1901)).

Internal:
* Extract functionality to detect if the CLI is running on DBR
([#1889](https://github.com/databricks/cli/pull/1889)).
* Consolidate test helpers for `io/fs`
([#1906](https://github.com/databricks/cli/pull/1906)).
* Use `fs.FS` interface to read template
([#1910](https://github.com/databricks/cli/pull/1910)).
* Use `filer.Filer` to write template instantiation
([#1911](https://github.com/databricks/cli/pull/1911)).
2024-11-20 14:48:18 +00:00
Andrew Nester 592e1111b7
Update filenames used by bundle generate to use `.<resource-type>.yml` (#1901)
## Changes
Update filenames used by bundle generate to use `.<resource-type>.yml`.

Similar to [Add sub-extension to resource files in built-in templates by
shreyas-goenka · Pull Request #1777 ·
databricks/cli](https://github.com/databricks/cli/pull/1777)

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-11-20 13:53:25 +01:00
Andrew Nester fab3e8f168
Added integration test to deploy bundle to /Shared root path (#1914)
## Changes
Added integration test to deploy bundle to /Shared root path

## Tests
```
--- PASS: TestAccDeployBasicToSharedWorkspace (24.58s)
PASS
coverage: 31.2% of statements in ./...
ok  	github.com/databricks/cli/internal/bundle	25.572s	coverage: 31.2% of statements in ./...
```

---------

Co-authored-by: shreyas-goenka <88374338+shreyas-goenka@users.noreply.github.com>
2024-11-20 12:20:39 +00:00
Ilya Kuznetsov 756e55fabc
Source-linked deployments for bundles in the workspace (#1884)
## Changes

This change adds a preset for source-linked deployments. It is enabled
by default for targets in `development` mode **if** the Databricks CLI
is running from the `/Workspace` directory on DBR. It does not have an
effect when running the CLI anywhere else.

Key highlights:
1. Files in this mode won't be uploaded to the workspace
2. Created resources will reference the source files instead of their
workspace copies
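The preset can also be set explicitly in the bundle configuration; a minimal
sketch (the target name and layout are illustrative, not part of this change's
tests):

```
targets:
  dev:
    mode: development
    presets:
      # Only takes effect when the CLI runs on DBR and the bundle lives under
      # /Workspace/; otherwise the preset is disabled with a warning.
      source_linked_deployment: true
```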

## Tests
1. Apply preset unit test covering conditional logic
2. High-level process target mode unit test for testing integration
between mutators

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-11-20 13:22:27 +01:00
Pieter Noordhuis 886e14910c
Fix template initialization when running on Databricks (#1912)
## Changes

When running the CLI on Databricks Runtime (DBR), use the
extension-aware filer to write an instantiated template if the instance
path is located in the workspace filesystem.

Notebooks cannot be written through the workspace filesystem's FUSE
mount. As a result, this is the only method for initializing templates
that contain notebooks when running the CLI on DBR and writing to the
workspace filesystem.

Depends on #1910 and #1911.

Supersedes #1744.

## Tests

* Manually confirmed I can initialize a template with notebooks when
running the CLI from the web terminal.
2024-11-20 11:42:23 +00:00
Gleb Kanterov 96a6cef0d6
Address feedback 2024-10-08 10:28:51 +02:00
Gleb Kanterov bfb13afa8e
Address more feedback 2024-10-08 10:26:53 +02:00
Gleb Kanterov 43ce278299
Rename bundle root path 2024-10-08 10:18:52 +02:00
Gleb Kanterov df61375995
Address CR comments 2024-10-08 10:18:37 +02:00
Gleb Kanterov 3438455459
PythonMutator: propagate source locations 2024-10-08 10:18:36 +02:00
30 changed files with 1024 additions and 54 deletions

View File

@ -1,5 +1,28 @@
# Version changelog
## [Release] Release v0.235.0
**Note:** the `bundle generate` command now uses the `.<resource-type>.yml`
sub-extension for the configuration files it writes. Existing configuration
files that do not use this sub-extension are renamed to include it.
Bundles:
* Make `TableName` field part of quality monitor schema ([#1903](https://github.com/databricks/cli/pull/1903)).
* Do not prepend paths starting with ~ or variable reference ([#1905](https://github.com/databricks/cli/pull/1905)).
* Fix workspace extensions filer accidentally reading notebooks ([#1891](https://github.com/databricks/cli/pull/1891)).
* Fix template initialization when running on Databricks ([#1912](https://github.com/databricks/cli/pull/1912)).
* Source-linked deployments for bundles in the workspace ([#1884](https://github.com/databricks/cli/pull/1884)).
* Added integration test to deploy bundle to /Shared root path ([#1914](https://github.com/databricks/cli/pull/1914)).
* Update filenames used by bundle generate to use `.<resource-type>.yml` ([#1901](https://github.com/databricks/cli/pull/1901)).
Internal:
* Extract functionality to detect if the CLI is running on DBR ([#1889](https://github.com/databricks/cli/pull/1889)).
* Consolidate test helpers for `io/fs` ([#1906](https://github.com/databricks/cli/pull/1906)).
* Use `fs.FS` interface to read template ([#1910](https://github.com/databricks/cli/pull/1910)).
* Use `filer.Filer` to write template instantiation ([#1911](https://github.com/databricks/cli/pull/1911)).
## [Release] Release v0.234.0
Bundles:

View File

@ -45,6 +45,12 @@ type PyDABs struct {
// These packages are imported to discover resources, resource generators, and mutators.
// This list can include namespace packages, which causes the import of nested packages.
Import []string `json:"import,omitempty"`
// LoadLocations is a flag to enable loading Python source locations from PyDABs.
//
// Locations are only supported since PyDABs 0.6.0, and because of that,
// this flag is disabled by default.
LoadLocations bool `json:"load_locations,omitempty"`
}
type Command string
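The new flag is surfaced under `experimental.pydabs` in the bundle
configuration; the unit test later in this compare enables it like this
(sketch, the flag defaults to false):

```
experimental:
  pydabs:
    enabled: true
    venv_path: .venv
    # Ask PyDABs to emit locations.json (supported since PyDABs 0.6.0).
    load_locations: true
```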

View File

@ -9,6 +9,7 @@ import (
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/libs/dbr"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/textutil"
@ -221,6 +222,15 @@ func (m *applyPresets) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnos
dashboard.DisplayName = prefix + dashboard.DisplayName
}
if config.IsExplicitlyEnabled((b.Config.Presets.SourceLinkedDeployment)) {
isDatabricksWorkspace := dbr.RunsOnRuntime(ctx) && strings.HasPrefix(b.SyncRootPath, "/Workspace/")
if !isDatabricksWorkspace {
disabled := false
b.Config.Presets.SourceLinkedDeployment = &disabled
diags = diags.Extend(diag.Warningf("source-linked deployment is available only in the Databricks Workspace"))
}
}
return diags
}

View File

@ -2,12 +2,14 @@ package mutator_test
import (
"context"
"runtime"
"testing"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/mutator"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/libs/dbr"
"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/jobs"
"github.com/stretchr/testify/require"
@ -364,3 +366,86 @@ func TestApplyPresetsResourceNotDefined(t *testing.T) {
})
}
}
func TestApplyPresetsSourceLinkedDeployment(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
}
testContext := context.Background()
enabled := true
disabled := false
workspacePath := "/Workspace/user.name@company.com"
tests := []struct {
bundlePath string
ctx context.Context
name string
initialValue *bool
expectedValue *bool
expectedWarning string
}{
{
name: "preset enabled, bundle in Workspace, databricks runtime",
bundlePath: workspacePath,
ctx: dbr.MockRuntime(testContext, true),
initialValue: &enabled,
expectedValue: &enabled,
},
{
name: "preset enabled, bundle not in Workspace, databricks runtime",
bundlePath: "/Users/user.name@company.com",
ctx: dbr.MockRuntime(testContext, true),
initialValue: &enabled,
expectedValue: &disabled,
expectedWarning: "source-linked deployment is available only in the Databricks Workspace",
},
{
name: "preset enabled, bundle in Workspace, not databricks runtime",
bundlePath: workspacePath,
ctx: dbr.MockRuntime(testContext, false),
initialValue: &enabled,
expectedValue: &disabled,
expectedWarning: "source-linked deployment is available only in the Databricks Workspace",
},
{
name: "preset disabled, bundle in Workspace, databricks runtime",
bundlePath: workspacePath,
ctx: dbr.MockRuntime(testContext, true),
initialValue: &disabled,
expectedValue: &disabled,
},
{
name: "preset nil, bundle in Workspace, databricks runtime",
bundlePath: workspacePath,
ctx: dbr.MockRuntime(testContext, true),
initialValue: nil,
expectedValue: nil,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
b := &bundle.Bundle{
SyncRootPath: tt.bundlePath,
Config: config.Root{
Presets: config.Presets{
SourceLinkedDeployment: tt.initialValue,
},
},
}
diags := bundle.Apply(tt.ctx, b, mutator.ApplyPresets())
if diags.HasError() {
t.Fatalf("unexpected error: %v", diags)
}
if tt.expectedWarning != "" {
require.Equal(t, tt.expectedWarning, diags[0].Summary)
}
require.Equal(t, tt.expectedValue, b.Config.Presets.SourceLinkedDeployment)
})
}
}

View File

@ -6,6 +6,7 @@ import (
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/libs/dbr"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/iamutil"
@ -57,6 +58,14 @@ func transformDevelopmentMode(ctx context.Context, b *bundle.Bundle) {
t.TriggerPauseStatus = config.Paused
}
if !config.IsExplicitlyDisabled(t.SourceLinkedDeployment) {
isInWorkspace := strings.HasPrefix(b.SyncRootPath, "/Workspace/")
if isInWorkspace && dbr.RunsOnRuntime(ctx) {
enabled := true
t.SourceLinkedDeployment = &enabled
}
}
if !config.IsExplicitlyDisabled(t.PipelinesDevelopment) {
enabled := true
t.PipelinesDevelopment = &enabled

View File

@ -3,14 +3,17 @@ package mutator
import (
"context"
"reflect"
"runtime"
"strings"
"testing"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/config/resources"
"github.com/databricks/cli/libs/dbr"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/tags"
"github.com/databricks/cli/libs/vfs"
sdkconfig "github.com/databricks/databricks-sdk-go/config"
"github.com/databricks/databricks-sdk-go/service/catalog"
"github.com/databricks/databricks-sdk-go/service/compute"
@ -140,6 +143,7 @@ func mockBundle(mode config.Mode) *bundle.Bundle {
},
},
},
SyncRoot: vfs.MustNew("/Users/lennart.kats@databricks.com"),
// Use AWS implementation for testing.
Tagging: tags.ForCloud(&sdkconfig.Config{
Host: "https://company.cloud.databricks.com",
@ -522,3 +526,32 @@ func TestPipelinesDevelopmentDisabled(t *testing.T) {
assert.False(t, b.Config.Resources.Pipelines["pipeline1"].PipelineSpec.Development)
}
func TestSourceLinkedDeploymentEnabled(t *testing.T) {
b, diags := processSourceLinkedBundle(t, true)
require.NoError(t, diags.Error())
assert.True(t, *b.Config.Presets.SourceLinkedDeployment)
}
func TestSourceLinkedDeploymentDisabled(t *testing.T) {
b, diags := processSourceLinkedBundle(t, false)
require.NoError(t, diags.Error())
assert.False(t, *b.Config.Presets.SourceLinkedDeployment)
}
func processSourceLinkedBundle(t *testing.T, presetEnabled bool) (*bundle.Bundle, diag.Diagnostics) {
if runtime.GOOS == "windows" {
t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
}
b := mockBundle(config.Development)
workspacePath := "/Workspace/lennart@company.com/"
b.SyncRootPath = workspacePath
b.Config.Presets.SourceLinkedDeployment = &presetEnabled
ctx := dbr.MockRuntime(context.Background(), true)
m := bundle.Seq(ProcessTargetMode(), ApplyPresets())
diags := bundle.Apply(ctx, b, m)
return b, diags
}

View File

@ -9,6 +9,7 @@ import (
"github.com/databricks/cli/libs/dyn"
)
// pythonDiagnostic is a single entry in diagnostics.json
type pythonDiagnostic struct {
Severity pythonSeverity `json:"severity"`
Summary string `json:"summary"`

View File

@ -0,0 +1,181 @@
package python
import (
"encoding/json"
"fmt"
"io"
"path/filepath"
"github.com/databricks/cli/libs/dyn"
)
// generatedFileName is used as the virtual file name for YAML generated by PyDABs.
//
// mergePythonLocations replaces dyn.Location values that point to generatedFileName
// with locations loaded from locations.json
const generatedFileName = "__generated_by_pydabs__.yml"
// pythonLocations is a data structure for efficient location lookup for a given path
//
// Locations form a tree, and we assign locations of the closest ancestor to each dyn.Value based on its path.
// We implement it as a trie (prefix tree) where keys are components of the path. With that, lookups are O(n)
// where n is the number of components in the path.
//
// For example, with locations.json:
//
// {"path": "resources.jobs.job_0", "file": "src/examples/job_0.py", "line": 3, "column": 5}
// {"path": "resources.jobs.job_0.tasks[0].task_key", "file": "src/examples/job_0.py", "line": 10, "column": 5}
// {"path": "resources.jobs.job_1", "file": "src/examples/job_1.py", "line": 5, "column": 7}
//
// - resources.jobs.job_0.tasks[0].task_key is located at job_0.py:10:5
//
// - resources.jobs.job_0.tasks[0].email_notifications is located at job_0.py:3:5,
// because we use the location of the job as the most precise approximation.
type pythonLocations struct {
// descendants referenced by key, e.g. '.foo'
keys map[string]*pythonLocations
// descendants referenced by index, e.g. '[0]'
indexes map[int]*pythonLocations
// location for the current node if it exists
location dyn.Location
// if true, location is present
exists bool
}
// pythonLocationEntry is a single entry in locations.json
type pythonLocationEntry struct {
Path string `json:"path"`
File string `json:"file"`
Line int `json:"line"`
Column int `json:"column"`
}
// mergePythonLocations applies locations from Python mutator into given dyn.Value
//
// The primary use-case is to merge locations.json with output.json, so that any
// validation errors will point to Python source code instead of generated YAML.
func mergePythonLocations(value dyn.Value, locations *pythonLocations) (dyn.Value, error) {
return dyn.Walk(value, func(path dyn.Path, value dyn.Value) (dyn.Value, error) {
newLocation, ok := findPythonLocation(locations, path)
if !ok {
return value, nil
}
var newLocations []dyn.Location
// the first item in the list is the "last" location used for error reporting
newLocations = append(newLocations, newLocation)
for _, location := range value.Locations() {
// When loaded, dyn.Values created by PyDABs use the virtual file path as their location;
// we replace it with newLocation.
if filepath.Base(location.File) == generatedFileName {
continue
}
newLocations = append(newLocations, location)
}
return value.WithLocations(newLocations), nil
})
}
// parsePythonLocations parses locations.json from the Python mutator.
//
// The locations file contains newline-separated JSON objects with the pythonLocationEntry structure.
func parsePythonLocations(input io.Reader) (*pythonLocations, error) {
decoder := json.NewDecoder(input)
locations := newPythonLocations()
for decoder.More() {
var entry pythonLocationEntry
err := decoder.Decode(&entry)
if err != nil {
return nil, fmt.Errorf("failed to parse python location: %s", err)
}
path, err := dyn.NewPathFromString(entry.Path)
if err != nil {
return nil, fmt.Errorf("failed to parse python location: %s", err)
}
location := dyn.Location{
File: entry.File,
Line: entry.Line,
Column: entry.Column,
}
putPythonLocation(locations, path, location)
}
return locations, nil
}
// putPythonLocation puts the location into the trie for the given path
func putPythonLocation(trie *pythonLocations, path dyn.Path, location dyn.Location) {
var currentNode = trie
for _, component := range path {
if key := component.Key(); key != "" {
if _, ok := currentNode.keys[key]; !ok {
currentNode.keys[key] = newPythonLocations()
}
currentNode = currentNode.keys[key]
} else {
index := component.Index()
if _, ok := currentNode.indexes[index]; !ok {
currentNode.indexes[index] = newPythonLocations()
}
currentNode = currentNode.indexes[index]
}
}
currentNode.location = location
currentNode.exists = true
}
// newPythonLocations creates a new trie node
func newPythonLocations() *pythonLocations {
return &pythonLocations{
keys: make(map[string]*pythonLocations),
indexes: make(map[int]*pythonLocations),
}
}
// findPythonLocation finds the location or the closest ancestor location in the trie for the given path.
// If no exact or ancestor location is found, false is returned.
func findPythonLocation(locations *pythonLocations, path dyn.Path) (dyn.Location, bool) {
var currentNode = locations
var lastLocation = locations.location
var exists = locations.exists
for _, component := range path {
if key := component.Key(); key != "" {
if _, ok := currentNode.keys[key]; !ok {
break
}
currentNode = currentNode.keys[key]
} else {
index := component.Index()
if _, ok := currentNode.indexes[index]; !ok {
break
}
currentNode = currentNode.indexes[index]
}
if currentNode.exists {
lastLocation = currentNode.location
exists = true
}
}
return lastLocation, exists
}

View File

@ -0,0 +1,124 @@
package python
import (
"bytes"
"testing"
"github.com/databricks/cli/libs/dyn"
assert "github.com/databricks/cli/libs/dyn/dynassert"
)
func TestMergeLocations(t *testing.T) {
pythonLocation := dyn.Location{File: "foo.py", Line: 1, Column: 1}
generatedLocation := dyn.Location{File: generatedFileName, Line: 1, Column: 1}
yamlLocation := dyn.Location{File: "foo.yml", Line: 1, Column: 1}
locations := newPythonLocations()
putPythonLocation(locations, dyn.MustPathFromString("foo"), pythonLocation)
input := dyn.NewValue(
map[string]dyn.Value{
"foo": dyn.NewValue(
map[string]dyn.Value{
"baz": dyn.NewValue("baz", []dyn.Location{yamlLocation}),
"qux": dyn.NewValue("baz", []dyn.Location{generatedLocation, yamlLocation}),
},
[]dyn.Location{},
),
"bar": dyn.NewValue("baz", []dyn.Location{generatedLocation}),
},
[]dyn.Location{yamlLocation},
)
expected := dyn.NewValue(
map[string]dyn.Value{
"foo": dyn.NewValue(
map[string]dyn.Value{
// pythonLocation is prepended to the list if absent
"baz": dyn.NewValue("baz", []dyn.Location{pythonLocation, yamlLocation}),
// generatedLocation is replaced by pythonLocation
"qux": dyn.NewValue("baz", []dyn.Location{pythonLocation, yamlLocation}),
},
[]dyn.Location{pythonLocation},
),
// if location is unknown, we keep it as-is
"bar": dyn.NewValue("baz", []dyn.Location{generatedLocation}),
},
[]dyn.Location{yamlLocation},
)
actual, err := mergePythonLocations(input, locations)
assert.NoError(t, err)
assert.Equal(t, expected, actual)
}
func TestFindLocation(t *testing.T) {
location0 := dyn.Location{File: "foo.py", Line: 1, Column: 1}
location1 := dyn.Location{File: "foo.py", Line: 2, Column: 1}
locations := newPythonLocations()
putPythonLocation(locations, dyn.MustPathFromString("foo"), location0)
putPythonLocation(locations, dyn.MustPathFromString("foo.bar"), location1)
actual, exists := findPythonLocation(locations, dyn.MustPathFromString("foo.bar"))
assert.True(t, exists)
assert.Equal(t, location1, actual)
}
func TestFindLocation_indexPathComponent(t *testing.T) {
location0 := dyn.Location{File: "foo.py", Line: 1, Column: 1}
location1 := dyn.Location{File: "foo.py", Line: 2, Column: 1}
location2 := dyn.Location{File: "foo.py", Line: 3, Column: 1}
locations := newPythonLocations()
putPythonLocation(locations, dyn.MustPathFromString("foo"), location0)
putPythonLocation(locations, dyn.MustPathFromString("foo.bar"), location1)
putPythonLocation(locations, dyn.MustPathFromString("foo.bar[0]"), location2)
actual, exists := findPythonLocation(locations, dyn.MustPathFromString("foo.bar[0]"))
assert.True(t, exists)
assert.Equal(t, location2, actual)
}
func TestFindLocation_closestAncestorLocation(t *testing.T) {
location0 := dyn.Location{File: "foo.py", Line: 1, Column: 1}
location1 := dyn.Location{File: "foo.py", Line: 2, Column: 1}
locations := newPythonLocations()
putPythonLocation(locations, dyn.MustPathFromString("foo"), location0)
putPythonLocation(locations, dyn.MustPathFromString("foo.bar"), location1)
actual, exists := findPythonLocation(locations, dyn.MustPathFromString("foo.bar.baz"))
assert.True(t, exists)
assert.Equal(t, location1, actual)
}
func TestFindLocation_unknownLocation(t *testing.T) {
location0 := dyn.Location{File: "foo.py", Line: 1, Column: 1}
location1 := dyn.Location{File: "foo.py", Line: 2, Column: 1}
locations := newPythonLocations()
putPythonLocation(locations, dyn.MustPathFromString("foo"), location0)
putPythonLocation(locations, dyn.MustPathFromString("foo.bar"), location1)
_, exists := findPythonLocation(locations, dyn.MustPathFromString("bar"))
assert.False(t, exists)
}
func TestParsePythonLocations(t *testing.T) {
expected := dyn.Location{File: "foo.py", Line: 1, Column: 2}
input := `{"path": "foo", "file": "foo.py", "line": 1, "column": 2}`
reader := bytes.NewReader([]byte(input))
locations, err := parsePythonLocations(reader)
assert.NoError(t, err)
assert.True(t, locations.keys["foo"].exists)
assert.Equal(t, expected, locations.keys["foo"].location)
}

View File

@ -7,9 +7,12 @@ import (
"errors"
"fmt"
"io"
"io/fs"
"os"
"path/filepath"
"github.com/databricks/cli/bundle/config/mutator/paths"
"github.com/databricks/databricks-sdk-go/logger"
"github.com/fatih/color"
@ -108,7 +111,12 @@ func (m *pythonMutator) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagno
return dyn.InvalidValue, fmt.Errorf("failed to create cache dir: %w", err)
}
rightRoot, diags := m.runPythonMutator(ctx, cacheDir, b.BundleRootPath, pythonPath, leftRoot)
rightRoot, diags := m.runPythonMutator(ctx, leftRoot, runPythonMutatorOpts{
cacheDir: cacheDir,
bundleRootPath: b.BundleRootPath,
pythonPath: pythonPath,
loadLocations: experimental.PyDABs.LoadLocations,
})
mutateDiags = diags
if diags.HasError() {
return dyn.InvalidValue, mutateDiagsHasError
@ -152,13 +160,21 @@ func createCacheDir(ctx context.Context) (string, error) {
return os.MkdirTemp("", "-pydabs")
}
func (m *pythonMutator) runPythonMutator(ctx context.Context, cacheDir string, rootPath string, pythonPath string, root dyn.Value) (dyn.Value, diag.Diagnostics) {
inputPath := filepath.Join(cacheDir, "input.json")
outputPath := filepath.Join(cacheDir, "output.json")
diagnosticsPath := filepath.Join(cacheDir, "diagnostics.json")
type runPythonMutatorOpts struct {
cacheDir string
bundleRootPath string
pythonPath string
loadLocations bool
}
func (m *pythonMutator) runPythonMutator(ctx context.Context, root dyn.Value, opts runPythonMutatorOpts) (dyn.Value, diag.Diagnostics) {
inputPath := filepath.Join(opts.cacheDir, "input.json")
outputPath := filepath.Join(opts.cacheDir, "output.json")
diagnosticsPath := filepath.Join(opts.cacheDir, "diagnostics.json")
locationsPath := filepath.Join(opts.cacheDir, "locations.json")
args := []string{
pythonPath,
opts.pythonPath,
"-m",
"databricks.bundles.build",
"--phase",
@ -171,6 +187,10 @@ func (m *pythonMutator) runPythonMutator(ctx context.Context, cacheDir string, r
diagnosticsPath,
}
if opts.loadLocations {
args = append(args, "--locations", locationsPath)
}
if err := writeInputFile(inputPath, root); err != nil {
return dyn.InvalidValue, diag.Errorf("failed to write input file: %s", err)
}
@ -185,7 +205,7 @@ func (m *pythonMutator) runPythonMutator(ctx context.Context, cacheDir string, r
_, processErr := process.Background(
ctx,
args,
process.WithDir(rootPath),
process.WithDir(opts.bundleRootPath),
process.WithStderrWriter(stderrWriter),
process.WithStdoutWriter(stdoutWriter),
)
@ -221,7 +241,12 @@ func (m *pythonMutator) runPythonMutator(ctx context.Context, cacheDir string, r
return dyn.InvalidValue, diag.Errorf("failed to load diagnostics: %s", pythonDiagnosticsErr)
}
output, outputDiags := loadOutputFile(rootPath, outputPath)
locations, err := loadLocationsFile(locationsPath)
if err != nil {
return dyn.InvalidValue, diag.Errorf("failed to load locations: %s", err)
}
output, outputDiags := loadOutputFile(opts.bundleRootPath, outputPath, locations)
pythonDiagnostics = pythonDiagnostics.Extend(outputDiags)
// we pass through pythonDiagnostic because it contains warnings
@ -266,7 +291,21 @@ func writeInputFile(inputPath string, input dyn.Value) error {
return os.WriteFile(inputPath, rootConfigJson, 0600)
}
func loadOutputFile(rootPath string, outputPath string) (dyn.Value, diag.Diagnostics) {
// loadLocationsFile loads locations.json containing source locations for generated YAML.
func loadLocationsFile(locationsPath string) (*pythonLocations, error) {
locationsFile, err := os.Open(locationsPath)
if errors.Is(err, fs.ErrNotExist) {
return newPythonLocations(), nil
} else if err != nil {
return nil, fmt.Errorf("failed to open locations file: %w", err)
}
defer locationsFile.Close()
return parsePythonLocations(locationsFile)
}
func loadOutputFile(rootPath string, outputPath string, locations *pythonLocations) (dyn.Value, diag.Diagnostics) {
outputFile, err := os.Open(outputPath)
if err != nil {
return dyn.InvalidValue, diag.FromErr(fmt.Errorf("failed to open output file: %w", err))
@ -277,12 +316,12 @@ func loadOutputFile(rootPath string, outputPath string) (dyn.Value, diag.Diagnos
// we need absolute path because later parts of pipeline assume all paths are absolute
// and this file will be used as location to resolve relative paths.
//
// virtualPath has to stay in rootPath, because locations outside root path are not allowed:
// virtualPath has to stay in bundleRootPath, because locations outside root path are not allowed:
//
// Error: path /var/folders/.../pydabs/dist/*.whl is not contained in bundle root path
//
// for that, we pass virtualPath instead of outputPath as file location
virtualPath, err := filepath.Abs(filepath.Join(rootPath, "__generated_by_pydabs__.yml"))
virtualPath, err := filepath.Abs(filepath.Join(rootPath, generatedFileName))
if err != nil {
return dyn.InvalidValue, diag.FromErr(fmt.Errorf("failed to get absolute path: %w", err))
}
@ -292,7 +331,25 @@ func loadOutputFile(rootPath string, outputPath string) (dyn.Value, diag.Diagnos
return dyn.InvalidValue, diag.FromErr(fmt.Errorf("failed to parse output file: %w", err))
}
return strictNormalize(config.Root{}, generated)
// Paths are resolved relative to the locations of their values. If we changed a location,
// we would have to update each path. Until we simplify that, we don't update locations
// for such values, so path resolution stays unchanged.
_, err = paths.VisitJobPaths(generated, func(p dyn.Path, kind paths.PathKind, v dyn.Value) (dyn.Value, error) {
putPythonLocation(locations, p, v.Location())
return v, nil
})
if err != nil {
return dyn.InvalidValue, diag.FromErr(fmt.Errorf("failed to update locations: %w", err))
}
// generated has dyn.Locations as if it came from the generated YAML file;
// earlier we loaded locations.json with the source locations in the Python code
generatedWithLocations, err := mergePythonLocations(generated, locations)
if err != nil {
return dyn.InvalidValue, diag.FromErr(fmt.Errorf("failed to update locations: %w", err))
}
return strictNormalize(config.Root{}, generatedWithLocations)
}
func strictNormalize(dst any, generated dyn.Value) (dyn.Value, diag.Diagnostics) {

View File

@ -6,7 +6,6 @@ import (
"os"
"os/exec"
"path/filepath"
"reflect"
"runtime"
"testing"
@ -47,6 +46,7 @@ func TestPythonMutator_load(t *testing.T) {
pydabs:
enabled: true
venv_path: .venv
load_locations: true
resources:
jobs:
job0:
@ -65,7 +65,8 @@ func TestPythonMutator_load(t *testing.T) {
"experimental": {
"pydabs": {
"enabled": true,
"venv_path": ".venv"
"venv_path": ".venv",
"load_locations": true
}
},
"resources": {
@ -80,6 +81,8 @@ func TestPythonMutator_load(t *testing.T) {
}
}`,
`{"severity": "warning", "summary": "job doesn't have any tasks", "location": {"file": "src/examples/file.py", "line": 10, "column": 5}}`,
`{"path": "resources.jobs.job0", "file": "src/examples/job0.py", "line": 3, "column": 5}
{"path": "resources.jobs.job1", "file": "src/examples/job1.py", "line": 5, "column": 7}`,
)
mutator := PythonMutator(PythonMutatorPhaseLoad)
@ -97,6 +100,25 @@ func TestPythonMutator_load(t *testing.T) {
assert.Equal(t, "job_1", job1.Name)
}
// output of locations.json should be applied to underlying dyn.Value
err := b.Config.Mutate(func(v dyn.Value) (dyn.Value, error) {
name1, err := dyn.GetByPath(v, dyn.MustPathFromString("resources.jobs.job1.name"))
if err != nil {
return dyn.InvalidValue, err
}
assert.Equal(t, []dyn.Location{
{
File: "src/examples/job1.py",
Line: 5,
Column: 7,
},
}, name1.Locations())
return v, nil
})
assert.NoError(t, err)
assert.Equal(t, 1, len(diags))
assert.Equal(t, "job doesn't have any tasks", diags[0].Summary)
assert.Equal(t, []dyn.Location{
@ -106,7 +128,6 @@ func TestPythonMutator_load(t *testing.T) {
Column: 5,
},
}, diags[0].Locations)
}
func TestPythonMutator_load_disallowed(t *testing.T) {
@ -146,7 +167,7 @@ func TestPythonMutator_load_disallowed(t *testing.T) {
}
}
}
}`, "")
}`, "", "")
mutator := PythonMutator(PythonMutatorPhaseLoad)
diag := bundle.Apply(ctx, b, mutator)
@ -191,7 +212,7 @@ func TestPythonMutator_init(t *testing.T) {
}
}
}
}`, "")
}`, "", "")
mutator := PythonMutator(PythonMutatorPhaseInit)
diag := bundle.Apply(ctx, b, mutator)
@ -252,7 +273,7 @@ func TestPythonMutator_badOutput(t *testing.T) {
}
}
}
}`, "")
}`, "", "")
mutator := PythonMutator(PythonMutatorPhaseLoad)
diag := bundle.Apply(ctx, b, mutator)
@ -588,7 +609,7 @@ or activate the environment before running CLI commands:
assert.Equal(t, expected, out)
}
func withProcessStub(t *testing.T, args []string, output string, diagnostics string) context.Context {
func withProcessStub(t *testing.T, args []string, output string, diagnostics string, locations string) context.Context {
ctx := context.Background()
ctx, stub := process.WithStub(ctx)
@ -600,32 +621,51 @@ func withProcessStub(t *testing.T, args []string, output string, diagnostics str
inputPath := filepath.Join(cacheDir, "input.json")
outputPath := filepath.Join(cacheDir, "output.json")
locationsPath := filepath.Join(cacheDir, "locations.json")
diagnosticsPath := filepath.Join(cacheDir, "diagnostics.json")
args = append(args, "--input", inputPath)
args = append(args, "--output", outputPath)
args = append(args, "--diagnostics", diagnosticsPath)
stub.WithCallback(func(actual *exec.Cmd) error {
_, err := os.Stat(inputPath)
assert.NoError(t, err)
if reflect.DeepEqual(actual.Args, args) {
err := os.WriteFile(outputPath, []byte(output), 0600)
actualInputPath := getArg(actual.Args, "--input")
actualOutputPath := getArg(actual.Args, "--output")
actualDiagnosticsPath := getArg(actual.Args, "--diagnostics")
actualLocationsPath := getArg(actual.Args, "--locations")
require.Equal(t, inputPath, actualInputPath)
require.Equal(t, outputPath, actualOutputPath)
require.Equal(t, diagnosticsPath, actualDiagnosticsPath)
// locations is an optional argument
if locations != "" {
require.Equal(t, locationsPath, actualLocationsPath)
err = os.WriteFile(locationsPath, []byte(locations), 0600)
require.NoError(t, err)
}
err = os.WriteFile(outputPath, []byte(output), 0600)
require.NoError(t, err)
err = os.WriteFile(diagnosticsPath, []byte(diagnostics), 0600)
require.NoError(t, err)
return nil
} else {
return fmt.Errorf("unexpected command: %v", actual.Args)
}
})
return ctx
}
func getArg(args []string, name string) string {
for i := 0; i < len(args); i++ {
if args[i] == name {
return args[i+1]
}
}
return ""
}
func loadYaml(name string, content string) *bundle.Bundle {
v, diag := config.LoadFromBytes(name, []byte(content))

View File

@ -11,6 +11,7 @@ import (
"strings"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/dyn"
"github.com/databricks/cli/libs/notebook"
@ -103,8 +104,13 @@ func (t *translateContext) rewritePath(
return fmt.Errorf("path %s is not contained in sync root path", localPath)
}
// Prefix remote path with its remote root path.
remotePath := path.Join(t.b.Config.Workspace.FilePath, filepath.ToSlash(localRelPath))
var workspacePath string
if config.IsExplicitlyEnabled(t.b.Config.Presets.SourceLinkedDeployment) {
workspacePath = t.b.SyncRootPath
} else {
workspacePath = t.b.Config.Workspace.FilePath
}
remotePath := path.Join(workspacePath, filepath.ToSlash(localRelPath))
// Convert local path into workspace path via specified function.
interp, err := fn(*p, localPath, localRelPath, remotePath)

View File

@ -4,6 +4,7 @@ import (
"context"
"os"
"path/filepath"
"runtime"
"strings"
"testing"
@ -787,3 +788,163 @@ func TestTranslatePathWithComplexVariables(t *testing.T) {
b.Config.Resources.Jobs["job"].Tasks[0].Libraries[0].Whl,
)
}
func TestTranslatePathsWithSourceLinkedDeployment(t *testing.T) {
if runtime.GOOS == "windows" {
t.Skip("this test is not applicable on Windows because source-linked mode works only in the Databricks Workspace")
}
dir := t.TempDir()
touchNotebookFile(t, filepath.Join(dir, "my_job_notebook.py"))
touchNotebookFile(t, filepath.Join(dir, "my_pipeline_notebook.py"))
touchEmptyFile(t, filepath.Join(dir, "my_python_file.py"))
touchEmptyFile(t, filepath.Join(dir, "dist", "task.jar"))
touchEmptyFile(t, filepath.Join(dir, "requirements.txt"))
enabled := true
b := &bundle.Bundle{
SyncRootPath: dir,
SyncRoot: vfs.MustNew(dir),
Config: config.Root{
Workspace: config.Workspace{
FilePath: "/bundle",
},
Resources: config.Resources{
Jobs: map[string]*resources.Job{
"job": {
JobSettings: &jobs.JobSettings{
Tasks: []jobs.Task{
{
NotebookTask: &jobs.NotebookTask{
NotebookPath: "my_job_notebook.py",
},
Libraries: []compute.Library{
{Whl: "./dist/task.whl"},
},
},
{
NotebookTask: &jobs.NotebookTask{
NotebookPath: "/Users/jane.doe@databricks.com/absolute_remote.py",
},
},
{
NotebookTask: &jobs.NotebookTask{
NotebookPath: "my_job_notebook.py",
},
Libraries: []compute.Library{
{Requirements: "requirements.txt"},
},
},
{
SparkPythonTask: &jobs.SparkPythonTask{
PythonFile: "my_python_file.py",
},
},
{
SparkJarTask: &jobs.SparkJarTask{
MainClassName: "HelloWorld",
},
Libraries: []compute.Library{
{Jar: "./dist/task.jar"},
},
},
{
SparkJarTask: &jobs.SparkJarTask{
MainClassName: "HelloWorldRemote",
},
Libraries: []compute.Library{
{Jar: "dbfs:/bundle/dist/task_remote.jar"},
},
},
},
},
},
},
Pipelines: map[string]*resources.Pipeline{
"pipeline": {
PipelineSpec: &pipelines.PipelineSpec{
Libraries: []pipelines.PipelineLibrary{
{
Notebook: &pipelines.NotebookLibrary{
Path: "my_pipeline_notebook.py",
},
},
{
Notebook: &pipelines.NotebookLibrary{
Path: "/Users/jane.doe@databricks.com/absolute_remote.py",
},
},
{
File: &pipelines.FileLibrary{
Path: "my_python_file.py",
},
},
},
},
},
},
},
Presets: config.Presets{
SourceLinkedDeployment: &enabled,
},
},
}
bundletest.SetLocation(b, ".", []dyn.Location{{File: filepath.Join(dir, "resource.yml")}})
diags := bundle.Apply(context.Background(), b, mutator.TranslatePaths())
require.NoError(t, diags.Error())
// updated to source path
assert.Equal(
t,
filepath.Join(dir, "my_job_notebook"),
b.Config.Resources.Jobs["job"].Tasks[0].NotebookTask.NotebookPath,
)
assert.Equal(
t,
filepath.Join(dir, "requirements.txt"),
b.Config.Resources.Jobs["job"].Tasks[2].Libraries[0].Requirements,
)
assert.Equal(
t,
filepath.Join(dir, "my_python_file.py"),
b.Config.Resources.Jobs["job"].Tasks[3].SparkPythonTask.PythonFile,
)
assert.Equal(
t,
filepath.Join(dir, "my_pipeline_notebook"),
b.Config.Resources.Pipelines["pipeline"].Libraries[0].Notebook.Path,
)
assert.Equal(
t,
filepath.Join(dir, "my_python_file.py"),
b.Config.Resources.Pipelines["pipeline"].Libraries[2].File.Path,
)
// left as is
assert.Equal(
t,
filepath.Join("dist", "task.whl"),
b.Config.Resources.Jobs["job"].Tasks[0].Libraries[0].Whl,
)
assert.Equal(
t,
"/Users/jane.doe@databricks.com/absolute_remote.py",
b.Config.Resources.Jobs["job"].Tasks[1].NotebookTask.NotebookPath,
)
assert.Equal(
t,
filepath.Join("dist", "task.jar"),
b.Config.Resources.Jobs["job"].Tasks[4].Libraries[0].Jar,
)
assert.Equal(
t,
"dbfs:/bundle/dist/task_remote.jar",
b.Config.Resources.Jobs["job"].Tasks[5].Libraries[0].Jar,
)
assert.Equal(
t,
"/Users/jane.doe@databricks.com/absolute_remote.py",
b.Config.Resources.Pipelines["pipeline"].Libraries[1].Notebook.Path,
)
}

View File

@ -17,6 +17,11 @@ type Presets struct {
// JobsMaxConcurrentRuns is the default value for the max concurrent runs of jobs.
JobsMaxConcurrentRuns int `json:"jobs_max_concurrent_runs,omitempty"`
// SourceLinkedDeployment indicates whether source-linked deployment is enabled. It works only in the Databricks Workspace.
// When set to true, resources created during deployment will point to the source files in the workspace instead of their workspace copies.
// File synchronization to ${workspace.file_path} is skipped.
SourceLinkedDeployment *bool `json:"source_linked_deployment,omitempty"`
// Tags to add to all resources.
Tags map[string]string `json:"tags,omitempty"`
}

View File

@ -7,6 +7,7 @@ import (
"io/fs"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/permissions"
"github.com/databricks/cli/libs/cmdio"
"github.com/databricks/cli/libs/diag"
@ -23,6 +24,11 @@ func (m *upload) Name() string {
}
func (m *upload) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
if config.IsExplicitlyEnabled(b.Config.Presets.SourceLinkedDeployment) {
cmdio.LogString(ctx, "Source-linked deployment is enabled. Deployed resources reference the source files in your working tree instead of separate copies.")
return nil
}
cmdio.LogString(ctx, fmt.Sprintf("Uploading bundle files to %s...", b.Config.Workspace.FilePath))
opts, err := GetSyncOptions(ctx, bundle.ReadOnly(b))
if err != nil {

View File

@ -6,6 +6,7 @@ import (
"strings"
"github.com/databricks/cli/bundle"
"github.com/databricks/cli/bundle/config"
"github.com/databricks/cli/bundle/libraries"
"github.com/databricks/cli/libs/diag"
"github.com/databricks/cli/libs/log"
@ -22,6 +23,9 @@ func WrapperWarning() bundle.Mutator {
func (m *wrapperWarning) Apply(ctx context.Context, b *bundle.Bundle) diag.Diagnostics {
if isPythonWheelWrapperOn(b) {
if config.IsExplicitlyEnabled(b.Config.Presets.SourceLinkedDeployment) {
return diag.Warningf("Python wheel notebook wrapper is not available when using source-linked deployment mode. You can disable this mode by setting 'presets.source_linked_deployment: false'")
}
return nil
}

View File

@ -3,8 +3,10 @@ package generate
import (
"bytes"
"context"
"errors"
"fmt"
"io"
"io/fs"
"os"
"path/filepath"
"testing"
@ -90,7 +92,7 @@ func TestGeneratePipelineCommand(t *testing.T) {
err := cmd.RunE(cmd, []string{})
require.NoError(t, err)
data, err := os.ReadFile(filepath.Join(configDir, "test_pipeline.yml"))
data, err := os.ReadFile(filepath.Join(configDir, "test_pipeline.pipeline.yml"))
require.NoError(t, err)
require.Equal(t, fmt.Sprintf(`resources:
pipelines:
@ -186,7 +188,123 @@ func TestGenerateJobCommand(t *testing.T) {
err := cmd.RunE(cmd, []string{})
require.NoError(t, err)
data, err := os.ReadFile(filepath.Join(configDir, "test_job.yml"))
data, err := os.ReadFile(filepath.Join(configDir, "test_job.job.yml"))
require.NoError(t, err)
require.Equal(t, fmt.Sprintf(`resources:
jobs:
test_job:
name: test-job
job_clusters:
- new_cluster:
custom_tags:
"Tag1": "24X7-1234"
- new_cluster:
spark_conf:
"spark.databricks.delta.preview.enabled": "true"
tasks:
- task_key: notebook_task
notebook_task:
notebook_path: %s
parameters:
- name: empty
default: ""
`, filepath.Join("..", "src", "notebook.py")), string(data))
data, err = os.ReadFile(filepath.Join(srcDir, "notebook.py"))
require.NoError(t, err)
require.Equal(t, "# Databricks notebook source\nNotebook content", string(data))
}
func touchEmptyFile(t *testing.T, path string) {
err := os.MkdirAll(filepath.Dir(path), 0700)
require.NoError(t, err)
f, err := os.Create(path)
require.NoError(t, err)
f.Close()
}
func TestGenerateJobCommandOldFileRename(t *testing.T) {
cmd := NewGenerateJobCommand()
root := t.TempDir()
b := &bundle.Bundle{
BundleRootPath: root,
}
m := mocks.NewMockWorkspaceClient(t)
b.SetWorkpaceClient(m.WorkspaceClient)
jobsApi := m.GetMockJobsAPI()
jobsApi.EXPECT().Get(mock.Anything, jobs.GetJobRequest{JobId: 1234}).Return(&jobs.Job{
Settings: &jobs.JobSettings{
Name: "test-job",
JobClusters: []jobs.JobCluster{
{NewCluster: compute.ClusterSpec{
CustomTags: map[string]string{
"Tag1": "24X7-1234",
},
}},
{NewCluster: compute.ClusterSpec{
SparkConf: map[string]string{
"spark.databricks.delta.preview.enabled": "true",
},
}},
},
Tasks: []jobs.Task{
{
TaskKey: "notebook_task",
NotebookTask: &jobs.NotebookTask{
NotebookPath: "/test/notebook",
},
},
},
Parameters: []jobs.JobParameterDefinition{
{
Name: "empty",
Default: "",
},
},
},
}, nil)
workspaceApi := m.GetMockWorkspaceAPI()
workspaceApi.EXPECT().GetStatusByPath(mock.Anything, "/test/notebook").Return(&workspace.ObjectInfo{
ObjectType: workspace.ObjectTypeNotebook,
Language: workspace.LanguagePython,
Path: "/test/notebook",
}, nil)
notebookContent := io.NopCloser(bytes.NewBufferString("# Databricks notebook source\nNotebook content"))
workspaceApi.EXPECT().Download(mock.Anything, "/test/notebook", mock.Anything).Return(notebookContent, nil)
cmd.SetContext(bundle.Context(context.Background(), b))
cmd.Flag("existing-job-id").Value.Set("1234")
configDir := filepath.Join(root, "resources")
cmd.Flag("config-dir").Value.Set(configDir)
srcDir := filepath.Join(root, "src")
cmd.Flag("source-dir").Value.Set(srcDir)
var key string
cmd.Flags().StringVar(&key, "key", "test_job", "")
// Create an old generated file first
oldFilename := filepath.Join(configDir, "test_job.yml")
touchEmptyFile(t, oldFilename)
// Having an existing file requires the --force flag to regenerate it
cmd.Flag("force").Value.Set("true")
err := cmd.RunE(cmd, []string{})
require.NoError(t, err)
// Make sure the old file does not exist after the run
_, err = os.Stat(oldFilename)
require.True(t, errors.Is(err, fs.ErrNotExist))
data, err := os.ReadFile(filepath.Join(configDir, "test_job.job.yml"))
require.NoError(t, err)
require.Equal(t, fmt.Sprintf(`resources:

View File

@ -1,7 +1,9 @@
package generate
import (
"errors"
"fmt"
"io/fs"
"os"
"path/filepath"
@ -83,7 +85,17 @@ func NewGenerateJobCommand() *cobra.Command {
return err
}
filename := filepath.Join(configDir, fmt.Sprintf("%s.yml", jobKey))
oldFilename := filepath.Join(configDir, fmt.Sprintf("%s.yml", jobKey))
filename := filepath.Join(configDir, fmt.Sprintf("%s.job.yml", jobKey))
// Users might repeatedly run the generate command to update their bundle jobs with changes made in the Databricks UI.
// Because the generated file names have changed, we first rename the existing resource file to the new name.
// Otherwise, users can end up with duplicated resources.
err = os.Rename(oldFilename, filename)
if err != nil && !errors.Is(err, fs.ErrNotExist) {
return fmt.Errorf("failed to rename file %s. DABs uses the resource type as a sub-extension for generated content, please rename it to %s, err: %w", oldFilename, filename, err)
}
saver := yamlsaver.NewSaverWithStyle(map[string]yaml.Style{
// Including all JobSettings and nested fields which are map[string]string type
"spark_conf": yaml.DoubleQuotedStyle,

View File

@ -1,7 +1,9 @@
package generate
import (
"errors"
"fmt"
"io/fs"
"os"
"path/filepath"
@ -83,7 +85,17 @@ func NewGeneratePipelineCommand() *cobra.Command {
return err
}
filename := filepath.Join(configDir, fmt.Sprintf("%s.yml", pipelineKey))
oldFilename := filepath.Join(configDir, fmt.Sprintf("%s.yml", pipelineKey))
filename := filepath.Join(configDir, fmt.Sprintf("%s.pipeline.yml", pipelineKey))
// Users might repeatedly run the generate command to update their bundle pipelines with changes made in the Databricks UI.
// Because the generated file names have changed, we first rename the existing resource file to the new name.
// Otherwise, users can end up with duplicated resources.
err = os.Rename(oldFilename, filename)
if err != nil && !errors.Is(err, fs.ErrNotExist) {
return fmt.Errorf("failed to rename file %s. DABs uses the resource type as a sub-extension for generated content, please rename it to %s, err: %w", oldFilename, filename, err)
}
saver := yamlsaver.NewSaverWithStyle(
// Including all PipelineSpec and nested fields which are map[string]string type
map[string]yaml.Style{

View File

@ -1,6 +1,7 @@
package bundle
import (
"context"
"errors"
"fmt"
"io/fs"
@ -11,6 +12,8 @@ import (
"github.com/databricks/cli/cmd/root"
"github.com/databricks/cli/libs/cmdio"
"github.com/databricks/cli/libs/dbr"
"github.com/databricks/cli/libs/filer"
"github.com/databricks/cli/libs/git"
"github.com/databricks/cli/libs/template"
"github.com/spf13/cobra"
@ -147,6 +150,26 @@ func repoName(url string) string {
return parts[len(parts)-1]
}
func constructOutputFiler(ctx context.Context, outputDir string) (filer.Filer, error) {
outputDir, err := filepath.Abs(outputDir)
if err != nil {
return nil, err
}
// If the CLI is running on DBR and we're writing to the workspace file system,
// use the extension-aware workspace filesystem filer to instantiate the template.
//
// It is not possible to write notebooks through the workspace filesystem's FUSE mount.
// Therefore this is the only way we can initialize templates that contain notebooks
// when running the CLI on DBR and initializing a template to the workspace.
//
if strings.HasPrefix(outputDir, "/Workspace/") && dbr.RunsOnRuntime(ctx) {
return filer.NewWorkspaceFilesExtensionsClient(root.WorkspaceClient(ctx), outputDir)
}
return filer.NewLocalClient(outputDir)
}
func newInitCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "init [TEMPLATE_PATH]",
@ -201,6 +224,11 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
templatePath = getNativeTemplateByDescription(description)
}
outputFiler, err := constructOutputFiler(ctx, outputDir)
if err != nil {
return err
}
if templatePath == customTemplate {
cmdio.LogString(ctx, "Please specify a path or Git repository to use a custom template.")
cmdio.LogString(ctx, "See https://docs.databricks.com/en/dev-tools/bundles/templates.html to learn more about custom templates.")
@ -230,7 +258,7 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
// skip downloading the repo because input arg is not a URL. We assume
// it's a path on the local file system in that case
return template.Materialize(ctx, configFile, templateFS, outputDir)
return template.Materialize(ctx, configFile, templateFS, outputFiler)
}
// Create a temporary directory with the name of the repository. The '*'
@ -255,7 +283,7 @@ See https://docs.databricks.com/en/dev-tools/bundles/templates.html for more inf
// Clean up downloaded repository once the template is materialized.
defer os.RemoveAll(repoDir)
templateFS := os.DirFS(filepath.Join(repoDir, templateDir))
return template.Materialize(ctx, configFile, templateFS, outputDir)
return template.Materialize(ctx, configFile, templateFS, outputFiler)
}
return cmd
}

View File

@ -166,7 +166,7 @@ func TestAccGenerateAndBind(t *testing.T) {
_, err = os.Stat(filepath.Join(bundleRoot, "src", "test.py"))
require.NoError(t, err)
matches, err := filepath.Glob(filepath.Join(bundleRoot, "resources", "test_job_key.yml"))
matches, err := filepath.Glob(filepath.Join(bundleRoot, "resources", "test_job_key.job.yml"))
require.NoError(t, err)
require.Len(t, matches, 1)

View File

@ -11,6 +11,11 @@
"node_type_id": {
"type": "string",
"description": "Node type id for job cluster"
},
"root_path": {
"type": "string",
"description": "Root path to deploy bundle to",
"default": ""
}
}
}

View File

@ -2,7 +2,11 @@ bundle:
name: basic
workspace:
{{ if .root_path }}
root_path: "{{.root_path}}/.bundle/{{.unique_id}}"
{{ else }}
root_path: "~/.bundle/{{.unique_id}}"
{{ end }}
resources:
jobs:

View File

@ -1,2 +0,0 @@
bundle:
name: abc

View File

@ -1,5 +1,8 @@
bundle:
name: "bundle-playground"
name: recreate-pipeline
workspace:
root_path: "~/.bundle/{{.unique_id}}"
variables:
catalog:

View File

@ -1,5 +1,8 @@
bundle:
name: "bundle-playground"
name: uc-schema
workspace:
root_path: "~/.bundle/{{.unique_id}}"
resources:
pipelines:

View File

@ -0,0 +1,38 @@
package bundle
import (
"fmt"
"testing"
"github.com/databricks/cli/internal"
"github.com/databricks/cli/internal/acc"
"github.com/databricks/cli/libs/env"
"github.com/google/uuid"
"github.com/stretchr/testify/require"
)
func TestAccDeployBasicToSharedWorkspacePath(t *testing.T) {
ctx, wt := acc.WorkspaceTest(t)
nodeTypeId := internal.GetNodeTypeId(env.Get(ctx, "CLOUD_ENV"))
uniqueId := uuid.New().String()
currentUser, err := wt.W.CurrentUser.Me(ctx)
require.NoError(t, err)
bundleRoot, err := initTestTemplate(t, ctx, "basic", map[string]any{
"unique_id": uniqueId,
"node_type_id": nodeTypeId,
"spark_version": defaultSparkVersion,
"root_path": fmt.Sprintf("/Shared/%s", currentUser.UserName),
})
require.NoError(t, err)
t.Cleanup(func() {
err = destroyBundle(wt.T, ctx, bundleRoot)
require.NoError(wt.T, err)
})
err = deployBundle(wt.T, ctx, bundleRoot)
require.NoError(wt.T, err)
}

View File

@ -16,6 +16,7 @@ import (
"github.com/databricks/cli/internal"
"github.com/databricks/cli/libs/cmdio"
"github.com/databricks/cli/libs/env"
"github.com/databricks/cli/libs/filer"
"github.com/databricks/cli/libs/flags"
"github.com/databricks/cli/libs/template"
"github.com/databricks/cli/libs/vfs"
@ -42,7 +43,9 @@ func initTestTemplateWithBundleRoot(t *testing.T, ctx context.Context, templateN
cmd := cmdio.NewIO(flags.OutputJSON, strings.NewReader(""), os.Stdout, os.Stderr, "", "bundles")
ctx = cmdio.InContext(ctx, cmd)
err = template.Materialize(ctx, configFilePath, os.DirFS(templateRoot), bundleRoot)
out, err := filer.NewLocalClient(bundleRoot)
require.NoError(t, err)
err = template.Materialize(ctx, configFilePath, os.DirFS(templateRoot), out)
return bundleRoot, err
}

View File

@ -21,8 +21,8 @@ const schemaFileName = "databricks_template_schema.json"
// ctx: context containing a cmdio object. This is used to prompt the user
// configFilePath: file path containing user defined config values
// templateFS: root of the template definition
// outputDir: root of directory where to initialize the template
func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, outputDir string) error {
// outputFiler: filer to use for writing the initialized template
func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, outputFiler filer.Filer) error {
if _, err := fs.Stat(templateFS, schemaFileName); errors.Is(err, fs.ErrNotExist) {
return fmt.Errorf("not a bundle template: expected to find a template schema file at %s", schemaFileName)
}
@ -73,12 +73,7 @@ func Materialize(ctx context.Context, configFilePath string, templateFS fs.FS, o
return err
}
out, err := filer.NewLocalClient(outputDir)
if err != nil {
return err
}
err = r.persistToDisk(ctx, out)
err = r.persistToDisk(ctx, outputFiler)
if err != nil {
return err
}

View File

@ -19,6 +19,6 @@ func TestMaterializeForNonTemplateDirectory(t *testing.T) {
ctx := root.SetWorkspaceClient(context.Background(), w)
// Try to materialize a non-template directory.
err = Materialize(ctx, "", os.DirFS(tmpDir), "")
err = Materialize(ctx, "", os.DirFS(tmpDir), nil)
assert.EqualError(t, err, fmt.Sprintf("not a bundle template: expected to find a template schema file at %s", schemaFileName))
}