Commit Graph

418 Commits

Author SHA1 Message Date
Shreyas Goenka ee9499bc68
use write testutil 2025-01-03 00:01:55 +05:30
Shreyas Goenka f4623ebbb9
cleanup 2025-01-02 23:56:58 +05:30
Shreyas Goenka ac37ca0d98
- 2025-01-02 18:54:35 +05:30
Shreyas Goenka 7ab9fb7cec
simplify copy 2025-01-02 18:53:14 +05:30
Shreyas Goenka 9552131a2a
lint 2025-01-02 18:11:35 +05:30
Shreyas Goenka 583637aed6
lint 2025-01-02 18:08:14 +05:30
Shreyas Goenka 6991dea00b
make content length work 2025-01-02 17:59:53 +05:30
Shreyas Goenka 1e2545eedf
merge 2025-01-02 17:21:17 +05:30
Shreyas Goenka f70c47253e
calculate content length before upload 2025-01-02 17:20:22 +05:30
Shreyas Goenka 890b48f70d
Reapply "add streaming uploads"
This reverts commit 8ec1e0746d.
2025-01-02 16:55:56 +05:30
Denis Bilenko 0b80784df7
Enable testifylint and fix the issues (#2065)
## Changes
- Enable new linter: testifylint.
- Apply fixes with --fix.
- Fix remaining issues (mostly with aider).

There were 2 cases we --fix did the wrong thing - this seems to a be a
bug in linter: https://github.com/Antonboom/testifylint/issues/210

Nonetheless, I kept that check enabled, it seems useful, just need to be
fixed manually after autofix.

## Tests
Existing tests
2025-01-02 12:03:41 +01:00
Denis Bilenko 3f75240a56
Improve test output to include correct location (#2058)
## Changes
- Add t.Helper() in testcli-related helpers, this ensures that output is
attributed correctly to test case and not to the helper.
- Modify testlcli.Run() to run process in foreground. This is needed for
t.Helper to work.
- Extend a few assertions with message to help attribute it to proper
helper where needed.

## Tests
Manually reviewed test output.

Before:

```
+ go test --timeout 3h -v -run TestDefaultPython/3.9 ./integration/bundle/
=== RUN   TestDefaultPython
=== RUN   TestDefaultPython/3.9
    workspace.go:26: aws
    golden.go:14: run args: [bundle, init, default-python, --config-file, config.json]
    runner.go:206: [databricks stderr]:
    runner.go:206: [databricks stderr]: Welcome to the default Python template for Databricks Asset Bundles!
...
    testdiff.go:23:
                Error Trace:    /Users/denis.bilenko/work/cli/libs/testdiff/testdiff.go:23
                                                        /Users/denis.bilenko/work/cli/libs/testdiff/golden.go:43
                                                        /Users/denis.bilenko/work/cli/internal/testcli/golden.go:23
                                                        /Users/denis.bilenko/work/cli/integration/bundle/init_default_python_test.go:92
                                                        /Users/denis.bilenko/work/cli/integration/bundle/init_default_python_test.go:45
...
```

After:

```
+ go test --timeout 3h -v -run TestDefaultPython/3.9 ./integration/bundle/
=== RUN   TestDefaultPython
=== RUN   TestDefaultPython/3.9
    init_default_python_test.go:51: CLOUD_ENV=aws
    init_default_python_test.go:92:   args: bundle, init, default-python, --config-file, config.json
    init_default_python_test.go:92: stderr:
    init_default_python_test.go:92: stderr: Welcome to the default Python template for Databricks Asset Bundles!
...
    init_default_python_test.go:92:
                Error Trace:    /Users/denis.bilenko/work/cli/libs/testdiff/testdiff.go:24
                                                        /Users/denis.bilenko/work/cli/libs/testdiff/golden.go:46
                                                        /Users/denis.bilenko/work/cli/internal/testcli/golden.go:23
                                                        /Users/denis.bilenko/work/cli/integration/bundle/init_default_python_test.go:92
                                                        /Users/denis.bilenko/work/cli/integration/bundle/init_default_python_test.go:45
...
```
2025-01-02 10:49:21 +01:00
Shreyas Goenka 8ec1e0746d
Revert "add streaming uploads"
This reverts commit 95f41b1f30.
2025-01-02 12:11:38 +05:30
Shreyas Goenka 95f41b1f30
add streaming uploads 2025-01-02 11:59:26 +05:30
Shreyas Goenka 92e97ad413
- 2024-12-31 17:11:39 +05:30
Shreyas Goenka 69fdd9736b
reduce diff 2024-12-31 17:10:19 +05:30
Shreyas Goenka ee8017357e
fix size 2024-12-31 16:07:53 +05:30
Shreyas Goenka 7084392a0f
cleanup code 2024-12-31 13:40:10 +05:30
Shreyas Goenka be62ead7be
lint 2024-12-31 13:31:38 +05:30
Shreyas Goenka cf51636faa
overwrite fix 2024-12-31 13:28:36 +05:30
Shreyas Goenka 9d8ba099ba
fix fd lingering' 2024-12-31 12:47:47 +05:30
Shreyas Goenka 932aeee349
ignore linter 2024-12-31 12:46:48 +05:30
Shreyas Goenka 63e599ccb2
added integration test 2024-12-31 12:31:37 +05:30
Shreyas Goenka 0ce50fadf0
add unit test 2024-12-31 12:09:24 +05:30
Shreyas Goenka 78b9788bc6
some cleanup 2024-12-30 22:43:45 +05:30
Shreyas Goenka da46c142e0
Merge remote-tracking branch 'origin' into multipart-dbfs 2024-12-30 21:20:17 +05:30
Denis Bilenko 261b7f4083
Move bulk of "golden tests" logic to libs/testdiff (#2054)
## Changes
- Detach "golden files" assertions from testcli runner. Now any output
can be compared, no matter how it is obtained.
- Move those assertion to libs/testdiff package.

This allows using "golden files" in non-integration tests.

## Tests
Existing tests
2024-12-30 15:26:21 +00:00
Lennart Kats (databricks) a002475a6a
Relax checks in builtin template tests (#2042)
## Changes
Relax the checks of `lib/template/builtin_test` so they don't fail for a
local development copy that has uncommitted draft templates. Right now
these tests fail because I have some git-ignored uncommitted templates
in my local dev copy.
2024-12-27 11:38:12 +00:00
Denis Bilenko e0952491c9
Add tests for default-python template on different Python versions (#2025)
## Changes
Add new type of test helpers that run the command and compare full
output (golden files approach).

In case of JSON, there is also an option to ignore certain paths.

Add test for different versions of Python to go through bundle init
default-python / validate / deploy / summary.

## Tests
New integration tests.
2024-12-20 14:40:54 +00:00
Denis Bilenko 2fee243586
Fix finding Python within virtualenv on Windows (#2034)
## Changes
Simplify logic for selecting Python to run when calculating default whl
build command: "python" on Windows and "python3" everywhere.

Python installers from python.org do not install python3.exe. In
virtualenv there is no python3.exe.

## Tests
Added new unit tests to create real venv with uv and simulate activation
by prepending venv/bin to PATH.
2024-12-20 07:45:32 +00:00
Pieter Noordhuis 965a3fcd53
Remove dependency on ghodss/yaml (#2032)
## Changes

I noticed that #1957 took a dep on this library even though we no longer
need it. This change removes the dep and cleans up other (unused) uses
of the library. We originally relied on this library to deserialize
bundle configuration and JSON payloads to non-bundle CLI commands.

Relevant commits:
* The YAML flag was added to support apps (very early), and is not
longer used: e408b701
* First use for bundle configuration loading: e47fa619
* Switch bundle configuration loading to use `libs/dyn`: 87dd46a3

## Tests

The build works without the dependency.
2024-12-19 08:23:05 +00:00
Ilya Kuznetsov 042c8d88c6
Custom annotations for bundle-specific JSON schema fields (#1957)
## Changes

Adds annotations to json-schema for fields which are not covered by
OpenAPI spec.

Custom descriptions were copy-pasted from documentation PR which is
still WIP so descriptions for some fields are missing

Further improvements:
* documentation autogen based on json-schema
* fix missing descriptions

## Tests

This script is not part of CLI package so I didn't test all corner
cases. Few high-level tests were added to be sure that schema
annotations is in sync with actual config

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-12-18 10:19:14 +00:00
Pieter Noordhuis 70b7bbfd81
Remove calls to `t.Setenv` from integration tests (#2018)
## Changes

The `Setenv` helper function configures an environment variable and
resets it to its original value when exiting the test scope. It is
incompatible with running tests in parallel because it modifies
process-wide state. The `libs/env` package defines functions to interact
with the environment but records `Setenv` calls on a `context.Context`.
This enables us to override/specialize the environment scoped to a
context.

Pre-requisites for removing the `t.Setenv` calls:
* Make `cmdio.NewIO` accept a context and use it with `libs/env`
* Make all `internal/testcli` functions use a context

The rest of this change:
* Modifies integration tests to initialize a context to use if there
wasn't already one
* Updates `t.Setenv` calls to use `env.Set`

## Tests

n/a
2024-12-16 12:34:37 +01:00
Ilia Babanov daf0f48143
Remove unused vscode settings in the templates (#2013)
## Changes
VSCode extension no longer uses `databricks.python.envFile ` setting.
And older extension versions will use the same default value anyway.

## Tests
None
2024-12-13 16:13:21 +00:00
Denis Bilenko 2e018cfaec
Enable gofumpt and goimports in golangci-lint (#1999)
## Changes
Enable gofumpt and goimports in golangci-lint and apply autofix.

This makes 'make fmt' redundant, will be cleaned up in follow up diff.

## Tests
Existing tests.
2024-12-12 10:28:42 +01:00
Denis Bilenko 592474880d
Enable 'govet' linter; expand log/diag with non-f functions (#1996)
## Changes
Fix all the govet-found issues and enable govet linter.

This prompts adding non-formatting variants of logging functions (Errorf
-> Error).

## Tests
Existing tests.
2024-12-11 16:42:03 +00:00
Denis Bilenko 8d5351c1c3
Enable errcheck everywhere and fix or silent remaining issues (#1987)
## Changes
Enable errcheck linter for the whole codebase.

Fix remaining complaints:
- If we can propagate error to caller, do that
- If we writing to stdout, continue ignoring errors (to avoid crashing
in "cli | head" case)
- Add exception for cobra non-critical API such as
MarkHidden/MarkDeprecated/RegisterFlagCompletionFunc. This keeps current
code and behaviour, to be decided later if we want to change this.
- Continue ignoring errors where that is desired behaviour (e.g.
git.loadConfig).
- Continue ignoring errors where panicking seems riskier than ignoring
the error.
- Annotate cases in libs/dyn with //nolint:errcheck - to be addressed
later.

Note, this PR is not meant to come up with the best strategy for each
case, but to be a relative safe change to enable errcheck linter.
  
## Tests
Existing tests.
2024-12-11 13:26:00 +01:00
Denis Bilenko 4236e7122f
Switch to `folders.FindDirWithLeaf` (#1963)
## Changes
Remove two duplicate implementations of the same logic, switch
everywhere to folders.FindDirWithLeaf.

Add Abs() call to FindDirWithLeaf, it cannot really work on relative
paths.

## Tests
Existing tests.
2024-12-11 09:44:22 +01:00
Denis Bilenko 1b2be1b2cb
Add error checking in tests and enable errcheck there (#1980)
## Changes
Fix all errcheck-found issues in tests and test helpers. Mostly this
done by adding require.NoError(t, err), sometimes panic() where t object
is not available).

Initial change is obtained with aider+claude, then manually reviewed and
cleaned up.

## Tests
Existing tests.
2024-12-09 13:56:41 +01:00
Denis Bilenko 4c1042132b
Enable linter bodyclose (#1968)
## Changes
Enable linter '[bodyclose](https://github.com/timakin/bodyclose)' and
fix 2 cases in tests.

## Tests
Existing tests.
2024-12-05 19:11:49 +00:00
Pieter Noordhuis 6e754d4f34
Rewrite 'interface{} -> any' (#1959)
## Changes

The `any` alias for `interface{}` has been around since Go 1.18.

Now that we're using golangci-lint (#1953), we can lint on it.

Existing commits can be updated with:
```
gofmt -w -r 'interface{} -> any' .
```

## Tests

n/a
2024-12-05 15:37:24 +00:00
Shreyas Goenka f690f0a342
Merge remote-tracking branch 'origin' into multipart-dbfs 2024-12-05 15:59:07 +05:30
Denis Bilenko 0ad790e468
Properly read Git metadata when running inside workspace (#1945)
## Changes

Since there is no .git directory in Workspace file system, we need to
make an API call to api/2.0/workspace/get-status?return_git_info=true to
fetch git the root of the repo, current branch, commit and origin.

Added new function FetchRepositoryInfo that either looks up and parses
.git or calls remote API depending on env.

Refactor Repository/View/FileSet to accept repository root rather than
calculate it. This helps because:
- Repository is currently created in multiple places and finding the
repository root is becoming relatively expensive (API call needed).
- Repository/FileSet/View do not have access to current Bundle which is
where WorkplaceClient is stored.

## Tests

- Tested manually by running "bundle validate --json" inside web
terminal within Databricks env.
- Added integration tests for the new API.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-12-05 10:13:13 +00:00
Shreyas Goenka 04cae2f8cf
Merge remote-tracking branch 'origin' into multipart-dbfs 2024-12-04 20:19:59 +01:00
Shreyas Goenka 4b484fdcdc
todo 2024-12-03 00:07:14 +01:00
Shreyas Goenka 06af01c8f6
refactor to make tests easier 2024-12-03 00:05:04 +01:00
Shreyas Goenka 91a2dfa0ed
- 2024-12-02 23:49:32 +01:00
Shreyas Goenka c4cea1aeff
Use the `/dbfs/put` API endpoint to upload smaller DBFS files 2024-12-02 23:47:31 +01:00
shreyas-goenka 2847533e1e
Add DABs support for Unity Catalog volumes (#1762)
## Changes

This PR adds support for UC volumes to DABs.

### Can I use a UC volume managed by DABs in `artifact_path`?

Yes, but we require the volume to exist before being referenced in
`artifact_path`. Otherwise you'll see an error that the volume does not
exist. For this case, this PR also adds a warning if we detect that the
UC volume is defined in the DAB itself, which informs the user to deploy
the UC volume in a separate deployment first before using it in
`artifact_path`.

We cannot create the UC volume and then upload the artifacts to it in
the same `bundle deploy` because `bundle deploy` always uploads the
artifacts to `artifact_path` before materializing any resources defined
in the bundle. Supporting this in a single deployment requires us to
migrate away from our dependency on the Databricks Terraform provider to
manage the CRUD lifecycle of DABs resources.

### Why do we not support `preset.name_prefix` for UC volumes?

UC volumes will not have a `dev_shreyas_goenka` prefix added in `mode:
development`. Configuring `presets.name_prefix` will be a no-op for UC
volumes. We have decided not to support prefixing for UC resources. This
is because:
1. UC provides its own namespace hierarchy that is independent of DABs.
2. Users can always manually use `${workspace.current_user.short_name}`
to configure the prefixes manually.

Customers often manually set up a UC hierarchy for dev and prod,
including a schema or catalog per developer. Thus, it's often
unnecessary for us to add prefixing in `mode: development` by default
for UC resources.

In retrospect, supporting prefixing for UC schemas and registered models
was a mistake and will be removed in a future release of DABs.

## Tests
Unit, integration test, and manually. 

### Manual Testing cases:
 1. UC volume does not exist:
```
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/my_volume that is configured in the artifact_path: Not Found
```

2. UC Volume does not exist, but is defined in the DAB
```
➜  bundle-playground git:(master) ✗ cli bundle deploy
Error: failed to fetch metadata for the UC volume /Volumes/main/caps/managed_by_dab that is configured in the artifact_path: Not Found

Warning: You might be using a UC volume in your artifact_path that is managed by this bundle but which has not been deployed yet. Please deploy the UC volume in a separate bundle deploy before using it in the artifact_path.
  at resources.volumes.bar
  in databricks.yml:24:7

```

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-12-02 21:18:07 +00:00
shreyas-goenka f9d65f315f
Add comment for why we test two files for `bundle_uuid` (#1949)
## Changes
Addresses feedback from this thread
https://github.com/databricks/cli/pull/1947#discussion_r1865692479
2024-12-02 14:40:57 +00:00