## Changes
Before this change, the fileset library would take a single root path
and list all files in it. To support an allowlist of paths to list (much
like a Git `pathspec` without patterns; see [pathspec](pathspec)), this
change introduces an optional argument to `fileset.New` where the caller
can specify paths to list. If not specified, this argument defaults to
list `.` (i.e. list all files in the root).
The motivation for this change is that we wish to expose this pattern in
bundles. Users should be able to specify which paths to synchronize
instead of always only synchronizing the bundle root directory.
[pathspec]:
https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec
## Tests
New and existing unit tests.
## Changes
We set the origin URL as metadata in any jobs created by DABs. This PR
makes sure user credentials do not leak into the set metadata in the
job.
## Tests
Unit test
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
From the [documentation](https://pkg.go.dev/os#IsNotExist) on the
functions in the `os` package:
> This function predates errors.Is. It only supports errors returned by
the os package.
> New code should use errors.Is(err, fs.ErrNotExist).
This issue surfaced while working on using a different `vfs.Path`
implementation that uses errors from the `fs` package. Calls to
`os.IsNotExist` didn't return true for errors that wrap
`fs.ErrNotExist`.
## Tests
n/a
## Changes
Introduce `libs/vfs` for an implementation of `fs.FS` and friends that
_includes_ the absolute path it is anchored to.
This is needed for:
1. Intercepting file operations to inject custom logic (e.g., logging,
access control).
2. Traversing directories to find specific leaf directories (e.g.,
`.git`).
3. Converting virtual paths to OS-native paths.
Options 2 and 3 are not possible with the standard `fs.FS` interface.
They are needed such that we can provide an instance to the sync package
and still detect the containing `.git` directory and convert paths to
native paths.
This change focuses on making the following packages use `vfs.Path`:
* libs/fileset
* libs/git
* libs/sync
All entries returned by `fileset.All` are now slash-separated. This has
2 consequences:
* The sync snapshot now always uses slash-separated paths
* We don't need to call `filepath.FromSlash` as much as we did
## Tests
* All unit tests pass
* All integration tests pass
* Manually confirmed that a deployment made on Windows by a previous
version of the CLI can be deployed by a new version of the CLI while
retaining the validity of the local sync snapshot as well as the remote
deployment state.
## Changes
This functionality is not exercised (and will not be anytime soon).
Instead we use a map to have first party aliases for supported
templates.
1e46b9f88a/cmd/bundle/init.go (L21)
## Tests
Existing tests and manually, bundle init still works.
## Changes
This PR introduces a metadata struct that stores a subset of bundle
configuration that we wish to expose to other Databricks services that
wish to integrate with bundles.
This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.
Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note, the deployment might not exactly match
this commit version if there are changes that have not been committed to
git at deploy time,
* `file_path`: Path in workspace where we sync bundle files to.
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.
Example metadata object when bundle root and git root are the same:
```json
{
"version": 1,
"config": {
"bundle": {
"lock": {},
"git": {
"branch": "master",
"origin_url": "www.host.com",
"commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
"bundle_root_path": "."
}
},
"workspace": {
"file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
},
"resources": {
"jobs": {
"bar": {
"id": "245921165354846",
"relative_path": "databricks.yml"
}
}
},
"sync": {}
}
}
```
Example metadata when the git root is one level above the bundle repo:
```json
{
"version": 1,
"config": {
"bundle": {
"lock": {},
"git": {
"branch": "dev-branch",
"origin_url": "www.my-repo.com",
"commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
"bundle_root_path": "pipeline-progress"
}
},
"workspace": {
"file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
},
"resources": {
"jobs": {
"bar": {
"id": "245921165354846",
"relative_path": "databricks.yml"
}
}
},
"sync": {}
}
}
```
This unblocks integration to the jobs break glass UI for bundles.
## Tests
Unit tests and integration tests.
## Changes
This PR adds higher-level wrappers for calling subprocesses. One of the
steps to get https://github.com/databricks/cli/pull/637 in, as
previously discussed.
The reason to add `process.Forwarded()` is to proxy Python's `input()`
calls from a child process seamlessly. Another use-case is plugging in
`less` as a pager for the list results.
## Tests
`make test`
## Changes
Git repos hosted over HTTP do not support shallow cloning. This PR adds
retry logic if we detect shallow cloning is not supported.
Note I saw the match string `dumb http transport does not support
shallow capabilities` being reported in for different hosts on the
internet, so this should work accross a large class of git servers.
Howerver, it's not strictly necessary to have the `--depth` flag so we
can remove it if this issue is reported again.
## Tests
Tested manually. `bundle init` successfully downloads the private HTTP
repo reported during by internal user.
## Changes
The pattern `.*` in a `.gitignore` file can match `.` when walking all
files in a repository. If it does, then the walker immediately aborts
and no files are returned. The root directory (an unnamed directory)
must never be ignored.
Reported in https://github.com/databricks/databricks-vscode/issues/837.
## Tests
New tests pass.
## Changes
The functions in `libs/git/git.go` assumed global state (e.g. working
directory) and were no longer used.
This change consolidates the functionality to turn an origin URL into an
HTTPS URL.
Closes#187.
## Tests
Expanded existing unit test.
## Changes
This PR:
1. Fixes the computation logic for `ActualBranch`. An error in the
earlier logic caused the validation mutator to be a no-op.
2. Makes the `.git` string a global var. This is useful to configure in
tests.
3. Adds e2e test for the validation mutator.
## Tests
Unit test
## Changes
This PR changes the integration test to just check an error is returned
rather than asserting specific text is present in the error. This is
required because the error returned can be different based on whether
git ssh keys have been setup.
## Changes
Rename all instances of "bricks" to "databricks".
## Tests
* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
## Changes
This config block contains commit, branch and remote_url which will be
automatically loaded if specified in the repo, and can also be specified
by the user
## Tests
Unit and black-box tests
## Changes
<!-- Summary of your changes that are easy to understand -->
These are flows that were earlier only being tested in package
`project`. Since package `project` has been deleted in
https://github.com/databricks/bricks/pull/321, we needed to add coverage
as done here
## Tests
<!-- How is this tested? -->
## Changes
<!-- Summary of your changes that are easy to understand -->
1. Add pattern to always ignore .databricks
2. Best effort creation of .gitignore with .databricks if it's needed
## Tests
<!-- How is this tested? -->