25 Commits

Author SHA1 Message Date
shreyas-goenka 0cc35ca056
Assert tokens are redacted in origin URL when username is not specified (#1785)
TSIA
2024-09-23 12:42:30 +00:00
Pieter Noordhuis 7de7583b37
Make fileset take optional list of paths to list (#1684)
## Changes

Before this change, the fileset library would take a single root path
and list all files in it. To support an allowlist of paths to list (much
like a Git `pathspec` without patterns; see [pathspec][pathspec]), this
change introduces an optional argument to `fileset.New` where the caller
can specify paths to list. If not specified, this argument defaults to
listing `.` (i.e. all files in the root).

The motivation for this change is that we wish to expose this pattern in
bundles. Users should be able to specify which paths to synchronize
instead of always only synchronizing the bundle root directory.
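
As a rough illustration of the behavior described above (not the actual `fileset` API), the sketch below uses a hypothetical `listFiles` helper that takes an optional list of paths and falls back to `.` when none is given. The in-memory tree from `exampleFS` is also made up for the demonstration.

```go
package main

import (
	"fmt"
	"io/fs"
	"sort"
	"testing/fstest"
)

// listFiles is a hypothetical stand-in for the behavior described above:
// it lists regular files under each of the given paths, relative to the
// root of fsys. With no paths, it lists everything under ".".
func listFiles(fsys fs.FS, paths ...string) ([]string, error) {
	if len(paths) == 0 {
		paths = []string{"."}
	}
	seen := map[string]bool{}
	var out []string
	for _, p := range paths {
		err := fs.WalkDir(fsys, p, func(name string, d fs.DirEntry, err error) error {
			if err != nil {
				return err
			}
			// Deduplicate in case the given paths overlap.
			if !d.IsDir() && !seen[name] {
				seen[name] = true
				out = append(out, name)
			}
			return nil
		})
		if err != nil {
			return nil, err
		}
	}
	sort.Strings(out)
	return out, nil
}

// exampleFS builds a small in-memory tree for the demonstration.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"databricks.yml": {Data: []byte("bundle: {}")},
		"src/main.py":    {Data: []byte("print(1)")},
		"docs/readme.md": {Data: []byte("# docs")},
	}
}

func main() {
	all, _ := listFiles(exampleFS())         // no paths: defaults to "."
	only, _ := listFiles(exampleFS(), "src") // allowlist of one path
	fmt.Println(all)
	fmt.Println(only)
}
```

All returned names stay relative to the root, regardless of which paths were requested.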

[pathspec]:
https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpathspecapathspec

## Tests

New and existing unit tests.
2024-08-19 15:15:14 +00:00
shreyas-goenka ac6b80ed88
Remove user credentials specified in the Git origin URL (#1494)
## Changes
We set the origin URL as metadata in any jobs created by DABs. This PR
makes sure user credentials do not leak into the set metadata in the
job.
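
A minimal sketch of the idea, assuming the origin URL is an HTTP(S) URL that `net/url` can parse. The helper name `stripCredentials` is hypothetical and is not taken from the CLI. It also covers the case where only a token (no username and password pair) is present.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripCredentials is a hypothetical helper: it removes the userinfo part
// (username, password, or token) from an origin URL.
func stripCredentials(origin string) (string, error) {
	u, err := url.Parse(origin)
	if err != nil {
		return "", err
	}
	u.User = nil // drop everything before the "@"
	return u.String(), nil
}

func main() {
	for _, origin := range []string{
		"https://user:token@example.com/org/repo.git",
		"https://token@example.com/org/repo.git",
		"https://example.com/org/repo.git",
	} {
		clean, err := stripCredentials(origin)
		fmt.Println(clean, err)
	}
}
```

scp-style remotes such as `git@example.com:org/repo.git` are not handled by this sketch and would need separate treatment.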
 
## Tests
Unit test

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-06-17 09:49:00 +00:00
Pieter Noordhuis c9b4f11947
Update error checks that use the `os` package to use `errors.Is` (#1461)
## Changes

From the [documentation](https://pkg.go.dev/os#IsNotExist) on the
functions in the `os` package:
> This function predates errors.Is. It only supports errors returned by
the os package.
> New code should use errors.Is(err, fs.ErrNotExist).

This issue surfaced while working on using a different `vfs.Path`
implementation that uses errors from the `fs` package. Calls to
`os.IsNotExist` didn't return true for errors that wrap
`fs.ErrNotExist`.
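
The difference can be reproduced in a few lines. In this sketch, `wrappedNotExist` is a made-up function that mimics a non-`os` filesystem returning an error that wraps `fs.ErrNotExist`; `legacyCheck` and `modernCheck` are illustrative names for the two styles of check.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// wrappedNotExist mimics an error from a filesystem implementation outside
// the os package that wraps fs.ErrNotExist.
func wrappedNotExist() error {
	return fmt.Errorf("open virtual/file.txt: %w", fs.ErrNotExist)
}

// legacyCheck uses the function that predates errors.Is.
func legacyCheck(err error) bool { return os.IsNotExist(err) }

// modernCheck follows the recommendation from the os package docs.
func modernCheck(err error) bool { return errors.Is(err, fs.ErrNotExist) }

func main() {
	err := wrappedNotExist()
	fmt.Println(legacyCheck(err)) // false: the wrapper is not unwrapped
	fmt.Println(modernCheck(err)) // true: errors.Is follows the %w chain
}
```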

## Tests

n/a
2024-06-03 12:39:36 +00:00
Pieter Noordhuis 424499ec1d
Abstract over filesystem interaction with libs/vfs (#1452)
## Changes

Introduce `libs/vfs` for an implementation of `fs.FS` and friends that
_includes_ the absolute path it is anchored to.

This is needed for:
1. Intercepting file operations to inject custom logic (e.g., logging,
access control).
2. Traversing directories to find specific leaf directories (e.g.,
`.git`).
3. Converting virtual paths to OS-native paths.

Items 2 and 3 are not possible with the standard `fs.FS` interface.
They are needed so that we can provide an instance to the sync package
and still detect the containing `.git` directory and convert paths to
native paths.
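
To make the idea concrete, here is one possible shape for such a type. This is only a sketch under assumptions: the interface name `Path`, its methods `Native` and `Parent`, and the constructor `newPath` are illustrative and may differ from what `libs/vfs` actually defines.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// Path is a hypothetical interface: an fs.FS that also knows the absolute
// path it is anchored to.
type Path interface {
	fs.FS
	Native() string // OS-native absolute path of the root
	Parent() Path   // nil once the filesystem root is reached
}

type osPath struct {
	fs.FS
	root string
}

// newPath anchors an fs.FS at the absolute form of root.
func newPath(root string) (Path, error) {
	abs, err := filepath.Abs(root)
	if err != nil {
		return nil, err
	}
	return &osPath{FS: os.DirFS(abs), root: abs}, nil
}

func (p *osPath) Native() string { return p.root }

func (p *osPath) Parent() Path {
	dir := filepath.Dir(p.root)
	if dir == p.root {
		return nil // already at the filesystem root
	}
	return &osPath{FS: os.DirFS(dir), root: dir}
}

func main() {
	p, err := newPath(".")
	if err != nil {
		fmt.Println(err)
		return
	}
	entries, _ := fs.ReadDir(p, ".") // usable wherever an fs.FS is expected
	fmt.Println(p.Native(), len(entries))
	for q := p.Parent(); q != nil; q = q.Parent() {
		fmt.Println(q.Native()) // traversal that plain fs.FS cannot express
	}
}
```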

This change focuses on making the following packages use `vfs.Path`:
* libs/fileset
* libs/git
* libs/sync

All entries returned by `fileset.All` are now slash-separated. This has
2 consequences:
* The sync snapshot now always uses slash-separated paths
* We don't need to call `filepath.FromSlash` as much as we did

## Tests

* All unit tests pass
* All integration tests pass
* Manually confirmed that a deployment made on Windows by a previous
version of the CLI can be deployed by a new version of the CLI while
retaining the validity of the local sync snapshot as well as the remote
deployment state.
2024-05-30 07:41:50 +00:00
Pieter Noordhuis 8e58e04e8f
Move folders package into libs (#1184)
## Changes

This is the last top-level package that doesn't need to be top-level.
2024-02-07 16:33:18 +00:00
Andrew Nester b5f34a1181
Removed unused `ToHttpsUrl` method and corresponding library (#1017)
## Changes
Removed unused ToHttpsUrl method and corresponding library
2023-11-28 16:08:27 +00:00
shreyas-goenka d70d7445c4
Remove resolution of repo names against the Databricks Github account (#940)
## Changes
This functionality is not exercised (and will not be anytime soon).
Instead, we use a map of first-party aliases for supported templates.


1e46b9f88a/cmd/bundle/init.go (L21)

## Tests
Existing tests and manually, bundle init still works.
2023-11-01 13:02:06 +00:00
shreyas-goenka 5a8cd0c5bc
Persist deployment metadata in WSFS (#845)
## Changes

This PR introduces a metadata struct that stores a subset of bundle
configuration that we wish to expose to other Databricks services that
wish to integrate with bundles.

This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.

Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note that the deployment might not exactly
match this commit if there are changes that have not been committed to
git at deploy time.
* `file_path`: Path in workspace where we sync bundle files to. 
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata when the git root is one level above the bundle root:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```
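
A consumer could decode these documents with Go types along the following lines. The type and field names are an illustrative guess that fits the JSON above; they are not the CLI's actual definitions. Fields that the sketch does not model (`lock`, `sync`) are ignored by `encoding/json` by default.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Git mirrors the config.bundle.git object (illustrative, not the CLI's type).
type Git struct {
	Branch         string `json:"branch,omitempty"`
	OriginURL      string `json:"origin_url,omitempty"`
	Commit         string `json:"commit,omitempty"`
	BundleRootPath string `json:"bundle_root_path,omitempty"`
}

// Job mirrors one entry under config.resources.jobs.
type Job struct {
	ID           string `json:"id,omitempty"`
	RelativePath string `json:"relative_path,omitempty"`
}

// Metadata mirrors the top-level document.
type Metadata struct {
	Version int `json:"version"`
	Config  struct {
		Bundle struct {
			Git Git `json:"git"`
		} `json:"bundle"`
		Workspace struct {
			FilePath string `json:"file_path"`
		} `json:"workspace"`
		Resources struct {
			Jobs map[string]Job `json:"jobs"`
		} `json:"resources"`
	} `json:"config"`
}

// sample is a trimmed copy of the second example above.
const sample = `{
  "version": 1,
  "config": {
    "bundle": {"lock": {}, "git": {"branch": "dev-branch", "bundle_root_path": "pipeline-progress"}},
    "resources": {"jobs": {"bar": {"id": "245921165354846", "relative_path": "databricks.yml"}}},
    "sync": {}
  }
}`

func parseMetadata(data []byte) (Metadata, error) {
	var m Metadata
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	m, err := parseMetadata([]byte(sample))
	fmt.Println(m.Version, m.Config.Bundle.Git.BundleRootPath, m.Config.Resources.Jobs["bar"].ID, err)
}
```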


This unblocks integration with the jobs break-glass UI for bundles.

## Tests
Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
Serge Smertin 7171874db0
Added `process.Background()` and `process.Forwarded()` (#804)
## Changes
This PR adds higher-level wrappers for calling subprocesses. It is one of
the steps needed to land https://github.com/databricks/cli/pull/637, as
previously discussed.

The reason to add `process.Forwarded()` is to proxy Python's `input()`
calls from a child process seamlessly. Another use-case is plugging in
`less` as a pager for the list results.

## Tests
`make test`
2023-09-27 09:04:44 +00:00
shreyas-goenka 2c58deb2c5
Fall back to full Git clone if shallow clone is not supported (#775)
## Changes
Git repos served over the dumb HTTP transport do not support shallow
cloning. This PR adds retry logic for when we detect that shallow cloning
is not supported.

Note: I saw the match string `dumb http transport does not support
shallow capabilities` being reported for different hosts on the
internet, so this should work across a large class of Git servers.
However, it's not strictly necessary to have the `--depth` flag, so we
can remove it if this issue is reported again.
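
The retry logic can be sketched as follows. To keep the example runnable without network access, the actual `git clone` invocation is abstracted behind a hypothetical `cloneFunc`, and `fakeClone` simulates a server that rejects shallow clones. Only the match string comes from the commit message; every other name is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Error text quoted in the commit message above.
const dumbHTTPMsg = "dumb http transport does not support shallow capabilities"

// cloneFunc is a hypothetical abstraction over running `git clone`;
// shallow selects whether a `--depth` flag would be passed.
type cloneFunc func(url, dir string, shallow bool) error

// cloneWithFallback tries a shallow clone first and retries with a full
// clone only when the error indicates that shallow cloning is unsupported.
func cloneWithFallback(clone cloneFunc, url, dir string) error {
	err := clone(url, dir, true)
	if err != nil && strings.Contains(err.Error(), dumbHTTPMsg) {
		return clone(url, dir, false)
	}
	return err
}

// fakeClone simulates a server that rejects shallow clones and records
// each attempt (true = shallow, false = full).
func fakeClone(attempts *[]bool) cloneFunc {
	return func(url, dir string, shallow bool) error {
		*attempts = append(*attempts, shallow)
		if shallow {
			return errors.New("fatal: " + dumbHTTPMsg)
		}
		return nil
	}
}

func main() {
	var attempts []bool
	err := cloneWithFallback(fakeClone(&attempts), "http://example.com/repo.git", "repo")
	fmt.Println(attempts, err)
}
```

Errors that do not contain the match string are returned as-is, so unrelated failures (for example, authentication problems) do not trigger a second clone.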

## Tests
Tested manually. `bundle init` successfully downloads the private HTTP
repo reported by an internal user.
2023-09-15 09:14:51 +00:00
Pieter Noordhuis c25bc041b1
Never ignore root directory when enumerating files in a repository (#683)
## Changes

The pattern `.*` in a `.gitignore` file can match `.` when walking all
files in a repository. If it does, then the walker immediately aborts
and no files are returned. The root directory (an unnamed directory)
must never be ignored.
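
The failure mode is easy to reproduce with `fs.WalkDir`. The sketch below is a simplification: `ignored` matches only the base name with `path.Match`, whereas real `.gitignore` matching has many more rules, and the `guardRoot` switch exists purely to show the behavior with and without the fix.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// ignored reports whether the base name of name matches the pattern.
// This is a simplification of .gitignore semantics.
func ignored(pattern, name string) bool {
	ok, _ := path.Match(pattern, path.Base(name))
	return ok
}

// walk lists files not matched by pattern. With guardRoot set, the root
// directory "." is never tested against the pattern.
func walk(fsys fs.FS, pattern string, guardRoot bool) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(name string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if name == "." && guardRoot {
			return nil // the root must never be ignored
		}
		if ignored(pattern, name) {
			if d.IsDir() {
				return fs.SkipDir // skips the whole tree when name is "."
			}
			return nil
		}
		if !d.IsDir() {
			files = append(files, name)
		}
		return nil
	})
	return files, err
}

// exampleTree is a made-up repository layout.
func exampleTree() fs.FS {
	return fstest.MapFS{
		"a.txt":       {Data: []byte("a")},
		".hidden":     {Data: []byte("h")},
		"dir/b.txt":   {Data: []byte("b")},
		".git/config": {Data: []byte("c")},
	}
}

func main() {
	without, _ := walk(exampleTree(), ".*", false)
	with, _ := walk(exampleTree(), ".*", true)
	fmt.Println(len(without), with)
}
```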

Reported in https://github.com/databricks/databricks-vscode/issues/837.

## Tests

New tests pass.
2023-08-21 07:35:02 +00:00
Pieter Noordhuis 2a58253d20
Consolidate functions in libs/git (#652)
## Changes

The functions in `libs/git/git.go` assumed global state (e.g. working
directory) and were no longer used.

This change consolidates the functionality to turn an origin URL into an
HTTPS URL.

Closes #187.

## Tests

Expanded existing unit test.
2023-08-10 09:36:42 +00:00
shreyas-goenka d6f626912f
Fix bundle git branch validation (#645)
## Changes
This PR:
1. Fixes the computation logic for `ActualBranch`. An error in the
earlier logic caused the validation mutator to be a no-op.
2. Makes the `.git` string a global var. This is useful to configure in
tests.
3. Adds e2e test for the validation mutator.

## Tests
Unit test
2023-08-07 17:29:02 +00:00
shreyas-goenka 2f4bf844fc
Fix git clone integration test for non-existing repo (#610)
## Changes
This PR changes the integration test to just check that an error is
returned rather than asserting that specific text is present in the
error. This is required because the error returned can differ based on
whether git ssh keys have been set up.
2023-07-27 13:51:57 +00:00
shreyas-goenka 8fdc0fec81
Add support for cloning repositories (#544)
## Changes
Adds support for cloning public and private GitHub repositories for
Databricks templates.

## Tests
Integration tests
2023-07-25 15:36:20 +02:00
shreyas-goenka f2a2d058d1
Remove \r from new line print statements (#509)
## Changes
Removes the carriage return character from new line prints for JSON
output mode and sync events.

## Tests
Manually
2023-06-22 13:47:52 +02:00
Pieter Noordhuis 8979ed1394
Fix tests for new repository name (#390)
2023-05-16 19:02:07 +02:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
shreyas-goenka 9e16140b6e
Add git config block to bundle config (#356)
## Changes
This config block contains `commit`, `branch`, and `remote_url`, which
are loaded automatically if specified in the repo and can also be
specified by the user.

## Tests
Unit and black-box tests
2023-04-26 16:54:36 +02:00
shreyas-goenka 42cd405eba
Add tests for fileSet adding `databricks` to .gitignore (#325)
## Changes
These flows were previously tested only in package `project`. Since
package `project` was deleted in
https://github.com/databricks/bricks/pull/321, this change adds the
missing coverage.

## Tests
2023-04-12 12:04:10 +02:00
shreyas-goenka 902813a490
Hardcode `.databricks` ignore pattern to ensure we never sync the cache directory (#295)
## Changes
1. Add pattern to always ignore .databricks
2. Best effort creation of .gitignore with .databricks if it's needed
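
The second item could look roughly like the sketch below. The helper name `ensureIgnored` and the exact matching rule (a whole-line comparison) are assumptions for illustration. `demo` runs the helper twice in a temporary directory to show that repeated calls do not duplicate the entry.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// ensureIgnored appends pattern to <dir>/.gitignore unless a line with
// exactly that pattern already exists. The file is created if missing.
func ensureIgnored(dir, pattern string) error {
	file := filepath.Join(dir, ".gitignore")
	data, err := os.ReadFile(file)
	if err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	for _, line := range strings.Split(string(data), "\n") {
		if strings.TrimSpace(line) == pattern {
			return nil // already present
		}
	}
	f, err := os.OpenFile(file, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	prefix := ""
	if len(data) > 0 && !strings.HasSuffix(string(data), "\n") {
		prefix = "\n" // keep the new entry on its own line
	}
	_, err = f.WriteString(prefix + pattern + "\n")
	return err
}

// demo calls ensureIgnored twice in a temporary directory and returns the
// resulting .gitignore contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "gitignore-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for i := 0; i < 2; i++ {
		if err := ensureIgnored(dir, ".databricks"); err != nil {
			return "", err
		}
	}
	data, err := os.ReadFile(filepath.Join(dir, ".gitignore"))
	return string(data), err
}

func main() {
	out, err := demo()
	fmt.Printf("%q %v\n", out, err)
}
```

A caller that treats this as best effort would log the returned error rather than fail on it.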

## Tests
2023-04-04 15:44:57 +02:00
Pieter Noordhuis 8af934bbbb
Function to find the Git repository containing a bundle (#289)
## Changes

Useful functions from #277.
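
One common way to locate the containing repository is to walk up the directory tree until a `.git` entry is found. The sketch below shows that technique; `findRepoRoot` is a hypothetical name and is not necessarily how the function in this commit is written.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findRepoRoot walks up from dir and returns the first directory that
// contains a ".git" entry, or fs.ErrNotExist if none is found.
func findRepoRoot(dir string) (string, error) {
	dir, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	for {
		_, err := os.Stat(filepath.Join(dir, ".git"))
		if err == nil {
			return dir, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
		parent := filepath.Dir(dir)
		if parent == dir {
			return "", fs.ErrNotExist // reached the filesystem root
		}
		dir = parent
	}
}

// demo builds <tmp>/.git and <tmp>/a/b/c, then searches from the nested
// directory. It returns the directory found and the expected root.
func demo() (string, string, error) {
	tmp, err := os.MkdirTemp("", "repo-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(tmp)
	root, err := filepath.Abs(tmp)
	if err != nil {
		return "", "", err
	}
	nested := filepath.Join(root, "a", "b", "c")
	if err := os.MkdirAll(filepath.Join(root, ".git"), 0o755); err != nil {
		return "", "", err
	}
	if err := os.MkdirAll(nested, 0o755); err != nil {
		return "", "", err
	}
	found, err := findRepoRoot(nested)
	return found, root, err
}

func main() {
	found, root, err := demo()
	fmt.Println(found == root, err)
}
```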

## Tests

Tests pass.
2023-03-29 16:36:35 +02:00
Pieter Noordhuis abb1de99ba
Locate and use global excludes file (#191)
This implements rudimentary gitconfig loading as specified at
https://git-scm.com/docs/git-config.
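
For context, Git resolves the global excludes file from `core.excludesFile` and, when that is unset, falls back to `$XDG_CONFIG_HOME/git/ignore`, or `$HOME/.config/git/ignore` if `XDG_CONFIG_HOME` is unset or empty. The sketch below captures only that lookup order; `globalExcludesPath` is a hypothetical helper that takes the already-read values as arguments, handles only the `~/` form of home expansion, and assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// globalExcludesPath resolves the location of the global excludes file.
// configured is the value of core.excludesFile ("" if unset), and
// xdgConfigHome and home are the values of $XDG_CONFIG_HOME and $HOME.
func globalExcludesPath(configured, xdgConfigHome, home string) string {
	if configured != "" {
		// Git expands a leading "~/" in pathname values to $HOME.
		if strings.HasPrefix(configured, "~/") {
			return filepath.Join(home, configured[2:])
		}
		return configured
	}
	if xdgConfigHome == "" {
		xdgConfigHome = filepath.Join(home, ".config")
	}
	return filepath.Join(xdgConfigHome, "git", "ignore")
}

func main() {
	fmt.Println(globalExcludesPath("", "", "/home/me"))
	fmt.Println(globalExcludesPath("", "/xdg", "/home/me"))
	fmt.Println(globalExcludesPath("~/.gitignore_global", "", "/home/me"))
}
```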
2023-02-02 12:25:53 +01:00
Pieter Noordhuis 241562e2b1
Move git package to libs/git (#189)
Fixes #185.
2023-01-31 19:19:16 +01:00