17 Commits

Author SHA1 Message Date
Denis Bilenko 0ad790e468
Properly read Git metadata when running inside workspace (#1945)
## Changes

Since there is no .git directory in the Workspace file system, we need to
make an API call to api/2.0/workspace/get-status?return_git_info=true to
fetch the root of the repo, the current branch, the commit, and the origin.

Added a new function FetchRepositoryInfo that either looks up and parses
.git or calls the remote API, depending on the environment.

Refactor Repository/View/FileSet to accept repository root rather than
calculate it. This helps because:
- Repository is currently created in multiple places and finding the
repository root is becoming relatively expensive (API call needed).
- Repository/FileSet/View do not have access to the current Bundle, which is
where WorkspaceClient is stored.
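
The dispatch described above can be sketched roughly as follows. This is an illustration only, not the CLI's implementation: `repoInfo`, `findRepoRoot`, `fetchRepositoryInfo`, the `inWorkspace` flag, and the injected `remote` and `exists` callbacks are all hypothetical names chosen for the sketch.

```go
package main

import (
	"fmt"
	"path"
)

// repoInfo is a hypothetical stand-in for the metadata named in the commit
// message: repository root, current branch, commit, and origin URL.
type repoInfo struct {
	Root, Branch, Commit, OriginURL string
}

// findRepoRoot walks from dir towards the top of the tree and returns the
// first directory for which exists(dir + "/.git") reports true.
// Paths are slash-separated; exists is injected to keep the sketch testable.
func findRepoRoot(dir string, exists func(string) bool) (string, bool) {
	dir = path.Clean(dir)
	for {
		if exists(path.Join(dir, ".git")) {
			return dir, true
		}
		parent := path.Dir(dir)
		if parent == dir {
			return "", false // reached the top without finding .git
		}
		dir = parent
	}
}

// fetchRepositoryInfo picks a strategy based on the environment: a remote
// lookup when no .git is available, a local lookup otherwise.
func fetchRepositoryInfo(
	inWorkspace bool,
	dir string,
	remote func(string) (repoInfo, error),
	exists func(string) bool,
) (repoInfo, error) {
	if inWorkspace {
		return remote(dir)
	}
	root, ok := findRepoRoot(dir, exists)
	if !ok {
		// Assumption of this sketch: "not in a repository" is not an error.
		return repoInfo{}, nil
	}
	return repoInfo{Root: root}, nil
}

func main() {
	exists := func(p string) bool { return p == "/repo/.git" }
	info, err := fetchRepositoryInfo(false, "/repo/src/pkg", nil, exists)
	fmt.Println(info.Root, err)
}
```

Because the root is computed once and passed in, callers such as a file set or view never need to repeat the (possibly remote) lookup.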

## Tests

- Tested manually by running "bundle validate --json" inside web
terminal within Databricks env.
- Added integration tests for the new API.

---------

Co-authored-by: Andrew Nester <andrew.nester@databricks.com>
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-12-05 10:13:13 +00:00
Denis Bilenko e57cbf1273
Remove unused field: Repository.real (#1936) 2024-11-27 12:14:39 +00:00
Stephen Macke 571076d5e1
Support Git worktrees for `sync` (#1831)
## Changes

This change allows the `sync` command to work from [git
worktrees](https://git-scm.com/docs/git-worktree).

## Tests

* Added unit tests for traversal of worktree related files.
* Manually confirmed that synchronization of files from a main checkout,
as well as a worktree, observed the same ignore rules (both locally
defined as well as from `$GIT_DIR/info/exclude`).
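
For context, in a linked worktree `.git` is a regular file of the form `gitdir: <path>` rather than a directory. A minimal sketch of reading that pointer is shown below; `parseGitFile` is a hypothetical helper for illustration and is not taken from this change.

```go
package main

import (
	"fmt"
	"strings"
)

// parseGitFile extracts the target of a ".git" file as found in a linked
// worktree, whose content has the form "gitdir: <path>".
// It returns false if the content does not have that form.
func parseGitFile(content string) (string, bool) {
	line := strings.TrimSpace(content)
	if !strings.HasPrefix(line, "gitdir:") {
		return "", false
	}
	dir := strings.TrimSpace(strings.TrimPrefix(line, "gitdir:"))
	return dir, dir != ""
}

func main() {
	dir, ok := parseGitFile("gitdir: /repo/.git/worktrees/feature\n")
	fmt.Println(dir, ok)
}
```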

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-10-21 18:27:07 +00:00
Pieter Noordhuis eefda8c198
Fix path to repository-wide exclude file (#1837)
## Changes

This file is at `info/exclude`, and not `info/excludes`.

Also see https://git-scm.com/docs/gitignore.

## Tests

Manually confirmed that these ignore patterns are now picked up. I
created a repository with a pattern in this file and ran `sync` to
confirm it ignores files matching the pattern.
2024-10-18 15:46:39 +00:00
shreyas-goenka ac6b80ed88
Remove user credentials specified in the Git origin URL (#1494)
## Changes
We set the origin URL as metadata in any jobs created by DABs. This PR
makes sure user credentials do not leak into the set metadata in the
job.
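
One way to drop credentials from an origin URL is sketched below using the standard `net/url` package. `stripCredentials` is a hypothetical name, and the choice to return unparseable remotes (such as scp-style `git@host:org/repo.git`) unchanged is an assumption of this sketch, not necessarily what the PR does.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripCredentials removes the userinfo component (user and password or
// token) from a URL. Inputs that do not parse as a URL, or that carry no
// userinfo, are returned unchanged.
func stripCredentials(origin string) string {
	u, err := url.Parse(origin)
	if err != nil || u.User == nil {
		return origin
	}
	u.User = nil
	return u.String()
}

func main() {
	fmt.Println(stripCredentials("https://user:token@github.com/org/repo.git"))
}
```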
 
## Tests
Unit test

---------

Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
2024-06-17 09:49:00 +00:00
Pieter Noordhuis c9b4f11947
Update error checks that use the `os` package to use `errors.Is` (#1461)
## Changes

From the [documentation](https://pkg.go.dev/os#IsNotExist) on the
functions in the `os` package:
> This function predates errors.Is. It only supports errors returned by
the os package.
> New code should use errors.Is(err, fs.ErrNotExist).

This issue surfaced while working on using a different `vfs.Path`
implementation that uses errors from the `fs` package. Calls to
`os.IsNotExist` didn't return true for errors that wrap
`fs.ErrNotExist`.
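
The difference can be reproduced with a small example. The helper names below are made up for illustration; the error is wrapped with `fmt.Errorf` and `%w`, which stands in for any non-`os` implementation that wraps `fs.ErrNotExist`.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// wrappedNotExist simulates an error from a filesystem implementation
// outside the os package that wraps fs.ErrNotExist with extra context.
func wrappedNotExist(name string) error {
	return fmt.Errorf("stat %s: %w", name, fs.ErrNotExist)
}

// legacyCheck uses the pre-errors.Is helper from the os package.
func legacyCheck(err error) bool {
	return os.IsNotExist(err)
}

// modernCheck follows the wrap chain, as recommended for new code.
func modernCheck(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := wrappedNotExist("missing.txt")
	fmt.Println(legacyCheck(err)) // false: os.IsNotExist does not unwrap %w chains
	fmt.Println(modernCheck(err)) // true
}
```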

## Tests

n/a
2024-06-03 12:39:36 +00:00
Pieter Noordhuis 424499ec1d
Abstract over filesystem interaction with libs/vfs (#1452)
## Changes

Introduce `libs/vfs` for an implementation of `fs.FS` and friends that
_includes_ the absolute path it is anchored to.

This is needed for:
1. Intercepting file operations to inject custom logic (e.g., logging,
access control).
2. Traversing directories to find specific leaf directories (e.g.,
`.git`).
3. Converting virtual paths to OS-native paths.

Items 2 and 3 are not possible with the standard `fs.FS` interface.
They are needed so that we can provide an instance to the sync package
and still detect the containing `.git` directory and convert paths to
native paths.
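
To illustrate why the anchor matters, here is a rough sketch of an `fs.FS` that remembers its absolute root. The type `anchored` and its methods `Native` and `Parent` are hypothetical and are not claimed to match the actual `libs/vfs` API.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// anchored is a hypothetical fs.FS that remembers the absolute path it is
// rooted at, which a plain fs.FS does not expose.
type anchored struct {
	fs.FS
	root string
}

func newAnchored(root string) anchored {
	return anchored{FS: os.DirFS(root), root: root}
}

// Native converts a slash-separated path relative to the root into an
// OS-native absolute path.
func (a anchored) Native(name string) string {
	return filepath.Join(a.root, filepath.FromSlash(name))
}

// Parent returns the filesystem anchored one directory up, or false when
// the root has no parent.
func (a anchored) Parent() (anchored, bool) {
	dir := filepath.Dir(a.root)
	if dir == a.root {
		return anchored{}, false
	}
	return newAnchored(dir), true
}

// findUp returns the root of the nearest ancestor (including a itself)
// that contains an entry called name, for example ".git".
func findUp(a anchored, name string) (string, bool) {
	for {
		if _, err := fs.Stat(a, name); err == nil {
			return a.root, true
		}
		parent, ok := a.Parent()
		if !ok {
			return "", false
		}
		a = parent
	}
}

func main() {
	a := newAnchored("/repo/sub")
	fmt.Println(a.Native("dir/file.txt"))
	root, ok := findUp(a, ".git")
	fmt.Println(root, ok)
}
```

With only `fs.FS`, neither `Parent` nor `Native` could be written, because the interface exposes `Open` on relative names and nothing about where the tree lives on disk.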

This change focuses on making the following packages use `vfs.Path`:
* libs/fileset
* libs/git
* libs/sync

All entries returned by `fileset.All` are now slash-separated. This has
2 consequences:
* The sync snapshot now always uses slash-separated paths
* We don't need to call `filepath.FromSlash` as much as we did

## Tests

* All unit tests pass
* All integration tests pass
* Manually confirmed that a deployment made on Windows by a previous
version of the CLI can be deployed by a new version of the CLI while
retaining the validity of the local sync snapshot as well as the remote
deployment state.
2024-05-30 07:41:50 +00:00
Pieter Noordhuis 8e58e04e8f
Move folders package into libs (#1184)
## Changes

This is the last top-level package that doesn't need to be top-level.
2024-02-07 16:33:18 +00:00
shreyas-goenka 5a8cd0c5bc
Persist deployment metadata in WSFS (#845)
## Changes

This PR introduces a metadata struct that stores a subset of the bundle
configuration that we want to expose to other Databricks services that
integrate with bundles.

This metadata file is uploaded to a file
`${bundle.workspace.state_path}/metadata.json` in the WSFS destination
of the bundle deployment.

Documentation for emitted metadata fields:
* `version`: Version for the metadata file schema
* `config.bundle.git.branch`: Name of the git branch the bundle was
deployed from.
* `config.bundle.git.origin_url`: URL for git remote "origin"
* `config.bundle.git.bundle_root_path`: Relative path of the bundle root
from the root of the git repository. Is set to "." if they are the same.
* `config.bundle.git.commit`: SHA-1 commit hash of the exact commit this
bundle was deployed from. Note that the deployment might not exactly match
this commit if there are changes that have not been committed to
git at deploy time.
* `file_path`: Path in the workspace where we sync bundle files to.
* `resources.jobs.[job-ref].id`: Id of the job
* `resources.jobs.[job-ref].relative_path`: Relative path of the yaml
config file from the bundle root where this job was defined.

Example metadata object when bundle root and git root are the same:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "master",
        "origin_url": "www.host.com",
        "commit": "7af8e5d3f5dceffff9295d42d21606ccf056dce0",
        "bundle_root_path": "."
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```

Example metadata object when the git root is one level above the bundle root:
```json
{
  "version": 1,
  "config": {
    "bundle": {
      "lock": {},
      "git": {
        "branch": "dev-branch",
        "origin_url": "www.my-repo.com",
        "commit": "3db46ef750998952b00a2b3e7991e31787e4b98b",
        "bundle_root_path": "pipeline-progress"
      }
    },
    "workspace": {
      "file_path": "/Users/shreyas.goenka@databricks.com/.bundle/pipeline-progress/default/files"
    },
    "resources": {
      "jobs": {
        "bar": {
          "id": "245921165354846",
          "relative_path": "databricks.yml"
        }
      }
    },
    "sync": {}
  }
}
```
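
The `bundle_root_path` values in the two examples are consistent with a plain relative-path computation. The sketch below assumes that approach; `bundleRootPath` is a hypothetical helper, and the directory names are made up.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// bundleRootPath returns the bundle root relative to the git root, using
// forward slashes. It yields "." when both directories are the same.
func bundleRootPath(gitRoot, bundleRoot string) (string, error) {
	rel, err := filepath.Rel(gitRoot, bundleRoot)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	same, _ := bundleRootPath("/repo", "/repo")
	nested, _ := bundleRootPath("/repo", "/repo/pipeline-progress")
	fmt.Println(same, nested)
}
```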


This unblocks integration with the jobs break glass UI for bundles.

## Tests
Unit tests and integration tests.
2023-10-27 12:55:43 +00:00
Pieter Noordhuis c25bc041b1
Never ignore root directory when enumerating files in a repository (#683)
## Changes

The pattern `.*` in a `.gitignore` file can match `.` when walking all
files in a repository. If it does, then the walker immediately aborts
and no files are returned. The root directory (an unnamed directory)
must never be ignored.
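
The failure mode can be reproduced in miniature. The sketch below uses `path.Match` on base names as a simplified stand-in for real gitignore matching, and an in-memory file system; `listFiles`, `sampleFS`, and the `guardRoot` switch are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// sampleFS returns a small in-memory tree used by the example.
func sampleFS() fs.FS {
	return fstest.MapFS{
		".hidden":   {Data: []byte("x")},
		"a.txt":     {Data: []byte("x")},
		"dir/b.txt": {Data: []byte("x")},
	}
}

// listFiles walks fsys and returns every file whose base name does not
// match pattern. With guardRoot set, the root directory "." is exempt
// from matching; without it, a pattern like ".*" matches "." and the
// walk is skipped entirely.
func listFiles(fsys fs.FS, pattern string, guardRoot bool) []string {
	var files []string
	_ = fs.WalkDir(fsys, ".", func(name string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		ignored, _ := path.Match(pattern, path.Base(name))
		if guardRoot && name == "." {
			ignored = false // the root directory must never be ignored
		}
		if ignored {
			if d.IsDir() {
				return fs.SkipDir
			}
			return nil
		}
		if !d.IsDir() {
			files = append(files, name)
		}
		return nil
	})
	return files
}

func main() {
	fmt.Println(listFiles(sampleFS(), ".*", false)) // []
	fmt.Println(listFiles(sampleFS(), ".*", true))  // [a.txt dir/b.txt]
}
```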

Reported in https://github.com/databricks/databricks-vscode/issues/837.

## Tests

New tests pass.
2023-08-21 07:35:02 +00:00
shreyas-goenka d6f626912f
Fix bundle git branch validation (#645)
## Changes
This PR:
1. Fixes the computation logic for `ActualBranch`. An error in the
earlier logic caused the validation mutator to be a no-op.
2. Makes the `.git` string a global var. This is useful to configure in
tests.
3. Adds e2e test for the validation mutator.

## Tests
Unit test
2023-08-07 17:29:02 +00:00
Pieter Noordhuis 98ebb78c9b
Rename bricks -> databricks (#389)
## Changes

Rename all instances of "bricks" to "databricks".

## Tests

* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.
2023-05-16 18:35:39 +02:00
shreyas-goenka 9e16140b6e
Add git config block to bundle config (#356)
## Changes
This config block contains `commit`, `branch` and `remote_url`. These are
loaded automatically when they are available from the repository, and can
also be specified by the user.

## Tests
Unit and black-box tests
2023-04-26 16:54:36 +02:00
shreyas-goenka 902813a490
Hardcode `.databricks` ignore pattern to ensure we never sync the cache directory (#295)
## Changes
1. Add pattern to always ignore .databricks
2. Best effort creation of .gitignore with .databricks if it's needed
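
The second item could look roughly like the sketch below, which works on the contents of a `.gitignore` rather than on the file itself. `ensureIgnored` is a hypothetical helper, and comparing whole trimmed lines is a simplification of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureIgnored returns gitignore with pattern appended on its own line,
// unless a line equal to pattern is already present.
func ensureIgnored(gitignore, pattern string) string {
	for _, line := range strings.Split(gitignore, "\n") {
		if strings.TrimSpace(line) == pattern {
			return gitignore // already ignored; leave the content untouched
		}
	}
	if gitignore != "" && !strings.HasSuffix(gitignore, "\n") {
		gitignore += "\n"
	}
	return gitignore + pattern + "\n"
}

func main() {
	fmt.Print(ensureIgnored("node_modules", ".databricks"))
}
```

Keeping the hardcoded ignore pattern in addition to this step means the cache directory stays out of `sync` even when writing `.gitignore` fails.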

## Tests
2023-04-04 15:44:57 +02:00
Pieter Noordhuis 8af934bbbb
Function to find the Git repository containing a bundle (#289)
## Changes

Useful functions from #277.

## Tests

Tests pass.
2023-03-29 16:36:35 +02:00
Pieter Noordhuis abb1de99ba
Locate and use global excludes file (#191)
This implements rudimentary gitconfig loading as specified at
https://git-scm.com/docs/git-config.
2023-02-02 12:25:53 +01:00
Pieter Noordhuis 241562e2b1
Move git package to libs/git (#189)
Fixes #185.
2023-01-31 19:19:16 +01:00