## Changes
The workspace extensions filer should not read or stat a notebook called
`foo` if the user calls `.Stat(ctx, "foo")`.
Instead, the filer should return a file not found error. This is because
the contract for the workspace extensions filer is to only work for
notebooks when the file path / name includes the extension (example:
`foo.ipynb` or `foo.sql` instead of just `foo`)
## Tests
Integration tests.
## Changes
### Background
The workspace import APIs recently added support for importing Jupyter
notebooks written in R, Scala, or SQL, that is non-Python notebooks.
This now works for the `/import-file` API which we leverage in the CLI.
Note: We do not need any changes in `databricks sync`. It works out of
the box because any state mapping of local names to remote names that we
store is only scoped to the notebook extension (i.e., `.ipynb` in this
case) and is agnostic of the notebook's specific language.
### Problem this PR addresses
The extension-aware filer previously did not function because it checks
that a `.ipynb` notebook is written in Python. This PR relaxes that
constraint and adds integration tests for both the normal workspace
filer and extensions aware filer writing and reading non-Python `.ipynb`
notebooks.
This implies that after this PR DABs in the workspace / CLI from DBR
will work for non-Python notebooks as well. non-Python notebooks for
DABs deployment from local machines already works after the platform
side changes to the API landed, this PR just adds integration tests for
that bit of functionality.
Note: Any platform side changes we needed for the import API have
already been rolled out to production.
### Before
DABs deploy would work fine for non-Python notebooks. But DABs
deployments from DBR would not.
### After
DABs deploys both from local machines and DBR will work fine.
## Testing
For creating the `.ipynb` notebook fixtures used in the integration
tests I created them directly from the VSCode UI. This ensures high
fidelity with how users will create their non-Python notebooks locally.
For Python notebooks this is supported out of the box by VSCode but for
R and Scala notebooks this requires installing the Jupyter kernel for R
and Scala on my local machine and using that from VSCode.
For SQL, I ended up directly modifying the `language_info` field in the
Jupyter metadata to create the test fixture.
### Discussion: Issues with configuring language at the cell level
The language metadata for a Jupyter notebook is standardized at the
notebook level (in the `language_info` field). Unfortunately, it's not
standardized at the cell level. Thus, for example, if a user changes the
language for their cell in VSCode (which is supported by the standard
Jupyter VSCode integration), it'll cause a runtime error when the user
actually attempts to run the notebook. This is because the cell-level
metadata is encoded in a format specific to VSCode:
```
cells: []{
"vscode": {
"languageId": "sql"
}
}
```
Supporting cell level languages is thus out of scope for this PR and can
be revisited along with the workspace files team if there's strong
customer interest.
## Changes
`TestAccFilerWorkspaceFilesExtensionsErrorsOnDupName` recently started
failing in our nightlies because the upstream `import` API was changed
to [prohibit conflicting file
paths](https://docs.databricks.com/en/release-notes/product/2024/august.html#files-can-no-longer-have-identical-names-in-workspace-folders).
Because existing conflicting file paths can still be grandfathered in,
we need to retain coverage for the test. To do this, this PR:
1. Removes the failing
`TestAccFilerWorkspaceFilesExtensionsErrorsOnDupName`
2. Add an equivalent unit test with the `list` and `get-status` API
calls mocked.
## Changes
This worked fine if the notebooks are located in the filer's root and
didn't if they are nested in a directory.
This change adds test coverage and fixes the underlying issue.
## Tests
Ran integration test manually.
## Changes
This PR adds a filer that'll allow us to read notebooks from the WSFS
using their full paths (with the extension included). The filer relies
on the existing workspace filer (and consequently the workspace
import/export/list APIs).
Using this filer along with a virtual filesystem layer
(https://github.com/databricks/cli/pull/1452/files) will allow us to use
our custom implementation (which preserves the notebook extensions)
rather than the default mount available via DBR when the CLI is run from
DBR.
## Tests
Integration tests.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Changes
```
shreyas.goenka@THW32HFW6T cli % databricks fs -h
Commands to do file system operations on DBFS and UC Volumes.
Usage:
databricks fs [command]
Available Commands:
cat Show file content.
cp Copy files and directories.
ls Lists files.
mkdir Make directories.
rm Remove files and directories.
```
This PR adds support for UC Volumes to the fs commands. The fs commands
for UC volumes work the same as they currently do for DBFS. This is
ensured by running the same test matrix we across both DBFS and UC
Volumes versions of the fs commands.
## Tests
Support for UC volumes is tested by running the same tests as we did
originally for DBFS commands. The tests require a `main` catalog to
exist in the workspace, which does in our test workspaces environments
which have the `TEST_METASTORE_ID` environment variable set.
For the Files API filer, we do the same by running mostly common tests
to ensure the filers for "local", "wsfs", "dbfs" and "files API" are
consistent.
The tests are also made to all run in parallel to reduce the time taken.
To ensure the separation of the tests, each test creates its own UC
schema (for UC volumes tests) or DBFS directories (for DBFS tests).
## Changes
These tests allow us to get information for execution context
(PYTHONPATH, CWD) for various Python tasks and different cluster setups.
Note: this test won't be executed automatically as part of nightly
builds since it requires RUN_PYTHON_TASKS_TEST env to be executed.
## Tests
Integration test run successfully.
---------
Co-authored-by: Pieter Noordhuis <pieter.noordhuis@databricks.com>
## Tests
New integration test for the read/write parts of the other filers. The
integration test cannot be shared just yet because the Files API doesn't
include support for creating/listing/removing directories yet.
## Changes
Local file reads on Windows require the file handle to be closed after
using it. This commit includes an interface change to return an
`io.ReadCloser` from `Read` to accommodate this.
## Tests
The existing integration tests for the filer interface all pass.
## Changes
This PR:
1. Adds the export-dir command
2. Changes filer.Read to return an error if a user tries to read a
directory
3. Adds returning internal file structures from filer.Stat().Sys()
## Tests
Integration tests and manually
## Changes
This captures the recursive deletion of a directory tree in the filer interface.
Prompted by #433.
## Tests
Integration tests pass (ran the filer ones on AWS and Azure).
## Changes
The pattern `errors.Is(err, fs.ErrNotExist)` is common to check for an
error type.
Errors can implement `Is(error) bool` with a custom equivalence checker.
## Tests
New asserts all pass in the integration test.
Adds a DBFS implementation of the `filer.Filer` interface.
The integration tests are reused between the workspace filesystem and
DBFS implementations to ensure identical behavior.
## Changes
Rename all instances of "bricks" to "databricks".
## Tests
* Confirmed the goreleaser build works, uses the correct new binary
name, and produces the right archives.
* Help output is confirmed to be correct.
* Output of `git grep -w bricks` is minimal with a couple changes
remaining for after the repository rename.