# Introduction

This project provides an abstraction over a variety of data stores, exposing read and write operations through a simple and expressive interface. It works with **NoSQL**, **SQL** and **Cloud** data stores and leverages **pandas**.

# Why Use Data-Transport?

Data-transport mostly serves data scientists who do not care about the underlying database and want a simple, consistent way to read, write and move data. It also provides a lightweight Extract, Transform, Load (ETL) API and a command-line (CLI) tool. Finally, pre/post-processing pipeline functions can be attached to read and write operations.

1. Familiarity with **pandas data-frames**
2. Connectivity **drivers** are included
3. Reading/Writing data from various sources
4. Useful for data migrations or **ETL**
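
The idea of attaching pre/post-processing functions to reads and writes can be sketched with pandas and the standard-library `sqlite3` module alone. This is a minimal illustration, not the data-transport API: the helper names (`apply_pipeline`, `read`, `write`, `drop_missing`, `add_total`) and the `sales` table are assumptions made up for this example.

```python
import sqlite3
from typing import Callable, Iterable

import pandas as pd

# NOTE: every name below is hypothetical and only illustrates the concept;
# none of it is part of the data-transport package.
Step = Callable[[pd.DataFrame], pd.DataFrame]


def apply_pipeline(frame: pd.DataFrame, steps: Iterable[Step]) -> pd.DataFrame:
    """Run each step in order, feeding its output to the next one."""
    for step in steps:
        frame = step(frame)
    return frame


def read(conn: sqlite3.Connection, sql: str, post: Iterable[Step] = ()) -> pd.DataFrame:
    """Read a query result into a data-frame, then post-process it."""
    return apply_pipeline(pd.read_sql_query(sql, conn), post)


def write(conn: sqlite3.Connection, table: str, frame: pd.DataFrame,
          pre: Iterable[Step] = ()) -> None:
    """Pre-process a data-frame, then write it to a table."""
    apply_pipeline(frame, pre).to_sql(table, conn, if_exists="replace", index=False)


def drop_missing(frame: pd.DataFrame) -> pd.DataFrame:
    """Remove rows that contain missing values."""
    return frame.dropna()


def add_total(frame: pd.DataFrame) -> pd.DataFrame:
    """Add a derived column computed from two existing columns."""
    return frame.assign(total=frame["price"] * frame["qty"])


def demo() -> pd.DataFrame:
    """Move data between two in-memory databases, transforming it on the way."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (item TEXT, price REAL, qty INTEGER)")
    source.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("a", 2.0, 3), ("b", None, 1), ("c", 1.5, 4)],
    )
    source.commit()

    frame = read(source, "SELECT * FROM sales", post=[drop_missing])
    write(target, "sales_clean", frame, pre=[add_total])
    return pd.read_sql_query("SELECT * FROM sales_clean", target)


if __name__ == "__main__":
    print(demo())
```

Keeping each step a plain function from data-frame to data-frame means the same step can be reused on either the read or the write side.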


## Installation

Within a virtual environment, run the following:

    pip install git+https://github.com/lnyemba/data-transport.git


## What's new

Unlike versions 2.0 and older, this version focuses on collaborative environments such as jupyter-x servers and Apache Zeppelin:

1. Simpler syntax to create a reader or writer
2. An auth-file registry that can be referenced using a label
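
To illustrate what a label-based registry can look like, here is a minimal sketch that uses only the Python standard library. It does not reproduce data-transport's registry: the JSON layout, the file name `registry.json`, the labels and the functions `load_registry` and `resolve` are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path

# NOTE: hypothetical layout and helpers, not the data-transport registry format.


def load_registry(path: Path) -> dict:
    """Load a JSON file that maps labels to connection parameters."""
    return json.loads(Path(path).read_text())


def resolve(registry: dict, label: str) -> dict:
    """Return a copy of the connection parameters stored under a label."""
    if label not in registry:
        raise KeyError(f"no registry entry labelled {label!r}")
    return dict(registry[label])


def demo() -> dict:
    """Write a small registry to a temporary file and look up one label."""
    entries = {
        "warehouse": {"provider": "postgresql", "database": "analytics"},
        "scratch": {"provider": "sqlite", "database": "scratch.db"},
    }
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "registry.json"
        path.write_text(json.dumps(entries))
        registry = load_registry(path)
    return resolve(registry, "warehouse")


if __name__ == "__main__":
    print(demo())
```

With this kind of layout, notebooks shared on a collaborative server can refer to a short label instead of embedding connection details.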


## Learn More

Notebooks with sample code are available for reading from and writing to MongoDB, CouchDB, Netezza, PostgreSQL, Google BigQuery, Databricks, Microsoft SQL Server, MySQL ... Visit the [data-transport homepage](https://healthcareio.the-phi.com/data-transport).