This project provides an abstraction over a variety of data stores, exposing read/write operations through a simple and expressive interface. The abstraction works with **NoSQL** and **SQL** data stores and leverages **pandas**.
Once installed, **data-transport** can be used either as a library in code or as a command-line interface (CLI). As a CLI it is used for ETL and requires a configuration file.
The read/write functions make data-transport a great candidate for **data-science**, **data-engineering**, or anything pertaining to data: it enables operations across multiple data stores (relational or not).
The CLI program is called **transport** and it requires a configuration file. The program is intended to move data from one location to another. The supported data stores are listed in the paragraphs above.
The semantics of the attributes are defined by mongodb; please consult the [mongodb documentation](https://mongodb.org/docs). In this example the configuration file is located at _/transport/mongo.json_.
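What such a file could contain is sketched below; the attribute names come from the parameter table further down, and every value is hypothetical.

```
{
    "db": "music",
    "port": 27017,
    "doc": "performance",
    "username": "dbuser",
    "password": "secret",
    "authSource": "admin",
    "mechanism": "SCRAM-SHA-256"
}
```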
Reading from mongodb then looks as follows:

```
_aggregateDF  = mreader.read(mongo=PIPELINE)  #-- results of an aggregate pipeline
_collectionDF = mreader.read()                #-- the entire collection as a pandas data-frame
```
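The snippet above assumes that a reader (`mreader`) and an aggregation pipeline (`PIPELINE`) already exist. A minimal sketch of how they could be set up follows; the factory call and the class name `mongo.MongoReader` are assumptions, while the pipeline uses plain mongodb aggregation syntax.

```
import transport

#-- an aggregation pipeline, written just as it would be for runCommand in mongodb
PIPELINE = [
    {"$group": {"_id": "$artist", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}}
]

#-- hypothetical factory call; the class name and argument names are assumptions
mreader = transport.factory.instance(type='mongo.MongoReader',
                                     args={'db': 'music', 'doc': 'performance'})
```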
In order to enable write, change the **context** attribute to **'write'**.
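Under the same assumptions as above, the write path could look like the sketch below; the class name `mongo.MongoWriter` and the argument names are assumptions, illustrating where **context='write'** would be supplied.

```
import transport

#-- hypothetical writer; note context='write' in the arguments
mwriter = transport.factory.instance(type='mongo.MongoWriter',
                                     args={'db': 'music', 'doc': 'performance', 'context': 'write'})
mwriter.write(_collectionDF)  #-- persists a pandas data-frame to the collection
```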
- The configuration file is in JSON format
- The commands passed to mongodb are the same as those you would pass to **runCommand** in mongodb
- The output is a pandas data-frame
- By default the transport reads, to enable write operations use **context='write'**
|parameter|description|
| --- | --- |
|db|Name of the database|
|port|Port number to connect to|
|doc|Name of the collection of documents|
|username|Username|
|password|Password|
|authSource|User database that holds the authentication information|
|mechanism|Mechanism used for authentication|
**NOTE**
Arguments like **db** or **doc** can be placed in the authentication file.
**Limitations**

Reads and writes aren't encapsulated in the same object. This is deliberate: it forces the calling code to perform each action explicitly, which helps minimize accidents associated with data wrangling.
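In practice this means moving data involves two explicit objects, one for each side of the transfer. Reusing the hypothetical `mreader` and `mwriter` from the sketches above:

```
_df = mreader.read()  #-- deliberately read from the source store ...
mwriter.write(_df)    #-- ... and deliberately write to the target store
```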