documentation

2019-12-31 23:34:04 -06:00 · 2019-12-31 23:34:04 -06:00 · df47ed4cb2
parent 685c567661
commit df47ed4cb2
1 changed files with 20 additions and 16 deletions
--- a/README.md
+++ b/README.md
@@ -12,32 +12,33 @@ This package is designed to generate synthetic data from a dataset from an origi
 ## Usage
 After installing, the easiest way to get started is with pandas. The process is as follows:
-1. Train the GAN on the original/raw dataset
+
 **Train the GAN on the original/raw dataset**
-import pandas as pd
+    import pandas as pd
-import data.maker
+    import data.maker
-df      = pd.read_csv('sample.csv')
+    df      = pd.read_csv('sample.csv')
-column  = 'gender'
+    column  = 'gender'
-id      = 'id' 
+    id      = 'id' 
-context = 'demo'
+    context = 'demo'
-data.maker.train(context=context,data=df,column=column,id=id,logs='logs')
+    data.maker.train(context=context,data=df,column=column,id=id,logs='logs')
 The trainer will store the data on disk (for now) in a structured folder that will hold training models that will be used to generate the synthetic data.
-2. Generate a candidate dataset from the learnt features
+**Generate a candidate dataset from the learned features**
-import pandas as pd
+        import pandas as pd
-import data.maker
+        import data.maker
-df  = pd.read_csv('sample.csv')
+        df  = pd.read_csv('sample.csv')
-id  = 'id'
+        id  = 'id'
-column = 'gender'
+        column = 'gender'
-context = 'demo'
+        context = 'demo'
-data.maker.generate(data=df,id=id,column=column,logs='logs')
+        data.maker.generate(data=df,id=id,column=column,logs='logs')
 ## Limitations
@@ -46,11 +47,14 @@ GANS will generate data assuming the original data has all the value space neede
 - No new data will be created
        Assuming we have a dataset with a gender attribute with values [M,F].
        The synthetic data will not be able to generate genders outside [M,F]
 - Not advised on continuous values
        GANS work well on discrete values and thus are not advised for continuous ones.
        e.g. measurements (height, blood pressure, ...)
 - For now will only perform on a single feature.
 ## Credits :
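
Note on the Limitations hunk: since the README states that no values outside the original value space will be generated and that continuous columns are not advised, it may help to inspect a column before training. The following is a minimal sketch, not part of `data.maker`; the helper name `summarize_value_space` and the `max_distinct` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def summarize_value_space(values, max_distinct=20):
    """Count distinct non-missing values and flag columns that look continuous.

    `max_distinct` is an arbitrary illustrative threshold (an assumption),
    not a value taken from data.maker.
    """
    # Ignore missing entries represented as None.
    counts = Counter(v for v in values if v is not None)
    return {
        # The value space: the README says synthetic data stays inside it.
        "values": sorted(counts, key=str),
        "counts": dict(counts),
        # Heuristic only: few distinct values suggests a discrete column.
        "looks_discrete": len(counts) <= max_distinct,
    }


# Example with a column like the README's 'gender' attribute.
gender = ["M", "F", "F", "M", None, "F"]
summary = summarize_value_space(gender)
print(summary["values"])
print(summary["looks_discrete"])
```

With pandas, the same check could presumably be run as `summarize_value_space(df[column].dropna())` before calling `data.maker.train`. A column such as height or blood pressure would typically exceed the threshold and be flagged as not discrete.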