SampleAugmentation - Sample Augmentation
Generates synthetic samples from a sample data file.
Detailed description
The application takes a sample data file as generated by the SampleExtraction application and generates synthetic samples to increase the number of available samples.
Parameters
This section describes in detail the parameters available for this application. Table [1] summarizes these parameters and the parameter keys to use on the command line and in programming languages. The application key is SampleAugmentation.
[1] Table: Parameters table for Sample Augmentation.
Parameter Key | Parameter Name | Parameter Type |
---|---|---|
in | Input samples | Input File name |
out | Output samples | Output File name |
field | Field Name | List |
layer | Layer Index | Int |
label | Label of the class to be augmented | Int |
samples | Number of generated samples | Int |
exclude | Field names for excluded features. | List |
strategy | Augmentation strategy | Choices |
strategy replicate | Replicate input samples | Choice |
strategy jitter | Jitter input samples | Choice |
strategy smote | Smote input samples | Choice |
strategy.jitter.stdfactor | Factor for dividing the standard deviation of each feature | Float |
strategy.smote.neighbors | Number of nearest neighbors. | Int |
seed | set user defined seed | Int |
inxml | Load otb application from xml file | XML input parameters file |
outxml | Save otb application to xml file | XML output parameters file |
Input samples: Vector data file containing samples (OGR format).
Output samples: Output vector data file storing the new samples (OGR format).
Field Name: Name of the field carrying the class name in the input vectors.
Layer Index: Layer index to read in the input vector file.
Label of the class to be augmented: Label of the class of the input file for which new samples will be generated.
Number of generated samples: Number of synthetic samples that will be generated.
Field names for excluded features.: List of field names in the input vector data that will not be generated in the output file.
Augmentation strategy: Available choices are:
- Replicate input samples: The new samples are generated by replicating input samples which are randomly selected with replacement.
- Jitter input samples: The new samples are generated by adding Gaussian noise to input samples which are randomly selected with replacement.
  - Factor for dividing the standard deviation of each feature: The noise added to the input samples will have the standard deviation of the input features divided by the value of this parameter.
- Smote input samples: The new samples are generated by using the SMOTE algorithm (http://dx.doi.org/10.1613/jair.953) on input samples which are randomly selected with replacement.
  - Number of nearest neighbors.: Number of nearest neighbors to be used in the SMOTE algorithm.
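The three strategies described above can be illustrated with a minimal pure-Python sketch. This is not the application's implementation: the function names (replicate, jitter, smote), the use of the population standard deviation, and the Euclidean distance for the neighbor search are all assumptions made for illustration. Each sample is a plain list of feature values, and a seeded random.Random instance stands in for the seed parameter.

```python
import math
import random
import statistics


def replicate(samples, count, rng):
    """Draw `count` samples with replacement and copy them unchanged."""
    return [list(rng.choice(samples)) for _ in range(count)]


def jitter(samples, count, std_factor, rng):
    """Draw samples with replacement and add Gaussian noise to each feature.

    The noise standard deviation is the standard deviation of the feature
    divided by `std_factor`.
    """
    n_features = len(samples[0])
    # Assumption: population standard deviation; the application may differ.
    stds = [statistics.pstdev([s[j] for s in samples])
            for j in range(n_features)]
    out = []
    for _ in range(count):
        base = rng.choice(samples)
        out.append([base[j] + rng.gauss(0.0, stds[j] / std_factor)
                    for j in range(n_features)])
    return out


def smote(samples, count, neighbors, rng):
    """Interpolate between a random sample and one of its nearest neighbors.

    Requires at least two input samples.
    """
    out = []
    for _ in range(count):
        base = rng.choice(samples)
        # Rank all other samples by Euclidean distance to the base sample.
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: math.dist(s, base))
        neighbor = rng.choice(others[:neighbors])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        out.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return out
```

Passing the same seed, for example random.Random(42), to two calls yields identical synthetic samples, which is the kind of reproducibility a fixed seed is meant to provide. A larger std_factor keeps jittered samples closer to the originals, while SMOTE samples always lie on a segment joining two existing samples.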
set user defined seed: Set a specific seed with an integer value.
Load otb application from xml file: Load otb application from xml file.
Save otb application to xml file: Save otb application to xml file.
Example
To run this example in command-line, use the following:
otbcli_SampleAugmentation -in samples.sqlite -field class -label 3 -samples 100 -out augmented_samples.sqlite -exclude OGC_FID name class originfid -strategy smote -strategy.smote.neighbors 5
To run this example from Python, use the following code snippet:
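#!/usr/bin/python
# Import the otb applications package
import otbApplication
# The following line creates an instance of the SampleAugmentation application
SampleAugmentation = otbApplication.Registry.CreateApplication("SampleAugmentation")
# The following lines set all the application parameters,
# mirroring the command-line example above:
SampleAugmentation.SetParameterString("in", "samples.sqlite")
SampleAugmentation.SetParameterStringList("field", ["class"])
SampleAugmentation.SetParameterInt("label", 3)
SampleAugmentation.SetParameterInt("samples", 100)
SampleAugmentation.SetParameterString("out", "augmented_samples.sqlite")
SampleAugmentation.SetParameterStringList("exclude", ["OGC_FID", "name", "class", "originfid"])
SampleAugmentation.SetParameterString("strategy", "smote")
SampleAugmentation.SetParameterInt("strategy.smote.neighbors", 5)
# The following line executes the application
SampleAugmentation.ExecuteAndWriteOutput()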
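When several classes or strategies need to be augmented, it can help to assemble the command line programmatically. The sketch below builds the argument list for otbcli_SampleAugmentation using only the parameter keys from the table above. The helper name build_augmentation_command and its signature are hypothetical and not part of OTB, and the stdfactor value in the usage lines is an arbitrary illustration.

```python
def build_augmentation_command(in_file, out_file, field, label, samples,
                               strategy="replicate", exclude=(), seed=None,
                               **strategy_options):
    """Return the otbcli_SampleAugmentation call as a list of arguments.

    Hypothetical helper: `strategy_options` are mapped to keys of the form
    strategy.<strategy>.<option>, e.g. neighbors=5 for the smote strategy.
    """
    args = [
        "otbcli_SampleAugmentation",
        "-in", in_file,
        "-field", field,
        "-label", str(label),
        "-samples", str(samples),
        "-out", out_file,
    ]
    if exclude:
        # List parameters take their values as space-separated arguments.
        args += ["-exclude", *exclude]
    args += ["-strategy", strategy]
    for key, value in strategy_options.items():
        args += [f"-strategy.{strategy}.{key}", str(value)]
    if seed is not None:
        args += ["-seed", str(seed)]
    return args


# Same settings as the command-line example above.
smote_args = build_augmentation_command(
    "samples.sqlite", "augmented_samples.sqlite", "class", 3, 100,
    strategy="smote",
    exclude=["OGC_FID", "name", "class", "originfid"],
    neighbors=5,
)

# Jitter variant with an illustrative stdfactor and a fixed seed.
jitter_args = build_augmentation_command(
    "samples.sqlite", "augmented_samples.sqlite", "class", 3, 100,
    strategy="jitter", seed=42, stdfactor=10,
)
```

The resulting list can be passed to subprocess.run(smote_args, check=True) on a machine where the OTB command-line tools are installed and on the PATH.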
Limitations
None
Authors
This application was written by the OTB-Team.