SampleAugmentation¶
Generates synthetic samples from a sample data file.
Description¶
The application takes a sample data file as generated by the SampleExtraction application and generates synthetic samples to increase the number of available samples.
Parameters¶
Input samples -in vectorfile
Mandatory
Vector data file containing samples (OGR format)
Output samples -out filename [dtype]
Mandatory
Output vector data file storing new samples (OGR format).
Field Name -field string
Name of the field carrying the class name in the input vectors.
Layer Index -layer int
Default value: 0
Layer index to read in the input vector file.
Label of the class to be augmented -label int
Default value: 1
Label of the class of the input file for which new samples will be generated.
Number of generated samples -samples int
Default value: 100
Number of synthetic samples that will be generated.
Field names for excluded features -exclude string1 string2...
List of field names in the input vector data that will not be generated in the output file.
Augmentation strategy -strategy [replicate|jitter|smote]
Default value: replicate
Replicate input samples
The new samples are generated by replicating input samples which are randomly selected with replacement.
Jitter input samples
The new samples are generated by adding Gaussian noise to input samples which are randomly selected with replacement.
Smote input samples
The new samples are generated by using the SMOTE algorithm (http://dx.doi.org/10.1613/jair.953) on input samples which are randomly selected with replacement.
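All three strategies start by drawing input samples at random with replacement. The simplest one, replicate, can be illustrated with a minimal sketch. This is not the application's actual implementation: the function name `replicate` and the list-of-lists data layout are assumptions made for illustration only.

```python
import random

def replicate(samples, n_new, seed=None):
    """Return n_new copies of input rows drawn uniformly with replacement.

    Hypothetical helper: samples is a list of feature vectors that all
    belong to the class being augmented.
    """
    rng = random.Random(seed)
    # Each draw is independent, so the same input row may be copied several times.
    return [list(rng.choice(samples)) for _ in range(n_new)]

class_samples = [[0.1, 0.2], [0.4, 0.3], [0.9, 0.8]]
new_samples = replicate(class_samples, n_new=5, seed=0)
```

Because the draw is with replacement, the number of generated samples can exceed the number of input samples, as it does here.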
Jitter input samples options¶
Factor for dividing the standard deviation of each feature -strategy.jitter.stdfactor float
Default value: 10
The noise added to the input samples will have the standard deviation of the input features divided by the value of this parameter.
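The effect of the division factor can be sketched as follows. This is an illustrative sketch rather than the application's code: the function name `jitter` is hypothetical, and the use of the population standard deviation, computed over the input samples of the class, is an assumption.

```python
import random
import statistics

def jitter(samples, n_new, std_factor=10.0, seed=None):
    """Generate n_new noisy copies of randomly selected input rows.

    Hypothetical helper: the noise for each feature is Gaussian with
    standard deviation (feature standard deviation) / std_factor.
    """
    rng = random.Random(seed)
    # Per-feature standard deviation over the input samples
    # (population std is an assumption of this sketch).
    columns = list(zip(*samples))
    sigmas = [statistics.pstdev(col) / std_factor for col in columns]
    generated = []
    for _ in range(n_new):
        base = rng.choice(samples)  # random selection with replacement
        generated.append([x + rng.gauss(0.0, s) for x, s in zip(base, sigmas)])
    return generated

class_samples = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
noisy_samples = jitter(class_samples, n_new=4, std_factor=10.0, seed=0)
```

A larger factor gives a smaller noise amplitude, so the generated samples stay closer to the inputs. In this sketch a feature that is constant over the inputs, such as the second one above, receives no noise at all.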
Smote input samples options¶
Number of nearest neighbors -strategy.smote.neighbors int
Default value: 5
Number of nearest neighbors to be used in the SMOTE algorithm.
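The role of the neighbor count can be illustrated with a minimal sketch of the general SMOTE idea: each synthetic sample is interpolated between a randomly selected input sample and one of its nearest neighbors. The function name `smote`, the Euclidean distance, and the uniform interpolation weight are assumptions of this sketch; the application's implementation may differ in its details.

```python
import math
import random

def smote(samples, n_new, k=5, seed=None):
    """Generate n_new synthetic rows by interpolating toward near neighbors.

    Hypothetical helper: samples is a list of feature vectors of one class.
    """
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two input samples")
    rng = random.Random(seed)
    generated = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))  # random selection with replacement
        base = samples[i]
        others = [s for j, s in enumerate(samples) if j != i]
        # Keep the k closest samples (fewer if the class has fewer than k + 1 rows).
        neighbors = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation weight in [0, 1)
        generated.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return generated

class_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                 [1.0, 1.0], [0.5, 0.5], [2.0, 2.0]]
synthetic_samples = smote(class_samples, n_new=10, k=3, seed=0)
```

A small neighbor count keeps the synthetic samples close to local clusters of the class, while a large one allows interpolation between more distant samples.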
Random seed -seed int
Sets the random seed to a specific integer value.
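All strategies select input samples at random, so fixing the seed is what makes a run repeatable. The sketch below illustrates the principle with Python's own generator. The function name `draw_indices` is hypothetical, and the sketch says nothing about which random generator the application uses internally.

```python
import random

def draw_indices(n_inputs, n_new, seed=None):
    """Draw n_new input-sample indices with replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randrange(n_inputs) for _ in range(n_new)]

# The same seed yields the same selection on every call.
first_run = draw_indices(10, 5, seed=42)
second_run = draw_indices(10, 5, seed=42)
same_selection = first_run == second_run
```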
Examples¶
From the command-line:
otbcli_SampleAugmentation -in samples.sqlite -field class -label 3 -samples 100 -out augmented_samples.sqlite -exclude OGC_FID name class originfid -strategy smote -strategy.smote.neighbors 5
From Python:
import otbApplication
app = otbApplication.Registry.CreateApplication("SampleAugmentation")
app.SetParameterString("in", "samples.sqlite")
app.SetParameterString("field", "class")
app.SetParameterInt("label", 3)
app.SetParameterInt("samples", 100)
app.SetParameterString("out", "augmented_samples.sqlite")
app.SetParameterStringList("exclude", ['OGC_FID', 'name', 'class', 'originfid'])
app.SetParameterString("strategy", "smote")
app.SetParameterInt("strategy.smote.neighbors", 5)
app.ExecuteAndWriteOutput()