4.8.13 Train a regression model

Train a classifier from multiple images to perform regression.

Detailed description

This application trains a classifier from multiple input images or from a CSV file in order to perform regression. Predictors are composed of the pixel values in each band, optionally centered and reduced using an XML statistics file produced by the ComputeImagesStatistics application.
The output value for each predictor is assumed to be the last band (or the last column for CSV files). Training and validation predictor lists are built so that their sizes remain below the maximum bounds given by the user, and their proportion corresponds to the balance parameter. Several classifier parameters can be set depending on the chosen classifier. In the validation process, the mean square error is computed.
This application is based on LibSVM and on OpenCV Machine Learning classifiers, and is compatible with OpenCV 2.3.1 and later.
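
As a rough illustration of two computations mentioned above, the sketch below centers and reduces predictor values with per-band means and standard deviations, then computes a mean square error. It is not taken from the OTB sources: the function names center_and_reduce and mean_square_error, and all numeric values, are hypothetical and chosen only for the example.

```python
from typing import List, Sequence


def center_and_reduce(sample: Sequence[float],
                      means: Sequence[float],
                      stddevs: Sequence[float]) -> List[float]:
    """Subtract the per-band mean and divide by the per-band standard deviation."""
    return [(v - m) / s for v, m, s in zip(sample, means, stddevs)]


def mean_square_error(predicted: Sequence[float],
                      expected: Sequence[float]) -> float:
    """Average of the squared differences between predictions and reference values."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(predicted)


# Hypothetical two-band pixel and statistics (illustrative values only).
normalized = center_and_reduce([120.0, 80.0], means=[100.0, 90.0], stddevs=[10.0, 5.0])
mse = mean_square_error([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
```

In the application itself, the statistics would come from the XML file given to io.imstat rather than from hard-coded lists.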

Parameters

This section describes in detail the parameters available for this application. Table 4.138, page 762 presents a summary of these parameters and the parameter keys to be used on the command line and in programming languages. The application key is TrainRegression.





Parameter key                Parameter type              Parameter description
---------------------------  --------------------------  ----------------------------------------
io                           Group                       Input and output data
io.il                        Input image list            Input Image List
io.csv                       Input File name             Input CSV file
io.imstat                    Input File name             Input XML image statistics file
io.out                       Output File name            Output regression model
io.mse                       Float                       Mean Square Error
sample                       Group                       Training and validation samples parameters
sample.mt                    Int                         Maximum training predictors
sample.mv                    Int                         Maximum validation predictors
sample.vtr                   Float                       Training and validation sample ratio
classifier                   Choices                     Classifier to use for the training
classifier libsvm            Choice                      LibSVM classifier
classifier dt                Choice                      Decision Tree classifier
classifier gbt               Choice                      Gradient Boosted Tree classifier
classifier ann               Choice                      Artificial Neural Network classifier
classifier rf                Choice                      Random forests classifier
classifier knn               Choice                      KNN classifier
classifier.libsvm.k          Choices                     SVM Kernel Type
classifier.libsvm.k linear   Choice                      Linear
classifier.libsvm.k rbf      Choice                      Gaussian radial basis function
classifier.libsvm.k poly     Choice                      Polynomial
classifier.libsvm.k sigmoid  Choice                      Sigmoid
classifier.libsvm.m          Choices                     SVM Model Type
classifier.libsvm.m epssvr   Choice                      Epsilon Support Vector Regression
classifier.libsvm.m nusvr    Choice                      Nu Support Vector Regression
classifier.libsvm.c          Float                       Cost parameter C
classifier.libsvm.opt        Boolean                     Parameters optimization
classifier.libsvm.prob       Boolean                     Probability estimation
classifier.libsvm.eps        Float                       Epsilon
classifier.libsvm.nu         Float                       Nu
classifier.dt.max            Int                         Maximum depth of the tree
classifier.dt.min            Int                         Minimum number of samples in each node
classifier.dt.ra             Float                       Termination criteria for regression tree
classifier.dt.cat            Int                         Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.dt.f              Int                         K-fold cross-validations
classifier.dt.r              Boolean                     Set Use1seRule flag to false
classifier.dt.t              Boolean                     Set TruncatePrunedTree flag to false
classifier.gbt.t             Choices                     Loss Function Type
classifier.gbt.t sqr         Choice                      Squared Loss
classifier.gbt.t abs         Choice                      Absolute Loss
classifier.gbt.t hub         Choice                      Huber Loss
classifier.gbt.w             Int                         Number of boosting algorithm iterations
classifier.gbt.s             Float                       Regularization parameter
classifier.gbt.p             Float                       Portion of the whole training set used for each algorithm iteration
classifier.gbt.max           Int                         Maximum depth of the tree
classifier.ann.t             Choices                     Train Method Type
classifier.ann.t reg         Choice                      RPROP algorithm
classifier.ann.t back        Choice                      Back-propagation algorithm
classifier.ann.sizes         String list                 Number of neurons in each intermediate layer
classifier.ann.f             Choices                     Neuron activation function type
classifier.ann.f ident       Choice                      Identity function
classifier.ann.f sig         Choice                      Symmetrical Sigmoid function
classifier.ann.f gau         Choice                      Gaussian function (Not completely supported)
classifier.ann.a             Float                       Alpha parameter of the activation function
classifier.ann.b             Float                       Beta parameter of the activation function
classifier.ann.bpdw          Float                       Strength of the weight gradient term in the BACKPROP method
classifier.ann.bpms          Float                       Strength of the momentum term (the difference between weights on the 2 previous iterations)
classifier.ann.rdw           Float                       Initial value Delta_0 of update-values Delta_ij in RPROP method
classifier.ann.rdwm          Float                       Update-values lower limit Delta_min in RPROP method
classifier.ann.term          Choices                     Termination criteria
classifier.ann.term iter     Choice                      Maximum number of iterations
classifier.ann.term eps      Choice                      Epsilon
classifier.ann.term all      Choice                      Max. iterations + Epsilon
classifier.ann.eps           Float                       Epsilon value used in the Termination criteria
classifier.ann.iter          Int                         Maximum number of iterations used in the Termination criteria
classifier.rf.max            Int                         Maximum depth of the tree
classifier.rf.min            Int                         Minimum number of samples in each node
classifier.rf.ra             Float                       Termination Criteria for regression tree
classifier.rf.cat            Int                         Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.rf.var            Int                         Size of the randomly selected subset of features at each tree node
classifier.rf.nbtrees        Int                         Maximum number of trees in the forest
classifier.rf.acc            Float                       Sufficient accuracy (OOB error)
classifier.knn.k             Int                         Number of Neighbors
classifier.knn.rule          Choices                     Decision rule
classifier.knn.rule mean     Choice                      Mean of neighbors values
classifier.knn.rule median   Choice                      Median of neighbors values
rand                         Int                         set user defined seed
inxml                        XML input parameters file   Load otb application from xml file
outxml                       XML output parameters file  Save otb application to xml file

Table 4.138: Parameters table for Train a regression model.
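
To make the classifier.knn.k and classifier.knn.rule entries of the table more concrete, here is a minimal sketch of nearest-neighbor regression with a mean or median decision rule. It is a generic illustration, not the OpenCV or OTB implementation; the function name knn_regress and the sample data are hypothetical, and Euclidean distance is an assumption.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def knn_regress(query: Sequence[float],
                samples: List[Tuple[Sequence[float], float]],
                k: int,
                rule: str = "mean") -> float:
    """Predict a value from the k nearest samples using the mean or median rule."""
    if not 1 <= k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    # Sort training samples by Euclidean distance to the query predictor.
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    values = [value for _, value in nearest]
    if rule == "mean":
        return statistics.fmean(values)
    if rule == "median":
        return statistics.median(values)
    raise ValueError("rule must be 'mean' or 'median'")


# Hypothetical one-band training samples: (predictor, output value).
training = [([0.0], 1.0), ([1.0], 2.0), ([2.0], 9.0), ([10.0], 100.0)]
by_mean = knn_regress([1.0], training, k=3, rule="mean")
by_median = knn_regress([1.0], training, k=3, rule="median")
```

The median rule is less sensitive to a single outlying neighbor value than the mean rule, which is the usual reason for choosing it.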

Input and output data: This group of parameters allows setting input and output data.

Training and validation samples parameters: This group of parameters allows you to set the parameters of the training and validation sample lists.
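
One way to picture how the two maximum bounds and the ratio interact is sketched below. This is an assumption-laden illustration, not the logic used by the application: it assumes the ratio is the fraction of samples reserved for validation, and the function name split_sizes and its arguments are hypothetical.

```python
from typing import Tuple


def split_sizes(available: int, max_train: int, max_valid: int,
                valid_ratio: float) -> Tuple[int, int]:
    """Return (training, validation) counts that respect both bounds and the ratio.

    Assumption: valid_ratio is the fraction of samples reserved for validation.
    """
    if available < 0 or max_train < 0 or max_valid < 0:
        raise ValueError("counts must be non-negative")
    if not 0.0 <= valid_ratio <= 1.0:
        raise ValueError("valid_ratio must be in [0, 1]")
    # Largest total that keeps each part under its bound at the requested ratio.
    total = float(available)
    if valid_ratio > 0.0:
        total = min(total, max_valid / valid_ratio)
    if valid_ratio < 1.0:
        total = min(total, max_train / (1.0 - valid_ratio))
    n_valid = min(max_valid, int(total * valid_ratio))
    n_train = min(max_train, int(total) - n_valid)
    return n_train, n_valid
```

For example, under these assumptions split_sizes(5000, 1000, 1000, 0.5) keeps 1000 samples in each list, while split_sizes(300, 1000, 1000, 0.5) keeps 150 in each because fewer samples are available than the bounds allow.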

Classifier to use for the training: Choice of the classifier to use for the training. Available choices are libsvm, dt, gbt, ann, rf and knn; their parameters are listed in Table 4.138.

Set user defined seed: Set a specific seed with an integer value.

Load otb application from xml file: Load otb application from xml file.

Save otb application to xml file: Save otb application to xml file.

Example

To run this example in command-line, use the following:

otbcli_TrainRegression -io.il training_dataset.tif -io.out regression_model.txt -io.imstat training_statistics.xml -classifier libsvm
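
When the command line is assembled by a script rather than typed by hand, a small helper can build it from the parameter keys of Table 4.138. The sketch below is one possible approach; the helper name build_command is hypothetical, and it only builds the argument list without running anything.

```python
from typing import Dict, List, Sequence, Union

ParamValue = Union[str, int, float, Sequence[str]]


def build_command(app_key: str, params: Dict[str, ParamValue]) -> List[str]:
    """Build an otbcli argument list from a mapping of parameter keys to values."""
    command = ["otbcli_" + app_key]
    for key, value in params.items():
        command.append("-" + key)
        if isinstance(value, (list, tuple)):
            # List parameters such as io.il take several values after one key.
            command.extend(str(v) for v in value)
        else:
            command.append(str(value))
    return command


command = build_command("TrainRegression", {
    "io.il": ["training_dataset.tif"],
    "io.out": "regression_model.txt",
    "io.imstat": "training_statistics.xml",
    "classifier": "libsvm",
})
# With OTB installed, the list could be passed to subprocess.run(command, check=True).
```

Joining the resulting list with spaces gives the same command line as the example above.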

To run this example from Python, use the following code snippet:

#!/usr/bin/python 
 
# Import the otb applications package 
import otbApplication 
 
# The following line creates an instance of the TrainRegression application 
TrainRegression = otbApplication.Registry.CreateApplication("TrainRegression") 
 
# The following lines set all the application parameters: 
TrainRegression.SetParameterStringList("io.il", ['training_dataset.tif']) 
 
TrainRegression.SetParameterString("io.out", "regression_model.txt") 
 
TrainRegression.SetParameterString("io.imstat", "training_statistics.xml") 
 
TrainRegression.SetParameterString("classifier","libsvm") 
 
# The following line executes the application
TrainRegression.ExecuteAndWriteOutput()
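
When many parameters are set, the repeated setter calls can be driven from a dictionary. The following sketch picks a setter according to the Python type of each value. The helper names configure and train_random_forest are hypothetical, the file names are the placeholders used above, and the choice of 100 trees is illustrative only; boolean parameters are deliberately left out of this sketch.

```python
from typing import Any, Dict


def configure(app: Any, params: Dict[str, Any]) -> None:
    """Apply parameters to an OTB application object, choosing the setter by value type."""
    for key, value in params.items():
        if isinstance(value, bool):
            raise TypeError("boolean parameters are not handled in this sketch")
        if isinstance(value, (list, tuple)):
            app.SetParameterStringList(key, [str(v) for v in value])
        elif isinstance(value, int):
            app.SetParameterInt(key, value)
        elif isinstance(value, float):
            app.SetParameterFloat(key, value)
        else:
            app.SetParameterString(key, str(value))


def train_random_forest() -> None:
    """Hypothetical usage; requires an OTB installation and existing input files."""
    import otbApplication  # imported here so configure() can be used without OTB

    app = otbApplication.Registry.CreateApplication("TrainRegression")
    configure(app, {
        "io.il": ["training_dataset.tif"],
        "io.imstat": "training_statistics.xml",
        "io.out": "regression_model.txt",
        "classifier": "rf",
        "classifier.rf.nbtrees": 100,  # illustrative value, not a recommendation
    })
    app.ExecuteAndWriteOutput()
```

Keeping the parameters in a dictionary also makes it easy to log or store the exact configuration used for a training run.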

Limitations

None

Authors

This application was written by the OTB-Team.

See also

These additional resources can be useful for further information: