4.8.14 Train Vector Classifier

Train a classifier based on labeled geometries and a list of features to consider.

Detailed description

This application trains a classifier based on labeled geometries and a list of features to consider for classification.

Parameters

This section describes in detail the parameters available for this application. Table 4.140, page 773 presents a summary of these parameters and the parameter keys to be used in command-line and programming languages. The application key is TrainVectorClassifier.

Parameter key                 Parameter type              Parameter description
----------------------------  --------------------------  ---------------------------------------------
io                            Group                       Input and output data
io.vd                         Input vector data           Input Vector Data
io.stats                      Input File name             Input XML image statistics file
io.confmatout                 Output File name            Output confusion matrix
io.out                        Output File name            Output model
feat                          List                        Field names for training features.
cfield                        String                      Field containing the class id for supervision
layer                         Int                         Layer Index
valid                         Group                       Validation data
valid.vd                      Input vector data           Validation Vector Data
valid.layer                   Int                         Layer Index
classifier                    Choices                     Classifier to use for the training
classifier libsvm             Choice                      LibSVM classifier
classifier boost              Choice                      Boost classifier
classifier dt                 Choice                      Decision Tree classifier
classifier gbt                Choice                      Gradient Boosted Tree classifier
classifier ann                Choice                      Artificial Neural Network classifier
classifier bayes              Choice                      Normal Bayes classifier
classifier rf                 Choice                      Random forests classifier
classifier knn                Choice                      KNN classifier
classifier.libsvm.k           Choices                     SVM Kernel Type
classifier.libsvm.k linear    Choice                      Linear
classifier.libsvm.k rbf       Choice                      Gaussian radial basis function
classifier.libsvm.k poly      Choice                      Polynomial
classifier.libsvm.k sigmoid   Choice                      Sigmoid
classifier.libsvm.m           Choices                     SVM Model Type
classifier.libsvm.m csvc      Choice                      C support vector classification
classifier.libsvm.m nusvc     Choice                      Nu support vector classification
classifier.libsvm.m oneclass  Choice                      Distribution estimation (One Class SVM)
classifier.libsvm.c           Float                       Cost parameter C
classifier.libsvm.opt         Boolean                     Parameters optimization
classifier.libsvm.prob        Boolean                     Probability estimation
classifier.boost.t            Choices                     Boost Type
classifier.boost.t discrete   Choice                      Discrete AdaBoost
classifier.boost.t real       Choice                      Real AdaBoost (technique using confidence-rated predictions and working well with categorical data)
classifier.boost.t logit      Choice                      LogitBoost (technique producing good regression fits)
classifier.boost.t gentle     Choice                      Gentle AdaBoost (technique setting less weight on outlier data points and, for that reason, being often good with regression data)
classifier.boost.w            Int                         Weak count
classifier.boost.r            Float                       Weight Trim Rate
classifier.boost.m            Int                         Maximum depth of the tree
classifier.dt.max             Int                         Maximum depth of the tree
classifier.dt.min             Int                         Minimum number of samples in each node
classifier.dt.ra              Float                       Termination criteria for regression tree
classifier.dt.cat             Int                         Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.dt.f               Int                         K-fold cross-validations
classifier.dt.r               Boolean                     Set Use1seRule flag to false
classifier.dt.t               Boolean                     Set TruncatePrunedTree flag to false
classifier.gbt.w              Int                         Number of boosting algorithm iterations
classifier.gbt.s              Float                       Regularization parameter
classifier.gbt.p              Float                       Portion of the whole training set used for each algorithm iteration
classifier.gbt.max            Int                         Maximum depth of the tree
classifier.ann.t              Choices                     Train Method Type
classifier.ann.t reg          Choice                      RPROP algorithm
classifier.ann.t back         Choice                      Back-propagation algorithm
classifier.ann.sizes          String list                 Number of neurons in each intermediate layer
classifier.ann.f              Choices                     Neuron activation function type
classifier.ann.f ident        Choice                      Identity function
classifier.ann.f sig          Choice                      Symmetrical Sigmoid function
classifier.ann.f gau          Choice                      Gaussian function (Not completely supported)
classifier.ann.a              Float                       Alpha parameter of the activation function
classifier.ann.b              Float                       Beta parameter of the activation function
classifier.ann.bpdw           Float                       Strength of the weight gradient term in the BACKPROP method
classifier.ann.bpms           Float                       Strength of the momentum term (the difference between weights on the 2 previous iterations)
classifier.ann.rdw            Float                       Initial value Delta_0 of update-values Delta_ij in RPROP method
classifier.ann.rdwm           Float                       Update-values lower limit Delta_min in RPROP method
classifier.ann.term           Choices                     Termination criteria
classifier.ann.term iter      Choice                      Maximum number of iterations
classifier.ann.term eps       Choice                      Epsilon
classifier.ann.term all       Choice                      Max. iterations + Epsilon
classifier.ann.eps            Float                       Epsilon value used in the Termination criteria
classifier.ann.iter           Int                         Maximum number of iterations used in the Termination criteria
classifier.rf.max             Int                         Maximum depth of the tree
classifier.rf.min             Int                         Minimum number of samples in each node
classifier.rf.ra              Float                       Termination Criteria for regression tree
classifier.rf.cat             Int                         Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split
classifier.rf.var             Int                         Size of the randomly selected subset of features at each tree node
classifier.rf.nbtrees         Int                         Maximum number of trees in the forest
classifier.rf.acc             Float                       Sufficient accuracy (OOB error)
classifier.knn.k              Int                         Number of Neighbors
rand                          Int                         set user defined seed
inxml                         XML input parameters file   Load otb application from xml file
outxml                        XML output parameters file  Save otb application to xml file

Table 4.140: Parameters table for Train Vector Classifier.

Input and output data: This group of parameters allows setting input and output data.

Field names for training features: List of field names in the input vector data to be used as features for training.

Field containing the class id for supervision: Only geometries with this field available will be taken into account.

Layer Index: Index of the layer to use in the input vector file.

Validation data: This group of parameters defines validation data.

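Classifier to use for the training: Choice of the classifier to use for the training. Available choices are: LibSVM classifier (libsvm), Boost classifier (boost), Decision Tree classifier (dt), Gradient Boosted Tree classifier (gbt), Artificial Neural Network classifier (ann), Normal Bayes classifier (bayes), Random forests classifier (rf), and KNN classifier (knn).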
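In the parameters table, the options specific to a classifier share the key prefix classifier.<choice>. (for example classifier.rf.max for the rf choice). The sketch below is a hypothetical consistency check, not part of OTB: the function name check_classifier_params and the idea of flagging mismatched keys are assumptions made for illustration. It only reports keys that belong to a classifier other than the selected one.

```python
# Classifier choices as listed in the parameters table.
CLASSIFIERS = ("libsvm", "boost", "dt", "gbt", "ann", "bayes", "rf", "knn")


def check_classifier_params(classifier, params):
    """Return the keys in `params` that belong to another classifier.

    Hypothetical helper: it relies only on the naming convention
    "classifier.<choice>.<option>" visible in the parameters table.
    """
    if classifier not in CLASSIFIERS:
        raise ValueError("unknown classifier choice: " + classifier)
    prefix = "classifier." + classifier + "."
    return [
        key
        for key in params
        if key.startswith("classifier.") and not key.startswith(prefix)
    ]


# Illustrative values only: the knn option does not match the rf choice.
mismatched = check_classifier_params(
    "rf",
    {"io.vd": "vectorData.shp", "classifier.rf.max": 5, "classifier.knn.k": 3},
)
print(mismatched)
```

Running such a check before building the command line or the Python calls can catch a typo in the classifier choice or a leftover option from a previous experiment.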

Set user defined seed: Set a specific seed with an integer value.

Load otb application from xml file: Load the OTB application from an XML file.

Save otb application to xml file: Save the OTB application to an XML file.

Example

To run this example from the command line, use the following:

otbcli_TrainVectorClassifier -io.vd vectorData.shp -io.stats meanVar.xml -io.out svmModel.svm -feat perimeter area width -cfield predicted
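The same kind of command can be assembled programmatically, which may help when switching classifiers or adding validation data. The sketch below is a minimal, hypothetical helper (build_cli_args is not part of OTB) that turns a dictionary of parameter keys from Table 4.140 into an argument list. The file names rfModel.rf, confusion.csv and validation.shp, as well as the numeric values, are placeholders chosen for illustration.

```python
def build_cli_args(params, app="otbcli_TrainVectorClassifier"):
    """Turn a {parameter key: value} mapping into a command-line argument list.

    Hypothetical helper, not part of OTB. List values (such as "feat") are
    expanded into several arguments after the key; any other value is
    converted with str().
    """
    args = [app]
    for key, value in params.items():
        args.append("-" + key)
        if isinstance(value, (list, tuple)):
            args.extend(str(item) for item in value)
        else:
            args.append(str(value))
    return args


# Random forests variant of the example above, with validation data.
params = {
    "io.vd": "vectorData.shp",
    "io.stats": "meanVar.xml",
    "io.out": "rfModel.rf",              # placeholder file name
    "io.confmatout": "confusion.csv",    # placeholder file name
    "valid.vd": "validation.shp",        # placeholder file name
    "feat": ["perimeter", "area", "width"],
    "cfield": "predicted",
    "classifier": "rf",
    "classifier.rf.nbtrees": 100,        # illustrative value
    "rand": 42,                          # illustrative seed
}

print(" ".join(build_cli_args(params)))
```

The resulting list can be passed to subprocess.run on a machine where the OTB command-line launchers are available on the PATH.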

To run this example from Python, use the following code snippet:

#!/usr/bin/python

# Import the otb applications package
import otbApplication

# The following line creates an instance of the TrainVectorClassifier application
TrainVectorClassifier = otbApplication.Registry.CreateApplication("TrainVectorClassifier")

# The following lines set all the application parameters:
TrainVectorClassifier.SetParameterString("io.vd", "vectorData.shp")

TrainVectorClassifier.SetParameterString("io.stats", "meanVar.xml")

TrainVectorClassifier.SetParameterString("io.out", "svmModel.svm")

# Same feature fields and class field as in the command-line example
TrainVectorClassifier.SetParameterStringList("feat", ["perimeter", "area", "width"])

TrainVectorClassifier.SetParameterString("cfield", "predicted")

# The following line executes the application
TrainVectorClassifier.ExecuteAndWriteOutput()
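When many parameters are set, the individual setter calls can be driven from a dictionary. The following sketch assumes the application object exposes the SetParameterString, SetParameterInt, SetParameterFloat and SetParameterStringList setters of the otbApplication wrapper; the helper names apply_parameters and train_knn_example, the output file name knnModel.txt and the value of classifier.knn.k are hypothetical and only serve the illustration.

```python
def apply_parameters(app, params):
    """Set each parameter on `app` with the setter matching the value type.

    Hypothetical helper. Booleans are rejected because this sketch does not
    cover how Boolean parameters should be set.
    """
    for key, value in params.items():
        if isinstance(value, bool):
            raise TypeError("Boolean parameter not handled in this sketch: " + key)
        if isinstance(value, int):
            app.SetParameterInt(key, value)
        elif isinstance(value, float):
            app.SetParameterFloat(key, value)
        elif isinstance(value, (list, tuple)):
            app.SetParameterStringList(key, [str(item) for item in value])
        else:
            app.SetParameterString(key, str(value))


def train_knn_example():
    """Train a KNN model; requires an OTB installation with Python bindings."""
    import otbApplication

    app = otbApplication.Registry.CreateApplication("TrainVectorClassifier")
    apply_parameters(app, {
        "io.vd": "vectorData.shp",
        "io.stats": "meanVar.xml",
        "io.out": "knnModel.txt",        # placeholder file name
        "feat": ["perimeter", "area", "width"],
        "cfield": "predicted",
        "classifier": "knn",
        "classifier.knn.k": 5,           # illustrative value
    })
    app.ExecuteAndWriteOutput()
```

The import of otbApplication is kept inside train_knn_example so that apply_parameters can be exercised with a stand-in object on a machine without OTB.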

Authors

This application was written by the OTB Team.