logisticfit

Logistic Regression classifier.

Syntax

parameters = logisticfit(X,y)

parameters = logisticfit(X,y,options)

Inputs

X
Training data.
Type: double
Dimension: vector | matrix
y
Target values.
Type: double
Dimension: vector | matrix
options
Type: struct
penalty
Norm used in the penalization: 'l1' or 'l2' (default). The 'newton-cg', 'sag' and 'lbfgs' solvers support only the 'l2' penalty.
Type: char
Dimension: string
dual
Dual or primal formulation. The dual formulation is implemented only for the 'l2' penalty with the 'liblinear' solver. Prefer dual = false (default) when n_samples > n_features.
Type: logical
Dimension: Boolean
tol
Tolerance for stopping criteria (default: 1e-4).
Type: double
Dimension: scalar
C
Inverse of regularization strength. It must be a positive float (default: 1). Like in Support Vector Machines, smaller values specify stronger regularization.
Type: double
Dimension: scalar
random_state
Seed of the pseudo-random number generator used to shuffle the data when solver = 'sag' or 'liblinear'.
Type: integer
Dimension: scalar
solver
Algorithm to use in the optimization problem. Allowed solvers: 'newton-cg', 'lbfgs' (default), 'liblinear', 'sag', 'saga'.
Type: char
Dimension: string
max_iter
Maximum number of iterations allowed for the solver to converge (default: 100). Used only by the 'newton-cg', 'sag' and 'lbfgs' solvers.
Type: integer
Dimension: scalar
multi_class
If 'ovr' is chosen, then a binary problem is fit for each label. If 'multinomial', then the loss minimized is the multinomial loss fit across the entire probability distribution, even when the data is binary. 'multinomial' is unavailable when solver = 'liblinear'. 'auto' (default) selects 'ovr' if the data is binary, or if solver = 'liblinear', and otherwise selects 'multinomial'.
Type: char
Dimension: string
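
The options above are passed as fields of a single struct. A minimal sketch, assuming that fields left unset keep their documented defaults (this page does not state that explicitly) and reusing the data file from the Example section:

```oml
% Load the same data file used in the Example section of this page
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

% Build the options struct by assigning fields.
% ASSUMPTION: options that are not set fall back to their defaults.
options.solver  = 'liblinear';  % the default 'lbfgs' supports only the l2 penalty
options.penalty = 'l1';
options.C       = 0.5;          % smaller than the default 1, so stronger regularization
options.tol     = 1e-5;         % tighter stopping tolerance than the default 1e-4

parameters = logisticfit(X, y, options);
```

The solver is changed together with the penalty because, per the descriptions above, 'newton-cg', 'sag' and 'lbfgs' accept only 'l2'.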

Outputs

parameters
Contains all the values passed to logisticfit as options, plus the following key-value pairs.
Type: struct
scorer
Function handle pointing to the 'accuracy' function.
Type: function handle
intercept
Intercept (bias) added to the decision function.
Type: double
Dimension: scalar
coef
Coefficients of the features in the decision function.
Type: double
Dimension: cell
classes
A list of class labels known to the classifier.
Type: double
Dimension: matrix
n_samples
Number of rows in the training data.
Type: integer
Dimension: scalar
n_features
Number of columns in the training data.
Type: integer
Dimension: scalar

Example

Usage of logisticfit

data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

parameters = logisticfit(X, y)
parameters = struct [
  C: 1
  classes: [Matrix] 1 x 2
  0  1
  coef: [Matrix] 1 x 2
  -1.10461  -0.27259
  dual: 0
  intercept: 0.812773588
  max_iter: 100
  multi_class: auto
  n_features: 2
  n_samples: 1372
  params: [Matrix] 1 x 3
  0.81277  -1.10461  -0.27259
  penalty: l2
  solver: lbfgs
  tol: 0.0001
]
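
The fitted struct is meant to be handed to logisticpredict. The sketch below shows one way that might look. The argument order (parameters, X) and the single returned vector of labels are assumptions, since this page does not document logisticpredict; check its own reference page before relying on them.

```oml
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

parameters = logisticfit(X, y);

% ASSUMPTION: logisticpredict takes (parameters, X) and returns predicted labels.
predicted = logisticpredict(parameters, X);

% Fraction of training samples whose predicted label matches the target
accuracy = sum(predicted(:) == y(:)) / numel(y);
```

Accuracy measured on the training data is optimistic; a held-out subset gives a fairer estimate.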

Comments

Pass the output 'parameters' as input to the logisticpredict function. For small datasets, 'liblinear' is a good choice, whereas 'sag' and 'saga' are faster for large ones. For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes. 'newton-cg', 'lbfgs' and 'sag' handle only the L2 penalty, whereas 'liblinear' and 'saga' also handle the L1 penalty. Note that the fast convergence of 'sag' and 'saga' is guaranteed only on features with approximately the same scale. You can rescale the data beforehand using preprocessing methods.
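
As an illustration of the scaling remark, the sketch below standardizes each feature column by hand before fitting with 'saga'. It is a stand-in for whatever preprocessing method you prefer, not the library's own routine, and it assumes no feature column is constant (so that every standard deviation is nonzero).

```oml
data = dlmread('banknote_authentication.txt', ',');
X = data(:, 1:2);
y = data(:, end);

n     = size(X, 1);
mu    = mean(X);   % row vector of column means
sigma = std(X);    % row vector of column standard deviations (assumed nonzero)

% Center each column and scale it to unit standard deviation
Xs = (X - repmat(mu, n, 1)) ./ repmat(sigma, n, 1);

options.solver = 'saga';
parameters = logisticfit(Xs, y, options);
```

Any data passed to the fitted model later needs the same mu and sigma applied to it, because the coefficients are learned on the rescaled features.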