stratifiedkfold

Provides train and validation indices for cross-validation while preserving the percentage of samples of each class in every fold, unlike K-Fold.

Syntax

output = stratifiedkfold(X,y,options)

Inputs

X
Input dataset to be split.
Type: double
Dimension: matrix
y
Class labels for the samples in X, used to stratify the folds.
Type: double
Dimension: vector
options
Type: struct
num_folds
Number of folds for splitting dataset (default: 5).
Type: integer
Dimension: scalar
shuffle
Flag to shuffle the dataset or not (default: false).
Type: Boolean
Dimension: logical
seed
Random seed used for shuffling.
Type: integer
Dimension: scalar

Outputs

output
Type: struct
folds
Cell array used by the 'getfold' function to retrieve the train and validation indices of each fold during iteration.
Dimension: cell
num_folds
Number of folds used while splitting.
Type: integer
Dimension: scalar

Example

Usage of stratifiedkfold

% Four samples (rows of X) with two samples per class in y
X = [1 2; 3 4; 1 2; 3 4];
y = [0 0 1 1];
options.num_folds = 2;
options.seed = 234;
options.shuffle = true;
folds = stratifiedkfold(X, y, options);
% Retrieve and print the train/validation indices of each fold
for fold_count = 1:folds.num_folds
  [train_idxs, valid_idxs] = getfold(folds, fold_count);
  printf('\nTRAIN: '); printf('%d ', train_idxs);
  printf('\nVALID: '); printf('%d ', valid_idxs);
  printf('\n=============');
end
TRAIN: 2 4 
VALID: 1 3 
=============
TRAIN: 1 3 
VALID: 2 4 
=============
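
Checking the class balance of each fold

The following is a minimal sketch, assuming 'getfold' returns indices into y in the same way as in the example above. It counts the labels in each validation fold, which is one way to check that the class proportions are preserved. The data is made up for illustration (six samples of class 0, three of class 1), and no output is shown because the exact assignment of samples to folds depends on the shuffle.

```
% Illustrative, imbalanced data: 6 samples of class 0 and 3 of class 1
X = [1:9]';
y = [0 0 0 0 0 0 1 1 1];
options.num_folds = 3;
options.seed = 234;
options.shuffle = true;
folds = stratifiedkfold(X, y, options);
for fold_count = 1:folds.num_folds
  [train_idxs, valid_idxs] = getfold(folds, fold_count);
  % Count the validation samples that fall in each class
  n0 = sum(y(valid_idxs) == 0);
  n1 = sum(y(valid_idxs) == 1);
  printf('Fold %d: class 0 = %d, class 1 = %d\n', fold_count, n0, n1);
end
```

If the split is stratified as described, each validation fold should show roughly the same 2:1 ratio between class 0 and class 1 as the full label vector.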