leaveoneout

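Provides train and validation indices for leave-one-out cross-validation. It is equivalent to K Fold with K set to the total number of records in the dataset: each fold holds out exactly one record for validation and uses the remaining records for training. Because one model is fitted per record, it is computationally expensive for large datasets.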
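
The relationship to K Fold can be illustrated with a minimal sketch in plain OML/Octave-style code. It is not the implementation of leaveoneout; the variable names (n, val_idx, train_idx, train_sizes) are hypothetical, and shuffling is omitted.

```octave
% Illustrative sketch only; not the leaveoneout implementation.
n = 5;                        % hypothetical number of records
val_idx = zeros(1, n);        % record held out in each fold
train_sizes = zeros(1, n);    % number of training records in each fold

for k = 1:n
	val_idx(k) = k;                      % fold k validates on record k only
	train_idx = [1:k-1, k+1:n];          % all other records are used for training
	train_sizes(k) = length(train_idx);  % always n - 1
end
```

With n records the loop produces n folds, each training on n - 1 records, which is why the cost grows with the size of the dataset.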

Syntax

output = leaveoneout(X,options)

Inputs

X
Input dataset to be split.
Type: double
Dimension: matrix | vector
options
Type: struct
shuffle
Flag indicating whether to shuffle the dataset before splitting (default: false).
Type: Boolean
Dimension: logical
seed
Random seed used for shuffling.
Type: integer
Dimension: scalar

Outputs

output
Type: struct
folds
Cell used by the 'getfold' function to retrieve the train and validation indices of each fold during iteration.
Dimension: cell
num_folds
Number of folds used while splitting.
Type: integer
Dimension: scalar

Example

Usage of leaveoneout

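% Load the dataset, starting at row offset 1 to skip the header row.
data = dlmread('boston_house_prices.csv', ',', 1);
X = data(:, 1:end-1);   % features: all columns except the last
y = data(:, end);       % target: last column

% Shuffle the records reproducibly before splitting.
options.shuffle = true;
options.seed = 34;

folds = leaveoneout(X, options);

num_folds = folds.num_folds;
val_errors = [];

for fold_count = 1:num_folds
	% Row indices of the training and validation records for this fold.
	[train_set_idxs, val_set_idxs] = getfold(folds, fold_count);

	train_X = X(train_set_idxs, :); train_y = y(train_set_idxs, :);
	val_X = X(val_set_idxs, :); val_y = y(val_set_idxs, :);

	% Fit ordinary least squares on the training records, then predict the validation records.
	parameters = olsfit(train_X, train_y);
	predictions = olspredict(parameters, val_X);

	% Score the predictions and collect the error for this fold.
	val_error = parameters.scorer(val_y, predictions);
	val_errors = [val_errors val_error];
end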

avg_val_error = mean(val_errors);
printf('Average Error: %f\n', avg_val_error);

Output:
Average Error: 0.676167