splitdata

Splits a dataset into training and test sets according to a given ratio.

Syntax

[XTrain, XTest] = splitdata(X)

[XTrain, XTest] = splitdata(X,options)

[XTrain, XTest, yTrain, yTest] = splitdata(...)

Inputs

X
Input features of the dataset.
Type: double
Dimension: vector | matrix
options
Struct of optional settings. Its fields are y, test_ratio, shuffle, and seed, described below.
Type: struct
y
Output (target) values of the dataset. yTrain and yTest are returned only if y is set.
Type: double
Dimension: vector | matrix
test_ratio
Fraction of the data to allocate to the test set. The value should be between 0 and 1 (default: 0.1).
Type: double
Dimension: scalar
shuffle
Whether to shuffle the data before splitting.
Type: Boolean
Dimension: logical
seed
Seed used for shuffling. Setting it makes the train/test split reproducible.
Type: double | integer
Dimension: scalar

Outputs

XTrain
Input features of the training set.
Type: double
Dimension: matrix
XTest
Input features of the test set.
Type: double
Dimension: matrix
yTrain
Output features of the training set.
Type: double
Dimension: matrix
yTest
Output features of the test set.
Type: double
Dimension: matrix

Example

Usage of splitdata

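% Random dataset: 2500 samples, 32 input features, one output column
X = rand(2500, 32);
y = rand(2500, 1);
% Pass the outputs and a test fraction of 0.3 through the options struct
options = struct;
options.y = y;
options.test_ratio = 0.3;
[XTrain, XTest, yTrain, yTest] = splitdata(X, options);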
> size(X)
ans = [Matrix] 1 x 2
2500 32
> size(XTrain)
ans = [Matrix] 1 x 2
1750 32
> size(XTest)
ans = [Matrix] 1 x 2
750 32
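
Reproducible split with shuffle and seed

The shuffle and seed options described above could be combined as in the following sketch. The dataset sizes, the test_ratio of 0.2, and the seed value 42 are arbitrary choices made for illustration. The final comparison assumes that a fixed seed yields the same shuffle on repeated calls, as the description of seed suggests.

```matlab
% Illustrative sketch: sizes, ratio, and seed value are arbitrary assumptions
X = rand(100, 4);
y = rand(100, 1);

options = struct;
options.y = y;
options.test_ratio = 0.2;
options.shuffle = true;   % shuffle the rows before splitting
options.seed = 42;        % fix the seed so the shuffle can be repeated

% Split the same data twice with identical options
[XTrain1, XTest1, yTrain1, yTest1] = splitdata(X, options);
[XTrain2, XTest2, yTrain2, yTest2] = splitdata(X, options);

% Compare the two test sets; they should match if seeding works as described
same = isequal(XTest1, XTest2) && isequal(yTest1, yTest2);
```

If same is false, the seed is not being applied as assumed here. In that case, check that shuffle is enabled and that seed is set as a field of options.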