Cross-Validation

Cross-validation estimates how well a model will perform on new data by testing it several times on different slices of the dataset. Instead of a single train–test split, the data is divided into k parts, called folds. The model trains on all but one fold and is evaluated on the one left out. This repeats until each fold has taken a turn as the test set, which reduces the chance that one lucky or unlucky split gives a misleading result.
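The splitting procedure described above can be sketched in plain Python. This is a minimal illustration, not any particular library's implementation; the function name `k_fold_splits` is our own:

```python
# Sketch of k-fold splitting: each of the k folds serves as the
# held-out test set exactly once.
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Early folds absorb the remainder so every sample is used.
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop

for train, test in k_fold_splits(10, 5):
    print(test)  # each sample appears in exactly one test fold
```

In practice the data is usually shuffled before splitting; libraries such as scikit-learn provide this (e.g. a `KFold` helper with a `shuffle` option).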

After all folds have run, the per-fold scores are averaged into a single, more reliable performance estimate. Cross-validation is especially valuable when data is limited, since every sample is used for both training and testing rather than being locked into one split. It's commonly used when comparing models or tuning hyperparameters, because it can expose overfitting that a single split might miss.
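Putting the pieces together, the full loop of train, evaluate, and average can be shown with a deliberately simple "model". This is a toy sketch under assumed data: the model just predicts the mean target of its training fold, and the score is mean squared error:

```python
# Toy (feature, target) pairs -- assumed data for illustration only.
data = [(x, 2 * x + 1) for x in range(20)]

def evaluate_fold(train, test):
    # "Train": the toy model predicts the mean target of the training fold.
    mean_y = sum(y for _, y in train) / len(train)
    # "Test": mean squared error on the held-out fold.
    return sum((y - mean_y) ** 2 for _, y in test) / len(test)

k = 5
fold_size = len(data) // k
scores = []
for i in range(k):
    test = data[i * fold_size:(i + 1) * fold_size]
    train = data[:i * fold_size] + data[(i + 1) * fold_size:]
    scores.append(evaluate_fold(train, test))

# The averaged score is the cross-validated performance estimate.
cv_estimate = sum(scores) / len(scores)
print(f"cross-validated MSE: {cv_estimate:.1f}")
```

Swapping in a real model only changes `evaluate_fold`; the surrounding split-and-average loop stays the same, which is why the same procedure works for comparing models or tuning hyperparameters.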
