Bootstrap method for robust inference
The paper "Regression Analysis with Many Specifications" uses the stationary bootstrap method to evaluate a large set of models.
In a typical data mining setup, the problem of choosing the number of covariates can be handled in many ways, such as:

Best subset selection

Forward and/or backward stepwise regression

Forward stagewise regression

Lasso

Ridge regression

Combination of Lasso and Ridge regression (elastic net)
In each of the above cases, the assumption is that the covariates are independent. In a time series setting, where the regressors are lagged time series, the independence assumption is hard to defend, so these methods might have limited applicability. Having said that, I guess there are researchers in the ML/DM areas who are trying to refine these methods so that the broad range of techniques available in the data mining field can be applied in econometrics. OK, now coming back to the paper:
Let’s say you have a dependent variable and a set of 101 covariates you want to regress it on. You somehow feel that at most three regressors should be enough. At the same time, you know that one specific variable out of the 101, call it X1, must always appear in the model. So the selection boils down to the remaining 100 variables. How do you go about selecting the number of regressors to include? Should it be two or three? And given the number of regressors, which ones should they be?
This paper by P. R. Hansen is a nice one that gives the reader a basic understanding of model selection in a univariate time series setting. The methodology followed in the paper is as follows:

Split the variables into free variables and doubtful variables. Free variables are those that are always included as regressors; doubtful variables are included only in some of the candidate models considered.

Decide on the number of models, m, you want to evaluate.

Decide on the intermodel statistic used to compare the models for a specific training sample. The author in this paper uses the maximum R squared statistic.

Use a stationary bootstrap method to generate training samples: this basically involves selecting random blocks of the time series, where the block lengths are geometrically distributed.

For each bootstrap sample, fit the m models and compute R squared for each model. Then compute the intermodel statistic for that bootstrap sample.

Once the intermodel statistic is computed for all the bootstrap samples, you obtain its empirical distribution function. This can be used to check whether the intermodel statistic is statistically significant.
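To make the resampling step and the final check a bit more concrete, here is a minimal sketch in plain Python. It is not the paper's code: the function names, the toy series, the seed and the mean block length of 5 are all my own assumptions for illustration. The index generator follows the usual description of the stationary bootstrap (random block starts, geometric block lengths, series treated as circular). The p-value helper simply reads a tail probability off the empirical distribution, which is only meaningful if the bootstrap statistics were generated in a way that reflects the null hypothesis; how the paper arranges that is not covered here.

```python
import random


def stationary_bootstrap_indices(n, mean_block_len, rng):
    """Return n resampled time indices, stationary-bootstrap style.

    Each block starts at a uniformly random position and its length is
    geometrically distributed with mean `mean_block_len`. The series is
    treated as circular, so a block running past the end wraps around.
    """
    p = 1.0 / mean_block_len  # probability of starting a new block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p:
            idx.append(rng.randrange(n))       # start a new block
        else:
            idx.append((idx[-1] + 1) % n)      # continue the current block
    return idx


def empirical_p_value(observed, boot_stats):
    """Share of bootstrap statistics at least as large as the observed one."""
    return sum(s >= observed for s in boot_stats) / len(boot_stats)


rng = random.Random(0)                  # seed chosen only for reproducibility
series = [0.1 * t for t in range(20)]   # hypothetical toy series
idx = stationary_bootstrap_indices(len(series), mean_block_len=5, rng=rng)
resample = [series[i] for i in idx]     # one bootstrap training sample
```

In the full procedure you would fit the m models on each such resample, keep the maximum R squared from each one, and pass the resulting list to something like `empirical_p_value`.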
Coming back to our problem of model selection, there are $\binom{100}{1} = 100$ possible two-variable models and $\binom{100}{2} = 4950$ possible three-variable models, since X1 is always included. To begin with, we can run the above six-step procedure for the 100 two-variable models and check whether the maximal R squared statistic is significant. Similarly, one can run the same six-step procedure for the 4950 three-variable models.
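As a quick sanity check on how many candidate models this involves, the counts can be verified and the specifications enumerated with the Python standard library. The variable names X2 to X101 for the doubtful variables are hypothetical labels of mine, not something taken from the paper.

```python
import math
from itertools import combinations

n_doubtful = 100  # X1 is always included, so only the other 100 are selected from

n_two_var = math.comb(n_doubtful, 1)    # X1 plus one doubtful variable
n_three_var = math.comb(n_doubtful, 2)  # X1 plus two doubtful variables

# Hypothetical labels for the doubtful variables: X2 ... X101
doubtful = [f"X{j}" for j in range(2, 102)]

# Every candidate specification, each one containing the free variable X1
two_var_models = [("X1", d) for d in doubtful]
three_var_models = [("X1",) + pair for pair in combinations(doubtful, 2)]
```

Each tuple is one specification to fit on every bootstrap sample, so the three-variable case is roughly fifty times more work than the two-variable one.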
A variant of this procedure can be used to select the subset of best-performing volatility models from a large set of volatility models.