Exhaustive feature selection in scikit-learn?
Is there a built-in way of doing brute-force feature selection in scikit-learn, i.e. exhaustively evaluating all possible combinations of the input features and then finding the best subset? I am familiar with the "Recursive feature elimination" class, but I am specifically interested in evaluating all possible combinations of the input features one after the other.
No, best subset selection is not implemented. The easiest way is to write it yourself. This should get you started:
    from itertools import chain, combinations
    import numpy as np
    from sklearn.cross_validation import cross_val_score
    # in newer scikit-learn versions: from sklearn.model_selection import cross_val_score

    def best_subset_cv(estimator, X, y, cv=3):
        n_features = X.shape[1]
        # Enumerate every non-empty subset of feature indices.
        subsets = chain.from_iterable(combinations(xrange(n_features), k + 1)
                                      for k in xrange(n_features))

        best_score = -np.inf
        best_subset = None
        for subset in subsets:
            # Cross-validate the estimator on the selected columns only.
            score = cross_val_score(estimator, X[:, subset], y, cv=cv).mean()
            if score > best_score:
                best_score, best_subset = score, subset

        return best_subset, best_score
Note that this performs k-fold cross-validation inside the loop, so it will fit about k · 2^p estimators when given data with p features. For example, with 10 features and the default 3-fold CV, that is 3 × (2^10 − 1) ≈ 3,000 model fits, so this only scales to a modest number of features.
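For illustration, a minimal usage sketch might look like the following. It assumes the iris dataset from sklearn.datasets and a LogisticRegression estimator, neither of which appears in the answer above; any estimator with fit/predict and any small feature matrix would do.

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Hypothetical example: iris has only 4 features, so 2**4 - 1 = 15 subsets are scored.
    iris = load_iris()
    best_subset, best_score = best_subset_cv(LogisticRegression(), iris.data, iris.target, cv=3)
    print(best_subset, best_score)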