playoffs 2020-03-20
Source: https://www.cnblogs.com/bjwu/p/9103002.html
Code:
from sklearn.feature_selection import VarianceThreshold

# Six samples, three binary features
X = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]

# Keep only the features whose variance exceeds 0.2
sel = VarianceThreshold(threshold=0.2)
sel.fit_transform(X)
The return value has the features with variance below 0.2 filtered out. The variance information is:
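One way to inspect that variance information is through the fitted selector itself. The sketch below reuses the same X and threshold as the code above; the variable names `X_reduced`, `variances`, and `kept` are illustrative and not from the source.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]

sel = VarianceThreshold(threshold=0.2)
X_reduced = sel.fit_transform(X)

# variances_ holds the population variance (ddof=0) of each input column.
variances = sel.variances_

# get_support() returns a boolean mask marking the columns that were kept.
kept = sel.get_support()

print(variances)
print(kept)
print(X_reduced.shape)
```

For this X the column variances work out to 5/36, 2/9, and 1/4, so only the first column falls below 0.2 and is dropped.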
SelectKBest
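Removes all features except the K highest-scoring ones.
Code:
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

iris = load_iris()
X, y = iris.data, iris.target
X.shape  # shape before selection

# Score every feature with the chi-squared test and keep the best two
X_new = SelectKBest(chi2, k=2).fit_transform(X, y)
X_new.shape  # shape after selection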
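To see which features survived and why, the selector can be fitted first and then queried. This is a minimal sketch on the same iris data; `selector`, `kept_idx`, and `kept_names` are illustrative names, not from the source.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

iris = load_iris()
X, y = iris.data, iris.target

# Fit without transforming so the per-feature scores can be inspected.
selector = SelectKBest(chi2, k=2).fit(X, y)

# scores_ holds one chi-squared statistic per input feature.
for name, score in zip(iris.feature_names, selector.scores_):
    print(f"{name}: {score:.2f}")

# get_support(indices=True) returns the column indices that were kept.
kept_idx = selector.get_support(indices=True)
kept_names = [iris.feature_names[i] for i in kept_idx]
print(kept_names)

X_new = selector.transform(X)
print(X_new.shape)
```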
wrapper - Recursive Feature Elimination (RFE)
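The source gives no code for this heading. As a rough sketch, RFE in scikit-learn repeatedly fits an estimator and discards the weakest feature until the requested number remains. The choice of LogisticRegression, the iris data, and `n_features_to_select=2` are assumptions made only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Assumed base estimator; any estimator exposing coef_ or
# feature_importances_ could be used instead.
estimator = LogisticRegression(max_iter=1000)

# step=1 removes one feature per iteration until two remain.
rfe = RFE(estimator=estimator, n_features_to_select=2, step=1)
X_rfe = rfe.fit_transform(X, y)

print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # 1 for selected features, larger for earlier eliminations
print(X_rfe.shape)
```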
embedded - feature selection
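The source does not say which embedded method it has in mind. One common option in scikit-learn is SelectFromModel, which keeps the features whose importance in a fitted model reaches a threshold. The sketch below assumes a random forest on the iris data and a `"mean"` threshold; all of these choices are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_iris(return_X_y=True)

# Assumed model; random_state is fixed so repeated runs match.
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# Keep features whose importance is at least the mean importance.
selector = SelectFromModel(forest, threshold="mean")
X_embedded = selector.fit_transform(X, y)

print(selector.estimator_.feature_importances_)
print(selector.get_support())
print(X_embedded.shape)
```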