Feature Selection Using Matlab

Date added: 2014-03-12
Category: Matlab   Tool: MATLAB 7.5 (R2007b)

This code package includes 5 feature selection algorithms:

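(1) Sequential Forward Selection (SFS)

Algorithm: The feature subset X starts as the empty set. At each step, one feature x is added to X such that the criterion function J(X) is optimal. In other words, each step adds the feature that yields the best value of the evaluation function, which makes SFS a simple greedy algorithm.

Evaluation: The drawback is that features can only be added, never removed. For example, suppose feature A depends entirely on features B and C, so that A is redundant once B and C have been added. If SFS first adds A and later adds B and C, the resulting subset contains the redundant feature A.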
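The greedy step and the redundancy problem described above can be sketched as follows. This is a minimal Python illustration, not the package's MATLAB code; the function `sfs`, the `COVERS` table, and the coverage-based criterion `J` are hypothetical names and toy data chosen only to reproduce the A/B/C example.

```python
def sfs(features, J, k):
    """Sequential forward selection: greedily grow a subset to k features."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Add the feature whose inclusion gives the best criterion value.
        best = max(remaining, key=lambda f: J(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: each feature "covers" some information units,
# and everything A covers is also covered by B and C together.
COVERS = {
    "A": {1, 2, 3, 5, 6},
    "B": {1, 2, 4, 7},
    "C": {3, 5, 6, 8},
    "D": {1},
}


def J(subset):
    """Toy criterion (an assumption): number of distinct units covered."""
    covered = set()
    for f in subset:
        covered |= COVERS[f]
    return len(covered)


chosen = sfs(["A", "B", "C", "D"], J, 3)
print(chosen, J(chosen))   # A is picked first: it is the best single feature
print(J(["B", "C"]))       # B and C alone reach the same criterion value
```

In this toy setting A is selected first and can never be dropped, even though B and C without A score just as well.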


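(2) Sequential Backward Selection (SBS)

Algorithm: Starting from the full feature set O, one feature x is removed from O at each step such that the evaluation function is optimal after the removal.

Evaluation: SBS is the mirror image of SFS; its drawback is that features can only be removed, never added back.

In addition, both SFS and SBS are greedy algorithms and can easily get trapped in a local optimum.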
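For comparison, here is a minimal Python sketch of the backward procedure, again only an illustration and not the package's MATLAB code. The function `sbs`, the `COVERS` table, and the criterion `J` are hypothetical; the target subset size `k` is an assumed stopping rule.

```python
def sbs(features, J, k):
    """Sequential backward selection: greedily shrink the full set to k features."""
    selected = list(features)
    while len(selected) > k:
        # Remove the feature whose removal leaves the best criterion value.
        worst = max(selected, key=lambda f: J([g for g in selected if g != f]))
        selected.remove(worst)
    return selected


# Hypothetical toy data: units of information covered by each feature.
COVERS = {
    "A": {1, 2, 3, 5, 6},
    "B": {1, 2, 4, 7},
    "C": {3, 5, 6, 8},
    "D": {1},
}


def J(subset):
    """Toy criterion (an assumption): number of distinct units covered."""
    covered = set()
    for f in subset:
        covered |= COVERS[f]
    return len(covered)


kept = sbs(["A", "B", "C", "D"], J, 2)
print(kept, J(kept))
```

Once a feature has been removed here, it is never reconsidered, which is the limitation noted above.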


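(3) Bidirectional Search (BDS)

Algorithm: SFS is run from the empty set while SBS is run simultaneously from the full set; the search stops when both reach the same feature subset C.

The motivation for bidirectional search is that searching from both ends covers less of the search space than searching from one end. In the original figure, point O is the starting point of the search and point A is the target. The gray circle represents the possible range of a unidirectional search, and the two green circles represent the range of one bidirectional search; it is easy to show that the green area is necessarily smaller than the gray area.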
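One way to sketch the two searches meeting in the middle is shown below, in Python and not taken from the package. It assumes the commonly used restriction that the forward search never adds a feature the backward search has discarded, and the backward search never removes a feature the forward search has selected. That restriction is not stated in the text above, but without it the two searches are not guaranteed to meet. The names `bds`, `COVERS`, and `J` are hypothetical.

```python
def bds(features, J):
    """Bidirectional search: alternate one SFS step and one SBS step until they meet."""
    forward = []                # grows from the empty set
    backward = list(features)   # shrinks from the full set
    while set(forward) != set(backward):
        # Forward step: add the best feature that SBS has not discarded.
        candidates = [f for f in backward if f not in forward]
        best = max(candidates, key=lambda f: J(forward + [f]))
        forward.append(best)
        if set(forward) == set(backward):
            break
        # Backward step: remove the least useful feature that SFS has not selected.
        removable = [f for f in backward if f not in forward]
        worst = max(removable, key=lambda f: J([g for g in backward if g != f]))
        backward.remove(worst)
    return forward


# Hypothetical toy data: units of information covered by each feature.
COVERS = {
    "A": {1, 2, 3, 5, 6},
    "B": {1, 2, 4, 7},
    "C": {3, 5, 6, 8},
    "D": {1},
}


def J(subset):
    """Toy criterion (an assumption): number of distinct units covered."""
    covered = set()
    for f in subset:
        covered |= COVERS[f]
    return len(covered)


meeting_point = bds(["A", "B", "C", "D"], J)
print(meeting_point, J(meeting_point))
```

Each loop iteration narrows the gap between the two subsets by at least one feature, so the loop always terminates.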

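(4) Sequential Floating Selection

Algorithm: Sequential floating selection developed from the Plus-L Minus-R Selection (LRS) algorithm. The difference is that L and R are not fixed in sequential floating selection; they "float", that is, they change during the search.

Depending on the search direction, there are two variants.

<1> Sequential Floating Forward Selection (SFFS)

Algorithm: Starting from the empty set, each round selects a subset x from the unselected features such that the evaluation function is optimal after x is added, and then selects a subset z from the already selected features such that the evaluation function is optimal after z is removed.

<2> Sequential Floating Backward Selection (SFBS)

Algorithm: Similar to SFFS, except that SFBS starts from the full set and, in each round, first removes features and then adds features.

Evaluation: Sequential floating selection combines the characteristics of SFS, SBS, and Plus-L Minus-R Selection, and compensates for their shortcomings.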
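The floating idea can be sketched in Python as below. This is an illustration under assumptions, not the package's implementation: it uses the common one-feature-at-a-time formulation (add one feature, then keep removing single features while that beats the best subset of the smaller size seen so far), and it assumes `k` does not exceed the number of features. The names `sffs`, `COVERS`, and `J` are hypothetical.

```python
def sffs(features, J, k):
    """Sequential floating forward selection (one-feature-at-a-time sketch)."""
    selected = []
    best = {}  # subset size -> (criterion value, subset) found so far
    while len(selected) < k:
        # Inclusion: add the single best unselected feature.
        candidates = [f for f in features if f not in selected]
        add = max(candidates, key=lambda f: J(selected + [f]))
        selected = selected + [add]
        n = len(selected)
        if n not in best or J(selected) > best[n][0]:
            best[n] = (J(selected), selected)
        # Conditional exclusion: drop features while the reduced subset
        # strictly beats the best subset of that size seen so far.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: J([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if J(reduced) > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (J(reduced), reduced)
            else:
                break
    return best[k][1]


# Hypothetical toy data: units of information covered by each feature.
COVERS = {
    "A": {1, 2, 3, 5, 6},
    "B": {1, 2, 4, 7},
    "C": {3, 5, 6, 8},
    "D": {9},
}


def J(subset):
    """Toy criterion (an assumption): number of distinct units covered."""
    covered = set()
    for f in subset:
        covered |= COVERS[f]
    return len(covered)


result = sffs(["A", "B", "C", "D"], J, 3)
print(result, J(result))
```

With this toy criterion, the search first commits to A, later discards it in an exclusion step, and ends with a three-feature subset that plain forward selection would not reach.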

 

Feature Selection using Matlab

The DEMO includes 5 feature selection algorithms:

• Sequential Forward Selection (SFS)

• Sequential Floating Forward Selection (SFFS)

• Sequential Backward Selection (SBS)

• Sequential Floating Backward Selection (SFBS)

• ReliefF

 

Two CCR (correct classification rate) estimation methods:

• Cross-validation

• Resubstitution
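The difference between the two estimates can be illustrated with a small Python sketch. It is not the package's code: a nearest-class-mean classifier stands in for the package's classifier, the fold assignment is a simple interleaving, and the data set is a hypothetical one-feature example. All function names are assumptions.

```python
def fit_means(X, y):
    """Compute the mean feature vector of each class."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def predict(means, x):
    """Assign x to the class with the nearest mean (squared Euclidean distance)."""
    return min(means, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, means[c])))


def resubstitution_ccr(X, y):
    """Train and test on the same patterns (optimistic estimate)."""
    means = fit_means(X, y)
    return sum(predict(means, x) == t for x, t in zip(X, y)) / len(y)


def cross_validation_ccr(X, y, folds):
    """Hold out every folds-th pattern in turn, train on the rest."""
    correct = 0
    for i in range(folds):
        test_idx = set(range(i, len(y), folds))
        X_train = [x for j, x in enumerate(X) if j not in test_idx]
        y_train = [t for j, t in enumerate(y) if j not in test_idx]
        means = fit_means(X_train, y_train)
        correct += sum(predict(means, X[j]) == y[j] for j in test_idx)
    return correct / len(y)


# Hypothetical data: one feature, two classes.
X = [[1.0], [2.0], [4.0], [5.0], [8.0], [9.0]]
y = [0, 0, 0, 1, 1, 1]

resub = resubstitution_ccr(X, y)
cv = cross_validation_ccr(X, y, folds=6)  # 6 folds on 6 patterns: leave-one-out
print(resub, cv)
```

On this data the resubstitution estimate is higher than the cross-validation estimate, because the pattern at 5.0 is classified correctly only when it also contributes to its own class mean.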

 

After selecting the best feature subset, the classifier obtained can be used for classifying any pattern.

 

Figure: The upper panel is the patterns x features matrix.
        The lower left panel shows the selected features.
        The lower right panel is the CCR curve during the feature selection steps.
        The right panel shows the classification results for some patterns.

 

This software was developed using Matlab 7.5 and Windows XP.

 

Copyright: D. Ververidis and C. Kotropoulos

                 AIIA Lab, Thessaloniki, Greece,

                 jimver@aiia.csd.auth.gr

                 costas@aiia.csd.auth.gr

 

In order to run the DEMO:

 

- A PC with Windows XP is needed.

- Use Matlab 7.5 or later to run DEMO.m.

 

1) Select the 'finalvec.mat' dataset (a patterns x [features+1] matrix) from the 'PatTargMatrices' folder. The last column of 'finalvec.mat' contains the targets.

2) Press the run button on the panel (the second button).

3) After the optimum feature set has been selected, choose a set of patterns to classify using the open-folder button (the last button). This can be the same dataset that was used to train the feature selection algorithm.
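The data layout described in step 1 (one row per pattern, targets in the last column) can be illustrated with a short Python sketch. The function name and the three-row matrix are hypothetical; they do not reflect the actual contents of 'finalvec.mat'.

```python
def split_patterns_and_targets(matrix):
    """Split a patterns x (features + 1) matrix into feature rows and targets."""
    features = [row[:-1] for row in matrix]
    targets = [row[-1] for row in matrix]
    return features, targets


# Hypothetical example: 3 patterns, 2 features, class label in the last column.
data = [
    [0.5, 1.2, 1],
    [0.7, 0.9, 1],
    [2.1, 3.3, 2],
]

X, y = split_patterns_and_targets(data)
print(X)
print(y)
```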

 

References:

[1] D. Ververidis and C. Kotropoulos, "Fast and accurate feature subset selection applied into speech emotion recognition," Signal Processing (Elsevier), vol. 88, no. 12, pp. 2956-2970, 2008.

[2] D. Ververidis and C. Kotropoulos, "Information loss of the Mahalanobis distance in high dimensions: Application to feature selection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 12, pp. 2275-2281, 2009.

Download file list
Version_5.zip (3.14 MB)
Archive contents (showing only 10 of 31 files)
BackSel_main.m  BayesClassMVGaussPDFs.m  BayesClassValidationSet.m  CalcInfoLoss.m  CCRForOptSet.m  DataLoadAndPreprocess.m  DEMO.m  ForwSel_main.m  LoadFromTXTfiles.m  ReliefF.m
Tags: feature selection