Trying out SVM-light-TK
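SVM-light-TK is an SVM library developed by Alessandro Moschitti that supports tree kernels. The current version is 1.2.1.
It can be downloaded from http://disi.unitn.it/moschitti/TK1.2-software/download.html after filling in a short questionnaire. The official site also provides sample data, so I will use that here.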
    % cd svm-light-TK-1.2.1
    % ./svm_learn -t 5 arg0.train trained-arg0
    Scanning examples...done
    Reading examples into memory...100..OK. (112 examples read)
    Number of examples: 112, linear space size: 21478
    estimating ...
    Setting default regularization parameter C=1.0000
    Optimizing........................................done. (41 iterations)
    Optimization finished (3 misclassified, maxdiff=0.00100).
    Runtime in cpu-seconds: 0.01
    Number of SV: 91 (including 26 at upper bound)
    L1 loss: loss=11.55698
    Norm of weight vector: |w|=6.88679
    Norm of longest example vector: |x|=1.00000
    Estimated VCdim of classifier: VCdim<=48.42783
    Computing XiAlpha-estimates...done
    Runtime for XiAlpha-estimates in cpu-seconds: 0.00
    XiAlpha-estimate of the error: error<=23.21% (rho=1.00,depth=0)
    XiAlpha-estimate of the recall: recall=>75.00% (rho=1.00,depth=0)
    XiAlpha-estimate of the precision: precision=>77.78% (rho=1.00,depth=0)
    Number of kernel evaluations: 9711
    Writing model file...done
    % ./svm_classify tk1.2-arg/arg0.test model
    Reading model...OK. (92 support vectors read)
    Classifying test examples..100..done
    Runtime (without IO) in cpu-seconds: 0.01
    Accuracy on test set: 83.04% (93 correct, 19 incorrect, 112 total)
    Precision/recall on test set: 84.91%/80.36%
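To compare runs later, it helps to pull the summary numbers out of the svm_classify output. The following is a minimal sketch that assumes the two summary lines look exactly like the ones in the run above; the function name extract_metrics is made up for this example and is not part of SVM-light-TK.

```shell
#!/bin/sh
# extract_metrics: hypothetical helper, not part of SVM-light-TK.
# Reads svm_classify output on stdin and prints
# "accuracy precision recall" as plain numbers (percent signs removed).
extract_metrics() {
    awk '
        /^Accuracy on test set:/ {
            acc = $5                 # e.g. "83.04%"
            sub(/%/, "", acc)
        }
        /^Precision\/recall on test set:/ {
            split($5, pr, "/")       # e.g. "84.91%/80.36%"
            sub(/%/, "", pr[1])
            sub(/%/, "", pr[2])
        }
        END { print acc, pr[1], pr[2] }
    '
}

# Sample input: the two summary lines copied from the run above.
extract_metrics <<'EOF'
Accuracy on test set: 83.04% (93 correct, 19 incorrect, 112 total)
Precision/recall on test set: 84.91%/80.36%
EOF
```

For the sample input this prints 83.04 84.91 80.36. In practice the output of ./svm_classify would be piped into the function instead of the here-document.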
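The -t option of svm_learn selects the kernel. The value 5 seems to be a combination of tree and vector kernels (?), with the details controlled through other options. The help text describes it as: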
5: combination of forest and vector sets according to W, V, S, C options
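Since -t is the only thing that changes between kernels, different values can be tried on the same train/test split with a small loop. This is only a sketch: the file paths, the model-t* and pred-t* names, and the DRY_RUN switch are assumptions for this example. The values 0 and 1 are SVM-light's linear and polynomial kernels, and 5 is the combination kernel quoted above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: train and evaluate several -t settings on the same split.
# TRAIN, TEST and the model-t*/pred-t* file names are assumptions.
TRAIN=${TRAIN:-tk1.2-arg/arg0.train}
TEST=${TEST:-tk1.2-arg/arg0.test}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print commands, 0 = also execute them

# run: print the command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || "$@"
}

# sweep_kernels: hypothetical helper; one train/classify pair per -t value.
sweep_kernels() {
    for t in "$@"; do
        run ./svm_learn -t "$t" "$TRAIN" "model-t$t" || return 1
        run ./svm_classify "$TEST" "model-t$t" "pred-t$t" || return 1
    done
}

# 0 = linear, 1 = polynomial, 5 = tree/vector combination
sweep_kernels 0 1 5
```

Running it with DRY_RUN=0 inside the svm-light-TK directory would execute the commands for real. Keeping the training and test files fixed across all -t values makes the resulting numbers directly comparable.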
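For comparison, here is a run with -t 1. (In SVM-light's option list, -t 0 is the linear kernel and -t 1 is the polynomial kernel, so this run uses a polynomial kernel rather than a linear one.)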
    % ./svm_learn -t 1 tk1.2-arg/arg0.test trained-arg0
    ...
    % ./svm_classify tk1.2-arg/arg0.train trained-arg0
    Reading model...OK. (104 support vectors read)
    Classifying test examples..100..done
    Runtime (without IO) in cpu-seconds: 0.00
    Accuracy on test set: 76.99% (87 correct, 26 incorrect, 113 total)
    Precision/recall on test set: 70.83%/91.07%
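Precision differs by about 14 points (84.91% vs. 70.83%) and accuracy by about 6 points (83.04% vs. 76.99%), while recall is actually higher in the second run (91.07% vs. 80.36%). Note that the second run, as recorded, trains on arg0.test and classifies arg0.train, so the two results are not measured on the same split.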
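The differences between the two runs can be checked with a few lines of shell. This is a small sketch using the numbers reported by svm_classify above; point_diff is a hypothetical helper name.

```shell
#!/bin/sh
# point_diff: hypothetical helper; prints a - b with two decimals.
point_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a - b }'
}

# First argument: the -t 5 run, second argument: the -t 1 run.
echo "accuracy:  $(point_diff 83.04 76.99) points"
echo "precision: $(point_diff 84.91 70.83) points"
echo "recall:    $(point_diff 80.36 91.07) points"
```

This prints 6.05 for accuracy, 14.08 for precision and -10.71 for recall, all in percentage points.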