Conversation

@hsinyuan-huang
Contributor

With the -n option added when running train, every instance is normalized to unit length in the Euclidean norm, and the .model file contains a line "normalization 1" (without -n it contains "normalization 0"). When predict is run with that .model file, it applies the same normalization to the test data. (As a result, the old .model file format is deprecated.)

Example:
train -s 7 -n -e 1e-6 covtype.libsvm.binary
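For reference, a minimal sketch of the per-instance scaling described above (not the actual patch code), assuming LIBLINEAR's sparse feature_node layout where index == -1 terminates an instance:

```c
#include <math.h>

/* Mirrors the sparse feature layout declared in linear.h. */
struct feature_node { int index; double value; };

/* Scale one instance so its Euclidean (L2) norm becomes 1.
 * An all-zero instance is left unchanged to avoid dividing by zero. */
static void normalize_instance(struct feature_node *x)
{
	double norm2 = 0;
	for (const struct feature_node *p = x; p->index != -1; ++p)
		norm2 += p->value * p->value;
	if (norm2 <= 0)
		return;
	double inv = 1.0 / sqrt(norm2);
	for (struct feature_node *p = x; p->index != -1; ++p)
		p->value *= inv;
}
```

With this, train would apply the scaling to each training vector before solving, and predict would apply the same scaling to each test vector whenever the model says "normalization 1", matching the behavior described above.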

Timing comparison (train -s S -e 1e-6 covtype.libsvm.binary):

| -s | without -n | with -n |
|----|------------|---------|
| 0  | 0m39.492s  | 0m04.001s |
| 1  | 2m47.380s  | 0m10.116s |
| 2  | 0m38.217s  | 0m05.072s |
| 7  | 3m42.433s  | 0m07.034s |

train -s 1 -e 1e-6 splice.txt
predict splice.t splice.txt.model out.txt: 84.2299%, with -n: 84.9655%

train -s 1 -e 1e-6 a9a.txt
predict a9a.t a9a.txt.model out.txt: 84.9395%, with -n: 85.0132%

train -s ALPHA -e 1e-6 w1a.txt
predict w1a.t w1a.txt.model out.txt: 96.9221%, with -n: 97.6625%
ALPHA from 0 to 7:

| ALPHA | without -n | with -n |
|-------|------------|---------|
| 0 | 97.2902% (45991/47272) | 97.5144% (46097/47272) |
| 1 | 96.8903% (45802/47272) | 97.6646% (46168/47272) |
| 2 | 96.9221% (45817/47272) | 97.6625% (46167/47272) |
| 3 | 97.1019% (45902/47272) | 97.6455% (46159/47272) |
| 4 | 96.8523% (45784/47272) | 97.635% (46154/47272) |
| 5 | 97.3959% (46041/47272) | 97.745% (46206/47272) |
| 6 | 97.5736% (46125/47272) | 97.3473% (46018/47272) |
| 7 | 97.2902% (45991/47272) | 97.5144% (46097/47272) |

train -s ALPHA -e 1e-6 svmguide1.txt
predict svmguide1.t svmguide1.txt.model out.txt
ALPHA from 0 to 7:

| ALPHA | without -n | with -n |
|-------|------------|---------|
| 0 | 79.025% (3161/4000) | 78.4% (3136/4000) |
| 1 | 78.95% (3158/4000)  | 78.425% (3137/4000) |
| 2 | 78.925% (3157/4000) | 78.425% (3137/4000) |
| 3 | 59.125% (2365/4000) | 78.3% (3132/4000) |
| 4 | 76.625% (3065/4000) | 78.5% (3140/4000) |
| 5 | 78.9% (3156/4000)   | 79.025% (3161/4000) |
| 6 | 79.025% (3161/4000) | 78.225% (3129/4000) |
| 7 | 80.125% (3205/4000) | 78.4% (3136/4000) |

Overall, training with -n is much faster while accuracy stays comparable.
