I'm detecting outliers in a dataset using a one-class SVM with scikit-learn. I will try to explain my problem with an example:
Imagine a simple dataset with height and performance as the features (this is just a simplification; my dataset is huge). For example:
h - height
p - performance

h    p    class
----------------
10   0.1   1
12   0.5   1
20   3.2   1
24   2.9   1
23   0.4  -1
...

For example, when there is a strange combination of the two attributes, I should detect the outlier.
I scale the dataset, then train the model and detect outliers using this configuration:
from sklearn import svm
clf = svm.OneClassSVM(kernel="rbf", nu=0.01, gamma=0.01)
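For context, the scale-then-fit setup described here could look roughly like the sketch below. It uses only the toy values from the example, and it assumes `StandardScaler` for the scaling step, since the exact scaler is not stated; the real pipeline may differ.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import StandardScaler

# Toy rows from the example: [height, performance]
X = np.array([
    [10, 0.1],
    [12, 0.5],
    [20, 3.2],
    [24, 2.9],
    [23, 0.4],
])

# Assumption: standardization (zero mean, unit variance) is the scaling step
X_scaled = StandardScaler().fit_transform(X)

# Same configuration as in the question
clf = svm.OneClassSVM(kernel="rbf", nu=0.01, gamma=0.01)
clf.fit(X_scaled)

# predict returns 1 for inliers and -1 for outliers
labels = clf.predict(X_scaled)
print(labels)
```

With only five rows the labels are not meaningful; the point is just the order of operations (scale, fit, predict).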
This works fine when my dataset is balanced. However, when the dataset is skewed in the following way, I have a problem:
- Performance is around 3 or 4 for almost all heights, but around 0 for heights around 10. Then, instead of flagging values such as height 30 with performance 0.2, the one-class SVM flags the samples with heights around 10 as outliers.
If I split the data into two groups and train two SVMs (heights around 10 in one dataset and everything else in another), the problem is solved.
Is there a way to solve this without splitting the dataset?
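To make the workaround concrete, here is a minimal sketch of the split-and-train-twice approach. The helper names `fit_predict_group` and `split_and_detect`, the height cutoff of 15, the `StandardScaler` step, and the synthetic data are all assumptions made up for illustration; they are not taken from the real dataset.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import StandardScaler


def fit_predict_group(X_group):
    """Scale one group on its own and return labels (1 inlier, -1 outlier)."""
    X_scaled = StandardScaler().fit_transform(X_group)
    clf = svm.OneClassSVM(kernel="rbf", nu=0.01, gamma=0.01)
    return clf.fit(X_scaled).predict(X_scaled)


def split_and_detect(X, height_threshold=15.0):
    """Train one model per height group (hypothetical cutoff) and merge labels."""
    labels = np.empty(len(X), dtype=int)
    low = X[:, 0] < height_threshold
    for mask in (low, ~low):
        if mask.any():
            labels[mask] = fit_predict_group(X[mask])
    return labels


# Synthetic stand-in data: a small group near height 10 with low performance,
# and a large group at greater heights with performance around 3 to 4
rng = np.random.default_rng(0)
small = np.column_stack([rng.normal(10, 1, 20), rng.normal(0.2, 0.1, 20)])
large = np.column_stack([rng.normal(25, 3, 200), rng.normal(3.5, 0.3, 200)])
X = np.vstack([small, large])

labels = split_and_detect(X)
print((labels == -1).sum(), "points flagged as outliers")
```

Each group gets its own scaler and its own model, so the small group near height 10 is judged only against itself rather than against the majority.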