Listing all potential algorithms you could use for this general task is close to impossible. Since you mentioned support vector machines (SVMs), I will try to elaborate a little on those.

SVM classifiers never really output an actual probability. The output of an SVM classifier is the distance of the test instance to the separating hyperplane in feature space (this is called the decision value). By default, the predicted label is selected based on the sign of this decision value.

Platt scaling basically fits a sigmoid on top of the SVM decision values to map them to the range [0, 1], so the result can be interpreted as a probability (a small sketch of this follows at the end of this answer). Similar techniques can be applied to any type of classifier that produces a real-valued output.

Some evident advantages of SVMs include:

- computationally efficient nonlinear classifiers (training scales roughly quadratically in the number of training instances),
- they can deal with high-dimensional data,
- they have shown very good performance in countless domains.

Downsides of SVMs include:

- data must be vectorized,
- models are relatively hard to interpret (compared to decision trees or logistic regression),
- dealing with nominal features can be clunky,
- missing values can be very hard to deal with.

When you are looking for proper probabilistic outputs (including confidence intervals), you may want to consider statistical methods such as logistic regression (kernelized versions exist too, but I suggest starting with the basics).
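To make the Platt-scaling point concrete, here is a minimal sketch using scikit-learn; the library choice, synthetic dataset, kernel, and parameter values are all illustrative assumptions on my part, not part of the question:

```python
# Sketch: raw SVM decision values vs. Platt-scaled probabilities.
# All data and parameter choices below are illustrative assumptions.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Plain SVM: decision_function returns signed distances to the separating
# hyperplane, not probabilities; the sign determines the predicted label.
svm = SVC(kernel="rbf").fit(X_train, y_train)
decision_values = svm.decision_function(X_test)

# Platt scaling: fit a sigmoid on the decision values (via cross-validation)
# so the output can be read as a probability in [0, 1].
platt = CalibratedClassifierCV(SVC(kernel="rbf"), method="sigmoid", cv=5)
platt.fit(X_train, y_train)
probabilities = platt.predict_proba(X_test)[:, 1]

print(decision_values[:3])  # unbounded real values, e.g. -1.7, 0.4, ...
print(probabilities[:3])    # values in [0, 1]
```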
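And for comparison, a sketch of logistic regression as a natively probabilistic model, here via statsmodels so you also get confidence intervals for the fitted coefficients (again on hypothetical synthetic data):

```python
# Sketch: logistic regression outputs probabilities directly, and a
# statistical fit comes with confidence intervals for the coefficients.
import statsmodels.api as sm
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)

print(result.predict(sm.add_constant(X))[:3])  # P(y=1 | x), directly
print(result.conf_int())                       # 95% CIs for the coefficients
```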