    <p>Is there any particular reason why you're not using Python (namely the scikit-learn and multiprocessing modules) to implement this? Using joblib, I've trained random forests on datasets of similar size in a fraction of the time it takes in R. Even without multiprocessing, random forests are significantly faster in Python. Here's a quick example of training an RF classifier and cross-validating it in Python. You can also easily extract feature importances and visualize the trees.</p>
<pre><code>import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             matthews_corrcoef, precision_score, recall_score)
from sklearn.model_selection import StratifiedKFold

# assuming that you have read in data with headers
# first column corresponds to the response variable
y = data[1:, 0].astype(float)
X = data[1:, 1:].astype(float)

# running totals across folds
cm = np.array([[0, 0], [0, 0]])
precision = np.array([])
accuracy = np.array([])
sensitivity = np.array([])
f1 = np.array([])
matthews = np.array([])

rf = RandomForestClassifier(n_estimators=100, max_features=5, n_jobs=2)

# divide dataset into 5 "folds", where classes are equally balanced in each fold
cv = StratifiedKFold(n_splits=5)
for train, test in cv.split(X, y):
    # fit on the training folds, predict on the held-out fold
    classes = rf.fit(X[train], y[train]).predict(X[test])
    precision = np.append(precision, precision_score(y[test], classes))
    accuracy = np.append(accuracy, accuracy_score(y[test], classes))
    sensitivity = np.append(sensitivity, recall_score(y[test], classes))
    f1 = np.append(f1, f1_score(y[test], classes))
    matthews = np.append(matthews, matthews_corrcoef(y[test], classes))
    cm = np.add(cm, confusion_matrix(y[test], classes))

print("Accuracy: %0.2f (+/- %0.2f)" % (accuracy.mean(), accuracy.std() * 2))
print("Precision: %0.2f (+/- %0.2f)" % (precision.mean(), precision.std() * 2))
print("Sensitivity: %0.2f (+/- %0.2f)" % (sensitivity.mean(), sensitivity.std() * 2))
print("F1: %0.2f (+/- %0.2f)" % (f1.mean(), f1.std() * 2))
print("Matthews: %0.2f (+/- %0.2f)" % (matthews.mean(), matthews.std() * 2))
print(cm)
</code></pre>
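<p>As a rough sketch of the feature-importance step mentioned above: the snippet below fits the same kind of forest and ranks the columns by their impurity-based importance. The data here is synthetic (generated with <code>make_classification</code>, an assumption standing in for the real dataset), and <code>random_state</code> is set only to make the illustration repeatable.</p>

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (an assumption, not the original dataset).
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=3, random_state=0)

rf = RandomForestClassifier(n_estimators=100, max_features=5,
                            n_jobs=2, random_state=0)
rf.fit(X, y)

# feature_importances_ holds one impurity-based score per column;
# the scores are normalized to sum to 1.
importances = rf.feature_importances_

# column indices ordered from most to least important
ranking = np.argsort(importances)[::-1]

for idx in ranking:
    print("feature %d: %0.3f" % (idx, importances[idx]))
```

<p>For visualizing, the individual fitted trees are available as <code>rf.estimators_</code>, and one of them can be passed to <code>sklearn.tree.plot_tree</code> (which requires matplotlib).</p>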