Looking for a C++ implementation of the C4.5 algorithm
<p>I've been looking for a C++ implementation of the <a href="http://en.wikipedia.org/wiki/C4.5_algorithm" rel="nofollow">C4.5 algorithm</a>, but I haven't been able to find one yet. I found Quinlan's <a href="http://www.rulequest.com/Personal/" rel="nofollow">C4.5 Release 8</a>, but it's written in C. Has anybody seen any open-source C++ implementations of the C4.5 algorithm?</p>
<p>I'm thinking about porting the <a href="http://weka.sourceforge.net/doc/weka/classifiers/trees/J48.html" rel="nofollow">J48 source code</a> (or simply writing a wrapper around the C version) if I can't find an open-source C++ implementation out there, but I hope I don't have to do that! Please let me know if you have come across a C++ implementation of the algorithm.</p>
<h2>Update</h2>
<p>I've been considering the option of writing a <strong>thin C++ wrapper</strong> around the C implementation of the C5.0 algorithm (<a href="http://www.rulequest.com/see5-info.html" rel="nofollow">C5.0 is the improved version of C4.5</a>). I downloaded and compiled the C implementation of the C5.0 algorithm, but it doesn't look like it's easily portable to C++. The C implementation uses a lot of global variables, so simply writing a thin C++ wrapper around the C functions will not result in an object-oriented design: every class instance would be modifying the same global parameters. In other words: <em>I will have no encapsulation, and that's a pretty basic thing that I need.</em></p>
<p>In order to get encapsulation I will need to make a full-blown port of the C code into C++, which is about as much work as porting the Java version (J48) into C++.</p>
<h2>Update 2.0</h2>
<p>Here are some specific requirements:</p>
<ol>
<li>Each classifier instance must encapsulate its own data (i.e. no global variables aside from constant ones).</li>
<li>Support the concurrent training of classifiers and the concurrent evaluation of the classifiers.</li>
</ol>
<p>Here is a good scenario: suppose I'm doing 10-fold cross-validation. I would like to concurrently train 10 decision trees, each on its respective slice of the training set. If I just run the C program for each slice, I would have to run 10 processes, which is not horrible. However, if I need to classify thousands of data samples in real time, then I would have to start a new process for each sample I want to classify, and that's not very efficient.</p>
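<p>To make the two requirements concrete, here is a rough sketch of the shape I have in mind. It is not C4.5: <code>StumpClassifier</code> is a hypothetical stand-in that learns a single threshold, and <code>Sample</code> and <code>trainFolds</code> are names I made up for illustration. The point is only that each instance owns its state and that one classifier per fold can be trained concurrently with <code>std::async</code>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// One labelled example: a single numeric feature and a binary class label.
struct Sample {
    double feature;
    int label;  // 0 or 1
};

// Hypothetical stand-in for a C4.5 tree: a one-split "stump".
// All learned state lives in the instance; there are no mutable globals.
class StumpClassifier {
public:
    // Pick the candidate threshold that misclassifies the fewest samples.
    void train(const std::vector<Sample>& data) {
        assert(!data.empty());
        std::size_t bestErrors = data.size() + 1;
        for (const Sample& candidate : data) {
            std::size_t errors = 0;
            for (const Sample& s : data) {
                int predicted = s.feature > candidate.feature ? 1 : 0;
                if (predicted != s.label) ++errors;
            }
            if (errors < bestErrors) {
                bestErrors = errors;
                threshold_ = candidate.feature;
            }
        }
    }

    // const, so many threads may classify with the same trained instance.
    int classify(double feature) const { return feature > threshold_ ? 1 : 0; }

private:
    double threshold_ = 0.0;
};

// Train k classifiers concurrently; classifier `fold` is trained on every
// sample except those whose index falls into that fold (index % k == fold).
std::vector<StumpClassifier> trainFolds(const std::vector<Sample>& data,
                                        std::size_t k) {
    std::vector<std::future<StumpClassifier>> jobs;
    for (std::size_t fold = 0; fold < k; ++fold) {
        jobs.push_back(std::async(std::launch::async, [&data, fold, k] {
            std::vector<Sample> slice;
            for (std::size_t i = 0; i < data.size(); ++i)
                if (i % k != fold) slice.push_back(data[i]);
            StumpClassifier classifier;  // private state, owned by this task
            classifier.train(slice);
            return classifier;
        }));
    }
    std::vector<StumpClassifier> trained;
    for (auto& job : jobs) trained.push_back(job.get());  // wait for each task
    return trained;
}
```

<p>Calling <code>trainFolds(data, 10)</code> would give the ten trained classifiers for the cross-validation scenario, and because <code>classify</code> is <code>const</code>, real-time classification could happen in-process instead of spawning a process per sample. A thin wrapper around the C code cannot offer this as long as the underlying functions share global state.</p>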