StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
text
Body
copied!<p>You could try conducting some Bayesian analysis on the text:</p> <ol> <li>Load the list of names (and their frequencies) into your program. It might be worth tokenising the names at this point. So e.g. CloseWindow becomes Close and Window, with the frequency of both incremented. At this point it would also be useful to load in some non human function names to train the program in nagatives as well.</li> <li><p>Take a function name, and using the data you have just gathered find the probability of each part coming up </p> <p>P((HumanGenerated|Seeing the Token) = P(Seeing the Token|Human Generated) * P(Humangenerated)) / P(Seeing the Token)</p></li> </ol> <p>In this case the probability of something being human or computer generated would be decided based on known knowledge i.e. what percentage of function names are thought to be human generated.</p> <p>The probability of seeing the token ( P(Seeing the Token)) would have to gradually evolve. It would consist of the number of of times the token is seen in human functions and the number of times it is seen in computer functions...this solution is based on the premise that the program learns over time (and thus needs to be trained)</p> <p>The result, P((HumanGenerated|Seeing the Token) , will give you a probability of the function name being generated by a human.</p> <p><strong>NB: This is only a rough outline, many details are missing. If you are interested in this line of investigation that I would suggested reading up on probability theory and in particular Bayesian analysis</strong> </p>

Querying!

Guidance

An individual column

Larger individual text columns get their own page to allow for proper reading.

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload