Underlying technique of Android's FaceDetector
<p>I'm implementing a face tracker on Android and, as a literature study, would like to identify the underlying technique of Android's FaceDetector.</p>
<p>Simply put: I want to understand how the <code>android.media.FaceDetector</code> classifier works.</p>
<p>A brief Google search didn't yield anything informative, so I thought I'd take a look at the code.</p>
<p>Looking at the Java source code, <a href="https://android.git.kernel.org/?p=platform/frameworks/base.git;a=blob;f=media/java/android/media/FaceDetector.java" rel="noreferrer"><code>FaceDetector.java</code></a>, there isn't much to be learned: <code>FaceDetector</code> is simply a class that is given the image dimensions and the maximum number of faces, then returns an array of faces.</p>
<p>The Android source <a href="https://android.git.kernel.org/?p=platform/external/neven.git;a=tree" rel="noreferrer">contains the JNI code for this class</a>. I followed the function calls and, reduced to the bare essentials, learned the following:</p>
<ol>
<li>The "FaceFinder" is created in <a href="https://android.git.kernel.org/?p=platform/external/neven.git;a=blob;f=FaceRecEm/common/src/b_FDSDK/FaceFinder.c;h=b24ac111f98dd5a580849adb72c434a36479b135;hb=HEAD#l75" rel="noreferrer"><code>FaceFinder.c:75</code></a>.</li>
<li>On line 90, <code>bbs_MemSeg_alloc</code> returns a <code>btk_HFaceFinder</code> object (which contains the function to actually find faces), essentially copying to it the <code>hsdkA-&gt;contextE.memTblE.espArrE</code> array of the original <code>btk_HSDK</code> object, which is initialized within <code>initialize()</code> (<a href="https://android.git.kernel.org/?p=platform/external/neven.git;a=blob;f=FaceDetector_jni.cpp;h=03bd908bed10527a782b9d74770ffc9be2f4aeb4;hb=HEAD#l145" rel="noreferrer"><code>FaceDetector_jni.cpp:145</code></a>) by <code>btk_SDK_create()</code>.</li>
<li>It appears that a maze of functions pass each other pointers and instances of <code>btk_HSDK</code>, but nowhere can I find a concrete instantiation of <code>sdk-&gt;contextE.memTblE.espArrE[0]</code> that supposedly contains the magic.</li>
</ol>
<p>What I <em>have</em> discovered is a small clue: the JNI code references an FFTEm library that I can't find the source code for. By the looks of it, however, FFT stands for <em>Fast Fourier Transform</em>, which is probably used together with a pre-trained neural network. The only literature I can find that aligns with this theory is <a href="http://people.ee.ethz.ch/~bfasel/papers/avbpa_face.pdf" rel="noreferrer">a paper by Ben-Yacoub et al.</a></p>
<p>I don't even know whether I'm on the right track, so any suggestions at all would help.</p>
<p><strong>Edit:</strong> I've added a +100 bounty for anybody who can give any insight.</p>
 
