<p>When you run <code>g++ -I. -l. ABC.cpp</code> you are asking the compiler to create an executable out of <code>ABC.cpp</code>. But the code in this file relies on a function defined in <code>XYZ.cpp</code>, so the executable cannot be created due to that missing function.</p> <p>You have two options (depending on what it is that you want to do). Either you give the compiler all of the source files at once so that it has all the definitions, e.g.</p> <pre><code> g++ -I. -l. ABC.cpp XYZ.cpp </code></pre> <p>or you use the <code>-c</code> option to compile <code>ABC.cpp</code> to object code (<code>.obj</code> on Windows, <code>.o</code> on Linux) which can be linked later, e.g.</p> <pre><code> g++ -I. -l. -c ABC.cpp </code></pre> <p>This will produce <code>ABC.o</code>, which can be linked later with <code>XYZ.o</code> to produce an executable.</p> <p><strong>Edit: What is the difference between #including and linking?</strong></p> <p>Understanding this fully requires understanding exactly what happens when you compile a C++ program, which unfortunately even many people who consider themselves to be C++ programmers do not. At a high level, the compilation of a C++ program goes through three stages: preprocessing, compilation, and linking.</p> <p><strong>Preprocessing</strong></p> <p>Every line that starts with <code>#</code> is a <em>preprocessor directive</em> which is evaluated at the preprocessing stage. The <code>#include</code> directive is <em>literally</em> a copy-and-paste. If you write <code>#include "XYZ.h"</code>, the preprocessor replaces that line with the entire contents of <code>XYZ.h</code> (including recursive evaluations of <code>#include</code> within <code>XYZ.h</code>). </p> <p>The purpose of including is to make declarations visible. 
In order to use the function <code>GetOneGaussianByBoxMuller</code>, the compiler needs to know that <code>GetOneGaussianByBoxMuller</code> is a function, what (if any) arguments it takes, and what value it returns. For that, it needs to see a declaration. Declarations go in header files, and header files are included to make declarations visible to the compiler before the point of use.</p> <p><strong>Compiling</strong></p> <p>This is the part where the compiler runs and turns your source code into machine code. Note that machine code is not the same thing as <em>executable</em> code. An executable requires additional information about how to load the machine code and the data into memory, and how to bring in external dynamic libraries if necessary. That's not done here. This is just the part where your code goes from C++ to raw machine instructions.</p> <p>Unlike Java, Python, and some other languages, C++ has traditionally had no concept of a "module" (C++20 added modules, but the header-based build described here does not use them). Instead, C++ works in terms of <em>translation units</em>. In nearly all cases, a translation unit corresponds to a single (non-header) source code file, e.g. <code>ABC.cpp</code> or <code>XYZ.cpp</code>. Each translation unit is compiled independently (whether you run separate <code>-c</code> commands for them, or you give them to the compiler all at once). </p> <p>When a source file is compiled, the preprocessor runs first, and does the <code>#include</code> copy-pasting as well as macros and other things that the preprocessor does. The result is one long stream of C++ code consisting of the contents of the source file and everything included by it (and everything included by what it included, etc.). This long stream of code is the translation unit.</p> <p>When the translation unit is compiled, every function and every variable used must be <em>declared</em>. 
The compiler will not allow you to call a function for which there is no declaration or to use a global variable for which there is no declaration, because then it wouldn't know the types, parameters, return values, etc., involved and could not generate sensible code. That's why you need headers -- keep in mind that at this point the compiler is not even remotely aware of the existence of any other source files; it is only considering this stream of code produced by the processing of the <code>#include</code> directives.</p> <p>In the machine code produced by the compiler, there are no such things as variable names or function names. Everything must become a memory address. Every global variable must be translated to a memory address where it is stored, and every function must have a memory address that the flow of execution jumps to when it is called. For things that are <em>defined</em> (i.e. for functions, <em>implemented</em>) in the translation unit, the compiler can assign an address. For things that are only <em>declared</em> (usually as a result of included headers) and not defined, the compiler does not at this point know what the memory address should be. These functions and global variables, for which the compiler has only a declaration but not a definition/implementation, are called <em>external symbols</em>, and they are presumed to exist in a different translation unit. For now, their memory addresses are represented with placeholders.</p> <p>For example, when compiling the translation unit corresponding to <code>ABC.cpp</code>, the compiler has a definition (implementation) of <code>ABC</code>, so it can assign an address to the function <code>ABC</code> and wherever in that translation unit <code>ABC</code> is called, it can create a jump instruction to that address. 
On the other hand, although its declaration is visible, <code>GetOneGaussianByBoxMuller</code> is not implemented in that translation unit, so its address must be represented with a placeholder.</p> <p>The result of compiling a translation unit is an <em>object file</em> (with the <code>.o</code> suffix on Linux).</p> <p><strong>Linking</strong></p> <p>One of the main jobs of the linker is to <em>resolve</em> external symbols. That is, the linker looks through a set of object files, sees what their external symbols are, and then tries to find out what memory address should be assigned to them, replacing the placeholder.</p> <p>In your case the function <code>GetOneGaussianByBoxMuller</code> is <em>defined</em> in the translation unit corresponding to <code>XYZ.cpp</code>, so inside <code>XYZ.o</code> it has been assigned a specific memory address. In the translation unit corresponding to <code>ABC.cpp</code>, it was only <em>declared</em>, so inside <code>ABC.o</code>, it is only a placeholder (external symbol). The linker, if given both <code>ABC.o</code> and <code>XYZ.o</code>, will see that <code>ABC.o</code> needs an address filled in for <code>GetOneGaussianByBoxMuller</code>, find that address in <code>XYZ.o</code>, and replace the placeholder in <code>ABC.o</code> with it. Addresses for external symbols can also be found in libraries. </p> <p>If the linker fails to find an address for <code>GetOneGaussianByBoxMuller</code> (as it does in your example where it is only working on <code>ABC.o</code>, as a result of not having passed <code>XYZ.cpp</code> to the compiler), it will report an unresolved external symbol error, also described as an <em>undefined reference</em>.</p> <p>Finally, once the linker has resolved all external symbols, it combines all of the now-placeholder-free object code, adds in all the loading information that the operating system needs, and produces an executable. 
Tada!</p> <p>Note that through all of this, the names of the files don't matter one bit. It's a <em>convention</em> that <code>XYZ.h</code> should contain declarations for things that are defined in <code>XYZ.cpp</code>, and it's good for maintainable code to organize things that way, but the compiler and linker don't care one bit whether that's true or not. The linker will look through <em>all</em> the object files it's given and <em>only</em> the object files it's given to try to resolve a symbol. It neither knows nor cares which header the declaration of the symbol was in, and it will not try to automatically pull in other object files or compile other source files in order to resolve a missing symbol.</p> <p>... wow, that was long.</p>
 
