<p>You have mixed up the options that select a compilation phase (<code>-ptx</code> and <code>-cubin</code>) with the options that control which devices to target (<code>-code</code>), so you should revisit the documentation.</p>
<p>NVCC is the NVIDIA compiler driver. The <code>-ptx</code> and <code>-cubin</code> options select specific phases of compilation. By default, without any phase-specific option, nvcc will attempt to produce an executable from the inputs. Most people use the <code>-c</code> option to make nvcc produce an object file, which is later linked into an executable by the default platform linker. The <code>-ptx</code> and <code>-cubin</code> options are only really useful if you are using the Driver API. For more information on the intermediate stages, check out the nvcc manual, which is installed when you install the <a href="http://developer.nvidia.com/cuda-downloads" rel="noreferrer">CUDA Toolkit</a>.</p>
<ul>
<li>The output from <code>-ptx</code> is a plain-text PTX file. PTX is an intermediate assembly language for NVIDIA GPUs that has not yet been fully optimised and will later be assembled into device-specific code (different devices have different register counts, for example, so fully optimising at the PTX level would be wrong).</li>
<li>The output from <code>-cubin</code> is a device-specific binary (a cubin) built for a single target architecture. The fat binary, which may contain one or more device-specific binary images as well as (optionally) PTX, is what nvcc embeds in an object file or executable; the <code>-fatbin</code> option produces it as a standalone file.</li>
</ul>
<p>The <code>-code</code> argument you refer to has a different purpose entirely. I'd encourage you to check out the nvcc documentation, which contains several examples. In general I would advise using the <code>-gencode</code> option instead, since it allows more control and lets you target multiple devices in one binary. As a quick example:</p>
<ul>
<li><code>-gencode arch=compute_xx,code='compute_xx,sm_yy,sm_zz'</code> causes nvcc to target all devices with compute capability xx (that's the <code>arch=</code> bit) and to embed PTX (<code>code=compute_xx</code>) as well as device-specific binaries for sm_yy and sm_zz into the final fat binary.</li>
</ul>
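<p>To make the distinction concrete, here is a minimal dry-run sketch in shell. It only prints nvcc command lines rather than running them, so it works without the CUDA Toolkit installed. The file name <code>kernel.cu</code>, the helper <code>gencode_flag</code> and the architecture numbers (50, 52) are illustrative assumptions, not values taken from the question; substitute the ones that match your devices.</p>

```shell
#!/bin/sh
# Dry-run sketch: prints nvcc command lines instead of executing them.
# kernel.cu, gencode_flag and the architecture numbers are hypothetical
# placeholders chosen for illustration.

# Build one -gencode clause.
#   $1      virtual architecture (the xx in compute_xx)
#   $2...   real architectures (the yy, zz in sm_yy, sm_zz)
# PTX for compute_xx is always included in the code= list.
gencode_flag() {
    virt="$1"
    shift
    codes="compute_${virt}"
    for sm in "$@"; do
        codes="${codes},sm_${sm}"
    done
    printf '%s\n' "-gencode arch=compute_${virt},code='${codes}'"
}

SRC=kernel.cu

# Phase selection: each option stops compilation at a different stage.
echo "nvcc -ptx $SRC -o kernel.ptx"                  # plain-text PTX
echo "nvcc -cubin -arch=sm_50 $SRC -o kernel.cubin"  # device binary
echo "nvcc -c $SRC -o kernel.o"                      # host object file

# Device targeting: independent of the phase options above.
echo "nvcc $(gencode_flag 50 50 52) -c $SRC -o kernel.o"
```

<p>The last line combines a targeting option with a phase option (<code>-c</code>), which is the point of the answer: the two kinds of option are orthogonal and can be mixed freely.</p>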