nixpkgs/doc/cross-compilation.xml
John Ericson 5eaea6cee0 cross stdenv: let build package's build deps resolve to native packages
This fixes the "sliding window" principle:
  0. Run packages:       build = native;  host = foreign; target = foreign;
  1. Build packages:     build = native;  host = native;  target = foreign;
  2. Vanilla packages:   build = native;  host = native;  target = native;
  3. Vanilla packages:   build = native;  host = native;  target = native;
  n+3. ...

Each stage's build dependencies are resolved against the previous stage,
and the "foreigns" are shifted accordingly. Vanilla packages alone are
built against themsevles, since there are no more "foreign"s to shift away.

Before, build packages' build dependencies were resolved against
themselves:
  0. Run packages:       build = native;  host = foreign; target = foreign;
  1. Build packages:     build = native;  host = native;  target = foreign;
  2. Build packages:     build = native;  host = native;  target = foreign;
  n+2. ...

This is wrong because that principle is violated by the target
platform staying foreign.

This will change the hashes of many build packages and run packages, but
that is OK. This is an unavoidable cost of fixing cross compiling.

The cross compilation docs have been updated to reflect this fix.
2017-02-05 12:01:53 -05:00

155 lines
11 KiB
XML

<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xml:id="chap-cross">
<title>Cross-compilation</title>
<section xml:id="sec-cross-intro">
<title>Introduction</title>
<para>
"Cross-compilation" means compiling a program on one machine for another type of machine.
For example, a typical use of cross compilation is to compile programs for embedded devices.
These devices often don't have the computing power and memory to compile their own programs.
One might think that cross-compilation is a fairly niche concern, but there are advantages to being rigorous about distinguishing build-time vs run-time environments even when one is developing and deploying on the same machine.
Nixpkgs is increasingly adopting this opinion in that packages should be written with cross-compilation in mind, and nixpkgs should evaluate in a similar way (by minimizing cross-compilation-specific special cases) whether or not one is cross-compiling.
</para>
<para>
This chapter will be organized in three parts.
First, it will describe the basics of how to package software in a way that supports cross-compilation.
Second, it will describe how to use Nixpkgs when cross-compiling.
Third, it will describe the internal infrastructure supporting cross-compilation.
</para>
</section>
<!--============================================================-->
<section xml:id="sec-cross-packaging">
<title>Packing in a cross-friendly manner</title>
<section>
<title>Platform parameters</title>
<para>
The three GNU Autoconf platforms, <wordasword>build</wordasword>, <wordasword>host</wordasword>, and <wordasword>cross</wordasword>, are historically the result of much confusion.
<link xlink:href="https://gcc.gnu.org/onlinedocs/gccint/Configure-Terms.html" /> clears this up somewhat but there is more to be said.
An important advice to get out the way is, unless you are packaging a compiler or other build tool, just worry about the build and host platforms.
Dealing with just two platforms usually better matches people's preconceptions, and in this case is completely correct.
</para>
<para>
In Nixpkgs, these three platforms are defined as attribute sets under the names <literal>buildPlatform</literal>, <literal>hostPlatform</literal>, and <literal>targetPlatform</literal>.
All are guaranteed to contain at least a <varname>platform</varname> field, which contains detailed information on the platform.
All three are always defined at the top level, so one can get at them just like a dependency in a function that is imported with <literal>callPackage</literal>:
<programlisting>{ stdenv, buildPlatform, hostPlatform, fooDep, barDep, .. }: ...</programlisting>
</para>
<warning><para>
These platforms should all have the same structure in all scenarios, but that is currently not the case.
When not cross-compiling, they will each contain a <literal>system</literal> field with a short 2-part, hyphen-separated summering string name for the platform.
But, when when cross compiling, <literal>hostPlatform</literal> and <literal>targetPlatform</literal> may instead contain <literal>config</literal> with a fuller 3- or 4-part string in the manner of LLVM.
We should have all 3 platforms always contain both, and maybe give <literal>config</literal> a better name while we are at it.
</para></warning>
<variablelist>
<varlistentry>
<term><varname>buildPlatform</varname></term>
<listitem><para>
The "build platform" is the platform on which a package is built.
Once someone has a built package, or pre-built binary package, the build platform should not matter and be safe to ignore.
</para></listitem>
</varlistentry>
<varlistentry>
<term><varname>hostPlatform</varname></term>
<listitem><para>
The "host platform" is the platform on which a package is run.
This is the simplest platform to understand, but also the one with the worst name.
</para></listitem>
</varlistentry>
<varlistentry>
<term><varname>targetPlatform</varname></term>
<listitem>
<para>
The "target platform" is black sheep.
The other two intrinsically apply to all compiled software—or any build process with a notion of "build-time" followed by "run-time".
The target platform only applies to programming tools, and even then only is a good for for some of them.
Briefly, GCC, Binutils, GHC, and certain other tools are written in such a way such that a single build can only compiler code for a single platform.
Thus, when building them, one must think ahead about what platforms they wish to use the tool to produce machine code for, and build binaries for each.
</para>
<para>
There is no fundamental need to think about the target ahead of time like this.
LLVM, for example, was designed from the beginning with cross-compilation in mind, and so a normal LLVM binary will support every architecture that LLVM supports.
If the tool supports modular or pluggable backends, one might imagine specifying a <emphasis>set</emphasis> of target platforms / backends one wishes to support, rather than a single one.
</para>
<para>
The biggest reason for mess, if there is one, is that many compilers have the bad habit a build process that builds the compiler and standard library/runtime together.
Then the specifying target platform is essential, because it determines the host platform of the standard library/runtime.
Nixpkgs tries to avoid this where possible too, but still, because the concept of a target platform is so ingrained now in Autoconf and other tools, it is best to support it as is.
Tools like LLVM that don't need up-front target platforms can safely ignore it like normal packages, and it will do no harm.
</para>
</listitem>
</varlistentry>
</variablelist>
<note><para>
If you dig around nixpkgs, you may notice there is also <varname>stdenv.cross</varname>.
This field defined as <varname>hostPlatform</varname> when the host and build platforms differ, but otherwise not defined at all.
This field is obsolete and will soon disappear—please do not use it.
</para></note>
</section>
<section>
<title>Specifying Dependencies</title>
<para>
As mentioned in the introduction to this chapter, one can think about a build time vs run time distinction whether cross-compiling or not.
In the case of cross-compilation, this corresponds with whether a derivation running on the native or foreign platform is produced.
An interesting thing to think about is how this corresponds with the three Autoconf platforms.
In the run-time case, the depending and depended-on package simply have matching build, host, and target platforms.
But in the build-time case, one can imagine "sliding" the platforms one over.
The depended-on package's host and target platforms (respectively) become the depending package's build and host platforms.
This is the most important guiding principle behind cross-compilation with Nixpkgs, and will be called the <wordasword>sliding window principle</wordasword>.
In this manner, given the 3 platforms for one package, we can determine the three platforms for all its transitive dependencies.
</para>
<para>
Some examples will probably make this clearer.
If a package is being built with a <literal>(build, host, target)</literal> platform triple of <literal>(foo, bar, bar)</literal>, then its build-time dependencies would have a triple of <literal>(foo, foo, bar)</literal>, and <emphasis>those packages'</emphasis> build-time dependencies would have triple of <literal>(foo, foo, foo)</literal>.
In other words, it should take two "rounds" of following build-time dependency edges before one reaches a fixed point where, by the sliding window principle, the platform triple no longer changes.
Indeed, this happens with cross compilation, where only rounds of native dependencies starting with the second necessarily coincide with native packages.
</para>
<note><para>
The depending package's target platform is unconstrained by the sliding window principle, which makes sense in that one can in principle build cross compilers targeting arbitrary platforms.
</para></note>
<para>
How does this work in practice? Nixpkgs is now structured so that build-time dependencies are taken from from <varname>buildPackages</varname>, whereas run-time dependencies are taken from the top level attribute set.
For example, <varname>buildPackages.gcc</varname> should be used at build time, while <varname>gcc</varname> should be used at run time.
Now, for most of Nixpkgs's history, there was no <varname>buildPackages</varname>, and most packages have not been refactored to use it explicitly.
Instead, one can use the four attributes used for specifying dependencies as documented in <link linkend="ssec-stdenv-attributes" />.
We "splice" together the run-time and build-time package sets with <varname>callPackage</varname>, and then <varname>mkDerivation</varname> for each of four attributes pulls the right derivation out.
This splicing can be skipped when not cross compiling as the package sets are the same, but is a bit slow for cross compiling.
Because of this, a best-of-both-worlds solution is in the works with no splicing or explicit access of <varname>buildPackages</varname> needed.
For now, feel free to use either method.
</para>
</section>
</section>
<!--============================================================-->
<section xml:id="sec-cross-usage">
<title>Cross-building packages</title>
<para>
To be written.
This is basically unchanged so see the old wiki for now.
</para>
</section>
<!--============================================================-->
<section xml:id="sec-cross-infra">
<title>Cross-compilation infrastructure</title>
<para>To be written.</para>
<note><para>
If one explores nixpkgs, they will see derivations with names like <literal>gccCross</literal>.
Such <literal>*Cross</literal> derivations is a holdover from before we properly distinguished between the host and target platforms
—the derivation with "Cross" in the name covered the <literal>build = host != target</literal> case, while the other covered the <literal>host = target</literal>, with build platform the same or not based on whether one was using its <literal>.nativeDrv</literal> or <literal>.crossDrv</literal>.
This ugliness will disappear soon.
</para></note>
</section>
</chapter>