The following section is experimental. The steps below worked for compiling Auryn in native mode for Intel Xeon Phi cards.
Requirements:
First, use the newest Auryn development version, which no longer compiles against GSL. This saves you the extra trouble of compiling GSL for the Xeon Phi.
Second, you need the Intel Composer suite.
Third, you need the Boost libraries (see required libraries) compiled for the Intel MIC architecture, ideally using the Intel MPI implementation, which runs natively on both Xeon and Xeon Phi.
This has been tested for the development version available on GitHub as commit 5b02d058844ea9b6344cc4e94491c10cf5ea0522.
For the most part you can follow one of Boost's tutorials for the compilation: http://www.boost.org/doc/libs/1_55_0/doc/html/mpi/getting_started.html#mpi.config. In particular, I set user-config.jam to contain the line
using mpi : /opt/intel/impi/4.1.3.048/bin/mpigxx -cxx=/opt/intel/bin/icpc -static_mpi : -L/opt/intel/impi/4.1.3.048/lib ;
to enable MPI and to tell the build toolchain Jam which compiler to use. Note that you might have to adapt the version numbers in this example to your actual version.
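To make the version numbers easier to adapt, the configuration line can be generated from variables. The following is a minimal sketch, not part of Auryn or Boost: the function name write_mpi_user_config and its three arguments are hypothetical, and the paths in the usage line are the ones from the example above. It appends to the target file rather than overwriting it.

```shell
# Sketch: append the "using mpi" line for a given Intel MPI install.
# All three arguments are placeholders to adapt to your installation.
write_mpi_user_config() {
    impi_root="$1"   # Intel MPI root, e.g. /opt/intel/impi/4.1.3.048
    icpc="$2"        # Intel C++ compiler, e.g. /opt/intel/bin/icpc
    outfile="$3"     # target file, e.g. $HOME/user-config.jam
    # Same line as in the example above, with the paths substituted.
    printf 'using mpi : %s/bin/mpigxx -cxx=%s -static_mpi : -L%s/lib ;\n' \
        "$impi_root" "$icpc" "$impi_root" >> "$outfile"
}
```

Usage, assuming user-config.jam lives in your home directory as in the Boost tutorial: write_mpi_user_config /opt/intel/impi/4.1.3.048 /opt/intel/bin/icpc "$HOME/user-config.jam"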
Boost was then built with a command line similar to the following. It is important to pass the -mmic arguments to tell the Intel compiler that it is cross-compiling for the Xeon Phi:
./b2 toolset=intel --disable-icu --without-iostreams cflags="-mmic" cxxflags="-mmic -I/opt/intel/impi/4.1.3.048/mic/include/" linkflags="-mmic" --debug-configuration
To run smoothly on the architecture and to make efficient use of vectorization, you need to set the following switches in auryn_definitions.h in the src/ directory of Auryn. In particular, you have to deactivate prefetching and activate SIMD instructions with Intel Cilk Plus support.
To do so, find the following preprocessor directives in auryn_definitions.h and comment or uncomment them accordingly:
// #define CODE_ACTIVATE_PREFETCHING_INTRINSICS
#define CODE_USE_SIMD_INSTRUCTIONS_EXPLICITLY
#define CODE_ACTIVATE_CILK_INSTRUCTIONS
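If you prefer not to edit the header by hand, the three switches can be toggled from the shell. This is a rough sketch under the assumption that each directive sits on its own line, either active or commented out with //; the function name configure_phi_switches is hypothetical and not part of Auryn. Check the result against the header before compiling.

```shell
# Sketch: comment out prefetching, enable explicit SIMD and Cilk switches.
# Assumes one directive per line in the given header file.
configure_phi_switches() {
    file="$1"
    tmp="$(mktemp)" || return 1
    sed -e 's|^[[:space:]]*#define CODE_ACTIVATE_PREFETCHING_INTRINSICS|// #define CODE_ACTIVATE_PREFETCHING_INTRINSICS|' \
        -e 's|^[[:space:]]*//[[:space:]]*#define CODE_USE_SIMD_INSTRUCTIONS_EXPLICITLY|#define CODE_USE_SIMD_INSTRUCTIONS_EXPLICITLY|' \
        -e 's|^[[:space:]]*//[[:space:]]*#define CODE_ACTIVATE_CILK_INSTRUCTIONS|#define CODE_ACTIVATE_CILK_INSTRUCTIONS|' \
        "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Usage from the Auryn source tree: configure_phi_switches src/auryn_definitions.h. Lines that are already in the desired state are left unchanged, so the function can be run repeatedly.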
With this done, create a new subdirectory in the Auryn build directory and add the following Makefile:
.SECONDARY:

CC = /opt/intel/impi/4.1.3.048/bin/mpicxx -cxx=/opt/intel/bin/icc
CXX = /opt/intel/impi/4.1.3.048/bin/mpicxx -cxx=/opt/intel/bin/icc
CFLAGS= -ansi -Wall -pipe -O3 -mmic -static_mpi -static-intel -vec-report2 -pedantic -I$(SRCDIR) -I$(DEVDIR)
LDFLAGS= -L/opt/intel/composer_xe_2013_sp1.2.144/compiler/lib/mic/ -lboost_program_options -lboost_serialization -lboost_mpi -limf -lsvml -lirng -lintlc

include ../Makefile.include
Again, the path variables will probably need to be adapted to your setup. Furthermore, depending on where you installed the Boost-for-MIC libraries, you might have to add an include flag such as -I/lcncluster/zenke/lib/build/boost_1_55_0/ to CFLAGS and, similarly, a linker flag such as -L/yourpath/boost_1_55_0/stage/lib to LDFLAGS.
Now you are almost set for compilation. However, before you compile, icc generally requires you to update your LD_LIBRARY_PATH and MIC_LD_LIBRARY_PATH. This is most conveniently achieved by
source /opt/intel/impi/4.1.3.048/bin/mpivars.sh
source /opt/intel/composer_xe_2013_sp1.2.144/bin/compilervars_arch.sh mic
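As a quick sanity check after sourcing the scripts, you can verify that both variables are actually set in the current shell. The helper below is only an illustrative sketch; the name check_mic_env is hypothetical and the check only tests that the variables are non-empty, not that their contents are correct.

```shell
# Sketch: warn if either library path variable is empty or unset.
# Returns 0 when both are set, 1 otherwise.
check_mic_env() {
    missing=0
    for name in LD_LIBRARY_PATH MIC_LD_LIBRARY_PATH; do
        # Indirect lookup of the variable named in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "warning: $name is empty or unset" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_mic_env before the build prints a warning for each variable that still needs to be set.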
Invoking make will now build the simulation binaries.
To run a simulation (e.g. sim_background) in native mode on a Xeon Phi card, you need to copy the binary and the required shared libraries to the card. This essentially follows the steps outlined in this Intel tutorial: https://software.intel.com/en-us/articles/using-the-intel-mpi-library-on-intel-xeon-phi-coprocessor-systems.
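The copy step could be scripted roughly as follows. This is a sketch under several assumptions: the card is reachable over scp under a host name such as mic0, /tmp on the card is a suitable destination, and the function name stage_to_card and the DRY_RUN variable are hypothetical. Which shared libraries you need to copy depends on your build, so the file list is left to you.

```shell
# Sketch: copy a list of files (binary and libraries) to the card via scp.
# Set DRY_RUN to any non-empty value to print the commands instead.
stage_to_card() {
    host="$1"   # card host name, e.g. mic0 (assumption)
    dest="$2"   # destination directory on the card, e.g. /tmp
    shift 2
    for f in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "scp $f $host:$dest/"
        else
            scp "$f" "$host:$dest/" || return 1
        fi
    done
}
```

For example, DRY_RUN=1 stage_to_card mic0 /tmp sim_background shows the command that would be run, without copying anything.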
Finally the code can be run on the coprocessor card:
[xyz@srv1-mic0 tmp]# mpirun -n 60 ./sim_background --dir /tmp --simtime 10
Enjoy!