Channel: Clusters and HPC Technology

Join the Intel® Parallel Studio XE 2018 Beta program


We would like to invite you to participate in the Intel® Parallel Studio XE 2018 Beta program. In this beta test, you will gain early access to new features and analysis techniques. Try them out, tell us what you love and what to improve, so we can make our products better for you. 

Registration is easy. Complete the pre-beta survey, register, and download the beta software:

Intel® Parallel Studio XE 2018 Pre-Beta survey

The 2018 version brings together exciting new technologies along with improvements to Intel’s existing software development tools:

Modernize Code for Performance, Portability and Scalability on the Latest Intel® Platforms

  • Use fast Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions on Intel® Xeon® and Intel® Xeon® Phi™ processors and coprocessors
  • Intel® Advisor - Roofline finds high-impact but under-optimized loops
  • Intel® Distribution for Python* - Faster Python* applications
  • Stay up-to-date with the latest standards and IDEs:
    • C++ 2017 draft: parallelize and vectorize C++ easily using Parallel STL*
    • Full Fortran* 2008, Fortran 2015 draft
    • OpenMP* 5.0 draft, Microsoft Visual Studio* 2017
  • Accelerate MPI applications with Intel® Omni-Path Architecture

Flexibility for Your Needs

  • Application Snapshot - Quick answers:  Does my hybrid code need optimization?
  • Intel® VTune™ Amplifier – Profile private clouds with Docker* and Mesos* containers, Java* daemons

And much more…

For more details about this beta program, a FAQ, and What’s New, visit: Intel® Parallel Studio XE 2018 Beta page.

As a highly valued customer and beta tester, you are welcome to share your feedback with our development teams via this program at our Online Service Center.


When I_MPI_FABRICS=shm, the MPI_Bcast message size can't be larger than 64 KB


I run MPI on a single workstation (2 x E5-2690).

When I export I_MPI_FABRICS=shm, MPI_Bcast fails for message sizes larger than 64 KB.

But when I export I_MPI_FABRICS={shm,tcp}, everything is OK.

Is there some limit for shm? Can I adjust the limit?
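
For reference, a minimal reproducer along the lines described above could look like the sketch below (the 2 x E5-2690 workstation and the 64 KB threshold come from the report; the size sweep and process count are assumptions for illustration):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Sweep message sizes across the reported 64 KB boundary. */
  for (size_t bytes = 1024; bytes <= 1024 * 1024; bytes *= 2) {
    char *buf = malloc(bytes);
    if (rank == 0) memset(buf, 1, bytes);
    MPI_Bcast(buf, (int)bytes, MPI_CHAR, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("MPI_Bcast of %zu bytes completed\n", bytes);
    free(buf);
  }

  MPI_Finalize();
  return 0;
}

Compiled with mpiicc and launched once with each of the two I_MPI_FABRICS settings above, this should show whether the failure really starts just above 64 KB.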

Thread Topic: Question

IMB and --input, 2017.0.2


Hi,

   I've been running IMB with the following command


mpirun -n $i -genv I_MPI_DEBUG 5 -machinefile ./nodefile src/IMB-MPI1 -npmin $i -input IMB_SELECT_MPI1 -msglen ./msglens

where $i is set to 2, 4, etc., and the file IMB_SELECT_MPI1 reads

--BOF--
#
Alltoall
#
Allgather
#
Allreduce
#
Sendrecv
#
Exchange
#
Uniband
#
Biband
--EOF--

What happens is that Biband gets run 7 times (once for each targeted test), rather than the different tests as desired:


# List of Benchmarks to run:

# Biband
# Biband
# Biband
# Biband
# Biband
# Biband
# Biband

#---------------------------------------------------
# Benchmarking Biband
# #processes = 4
#---------------------------------------------------
       #bytes #repetitions   Mbytes/sec      Msg/sec
            0         1000         0.00      7199701
        65536          640     13233.43       211735
       524288           80     19708.07        39416
      4194304           10     22623.73         5656

#---------------------------------------------------
# Benchmarking Biband
# #processes = 4

...

Any idea why this is happening? Full stdout attached (there are some bogus runs there, but focus on the Biband output).

Thanks; Chris

Attachment: IMB.out_.txt (215.96 KB)

Thread Topic: Bug Report

PSM2_MQ_RECVREQS_MAX limit reached


Hi,

one of our users reported a problem with MPI_Gatherv in Intel MPI 2017.

The problem is related to the maximum number of irecv requests in flight.

To reproduce the problem we set up a test case and run it (the code used is shown below) with 72 MPI tasks on two nodes, each one containing 2x Broadwell processors (18 cores per socket). The inter-node communication fabric is Omni-Path.

At runtime the program crashes returning the following error message:

Exhausted 1048576 MQ irecv request descriptors, which
usually indicates a user program error or insufficient request
descriptors (PSM2_MQ_RECVREQS_MAX=1048576)

Setting the variable PSM2_MQ_RECVREQS_MAX to a higher value seems to solve the problem.

Also, putting an MPI barrier after the gatherv call solves the problem, although with the side effect of forcing task synchronization.

Two questions now arise:

1. Are there any known side effects of setting PSM2_MQ_RECVREQS_MAX to a very large value?
Can that affect the resource requirements of my program, such as memory?

2. Alternatively, is there a more robust way to limit the maximum number of irecv requests in flight, so as not to cause the program to fault?

Best Regards,
 

Stefano

 

Here is the code:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  int size, rank;
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int iterations = 100000;

  int send_buf[1] = {rank};

  int *recv_buf = NULL;
  int *recvcounts = NULL;
  int *displs = NULL;

  int recv_buf_size = size;
  if (rank == 0) {
    recv_buf = calloc(recv_buf_size, sizeof(*recv_buf));
    for (int i = 0; i < recv_buf_size; i++) {
      recv_buf[i] = -1;
    }
    recvcounts = calloc(size, sizeof(*recvcounts));
    displs = calloc(size, sizeof(*displs));
    for (int i = 0; i < size; i++) {
      recvcounts[i] = 1;
      displs[i] = i;
    }
  }
  int ten_percent = iterations / 10;
  int progress = 0;
  MPI_Barrier(MPI_COMM_WORLD);
  for (int i = 0; i < iterations; i++) {
    if (i >= progress) {
      if (rank == 0) printf("Starting iteration %d\n", i);
      progress += ten_percent;
    }
    MPI_Gatherv(send_buf, 1, MPI_INT, recv_buf, recvcounts, displs, MPI_INT, 0, MPI_COMM_WORLD);
  }
  if (rank == 0) {
    for (int i = 0; i < recv_buf_size; i++) {
      assert(recv_buf[i] == i);
    }
  }

  free(recv_buf);
  free(recvcounts);
  free(displs);

  MPI_Finalize();
}
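
Regarding question 2, a lightly intrusive variant of the barrier workaround (a sketch only; the interval of 512 is an arbitrary assumed value, not a recommendation) is to synchronize every few hundred iterations, which bounds the number of outstanding receive descriptors on the root without serializing every Gatherv call:

#include "mpi.h"
#include <stdlib.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  int size, rank;
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  int send_buf[1] = {rank};
  int *recv_buf = NULL, *recvcounts = NULL, *displs = NULL;
  if (rank == 0) {
    recv_buf   = calloc(size, sizeof(*recv_buf));
    recvcounts = calloc(size, sizeof(*recvcounts));
    displs     = calloc(size, sizeof(*displs));
    for (int i = 0; i < size; i++) { recvcounts[i] = 1; displs[i] = i; }
  }

  int iterations = 100000;
  int sync_interval = 512;  /* assumed value; tune for the workload */
  for (int i = 0; i < iterations; i++) {
    MPI_Gatherv(send_buf, 1, MPI_INT, recv_buf, recvcounts, displs,
                MPI_INT, 0, MPI_COMM_WORLD);
    if ((i + 1) % sync_interval == 0)
      MPI_Barrier(MPI_COMM_WORLD);  /* drain outstanding irecv descriptors */
  }

  free(recv_buf); free(recvcounts); free(displs);
  MPI_Finalize();
  return 0;
}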

 

Thread Topic: Question

“make check” fails when compiling parallel HDF5 with Intel compilers!


Greetings,

I have a problem with "make check" when compiling parallel HDF5 using Intel compilers on CentOS 6.5 64-bit. I follow exactly the procedure described on this page but still get this error:

Testing  t_pflush1
============================
 t_pflush1  Test Log
============================
Testing H5Fflush (part1)                                              *** Hint ***
You can use environment variable HDF5_PARAPREFIX to run parallel test files in a
different directory or to add file type prefix. E.g.,
   HDF5_PARAPREFIX=pfs:/PFS/user/me
   export HDF5_PARAPREFIX
*** End of Hint ***
rank 5 in job 8  iwf3_56776   caused collective abort of all ranks
  exit status of rank 5: return code 0
rank 4 in job 8  iwf3_56776   caused collective abort of all ranks
  exit status of rank 4: return code 0
0.04user 0.01system 0:00.31elapsed 19%CPU (0avgtext+0avgdata 47024maxresident)k
0inputs+0outputs (0major+7458minor)pagefaults 0swaps
make[4]: *** [t_pflush1.chkexe_] Error 1
make[4]: Leaving directory `/home/iwf/hdf5/hdf5-1.8.18/testpar'
make[3]: *** [build-check-p] Error 1
make[3]: Leaving directory `/home/iwf/hdf5/hdf5-1.8.18/testpar'
make[2]: *** [test] Error 2
make[2]: Leaving directory `/home/iwf/hdf5/hdf5-1.8.18/testpar'
make[1]: *** [check-am] Error 2
make[1]: Leaving directory `/home/iwf/hdf5/hdf5-1.8.18/testpar'
make: *** [check-recursive] Error 1

Any idea?

My configuration summary:

SUMMARY OF THE HDF5 CONFIGURATION
        =================================

General Information:
-------------------
           HDF5 Version: 1.8.18
          Configured on: Wed Apr 19 14:41:23 +0430 2017
          Configured by: iwf@iwf3
         Configure mode: production
            Host system: x86_64-unknown-linux-gnu
          Uname information: Linux iwf3 2.6.32-504.23.4.el6.x86_64 #1 SMP Tue Jun 9 20:57:37 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
               Byte sex: little-endian
              Libraries: static, shared
         Installation point: /home/iwf/hdf5/1.8.18

Compiling Options:
------------------
               Compilation Mode: production
                     C Compiler: /opt/intel//impi/5.0.1.035/intel64/bin/mpicc ( built with gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC))
                         CFLAGS:
                      H5_CFLAGS: -std=c99 -pedantic -Wall -Wextra -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -Wno-long-long -Wfloat-equal -Wmissing-format-attribute -Wmissing-noreturn -Wpacked -Wdisabled-optimization -Wformat=2 -Wunreachable-code -Wendif-labels -Wdeclaration-after-statement -Wold-style-definition -Winvalid-pch -Wvariadic-macros -Wnonnull -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wunused-macros -Wunsafe-loop-optimizations -Wc++-compat -Wstrict-overflow -Wlogical-op -Wlarger-than=2048 -Wvla -Wsync-nand -Wframe-larger-than=16384 -Wpacked-bitfield-compat -O3
                      AM_CFLAGS:
                       CPPFLAGS:
                    H5_CPPFLAGS: -D_GNU_SOURCE -D_POSIX_C_SOURCE=200112L   -DNDEBUG -UH5_DEBUG_API
                    AM_CPPFLAGS:  -I/home/iwf/zlib/1.2.11//include
               Shared C Library: yes
               Static C Library: yes
  Statically Linked Executables: no
                        LDFLAGS:
                     H5_LDFLAGS:
                     AM_LDFLAGS:  -L/home/iwf/zlib/1.2.11//lib
        Extra libraries: -lz -ldl -lm
               Archiver: ar
             Ranlib: ranlib
          Debugged Packages:
 API Tracing: no

Languages:
----------
                        Fortran: no

                            C++: no

Features:
---------
                  Parallel HDF5: yes
             High Level library: yes
                   Threadsafety: no
            Default API Mapping: v18
 With Deprecated Public Symbols: yes
         I/O filters (external): deflate(zlib)
                            MPE:
                     Direct VFD: no
                        dmalloc: no
Clear file buffers before write: yes
           Using memory checker: no
         Function Stack Tracing: no
      Strict File Format Checks: no
   Optimization Instrumentation: no

Thread Topic: Help Me

Executing a BAT script with whitespace in its path + passing arguments with whitespace


Dear All, 

We recently switched from MPICH2 to Intel MPI. I have problems starting a BAT script when the path to the script and the argument passed to the script both contain whitespace:

"path-To-MPI\mpiexec" -delegate -n 1 "path with spaces"\test.bat "argument with space"

Is there a solution to this problem?

Thanks

Diego

Example 1):

J:\Users\diego>"C:\Users\diego\AppData\Local\XXX Software\MPI\impi5\mpiexec.exe" -delegate -n 1 "J:\Users\diego\abc DRuiz\test2.bat""tu tu"

'J:\Users\diego\abc' is not recognized as an internal or external command, operable program or batch file.

2) If my argument does not contain whitespace, then it works:

J:\Users\diego>"C:\Users\diego\AppData\Local\XXX Software\MPI\impi5\mpiexec.exe" -delegate -n 1 "J:\Users\diego\abc DRuiz\test2.bat""tu-tu"

J:\Users\diego>echo LOCAL "test2.BAT" STARTED
LOCAL "test2.BAT" STARTED

3) If I use an executable instead of a BAT script, it also works: 

J:\Users\diego>"abc DRuiz\qt2.bat

J:\Users\diego>"C:\Users\diego\AppData\Local\XXX Software\MPI\impi5\mpiexec.exe" -delegate -n 1 "J:\Users\diego\abc DRuiz\bin\abcd.exe""tu tu"

  <***>   Software - rev 98765
Info    : Welcome diego

...                     


Thread Topic: How-To

MPS in 2018 Beta?


I can't find the mpsvars.sh environment file in the <inst dir>/itac_2018/bin directory (or anywhere else).

Is MPS not in the 2018 beta?

Will MPS appear in a 2018 Beta Update?

Will MPS appear in the 2018 Release?

Has MPS been superseded by the VTune Application Performance Snapshot tool?

Thanks

Ron

old intel/860 parallel program


I am trying to launch an old C program developed for the Intel i860, but I do not know whether there are MPI equivalents of the following functions:
crecv
gxsum
gsendx
irec
isend
msgcancel
msgwait
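
These are NX-style calls from the iPSC/860 message-passing library. As a hedged sketch (the mapping below is an assumption based on the usual correspondence and should be checked against the original program's message-type and node semantics): isend/irecv plus msgwait roughly correspond to MPI nonblocking point-to-point calls plus MPI_Wait, crecv to a blocking MPI_Recv, msgcancel to MPI_Cancel, and gxsum, if it is a global sum, to MPI_Allreduce with MPI_SUM. gsendx (send to a list of nodes) has no single MPI call; a loop of sends or a collective over a sub-communicator would be needed.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* gxsum-style global sum (assuming gxsum is a global reduction): */
  double local = rank, global = 0.0;
  MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

  /* isend / irecv -> nonblocking point-to-point, msgwait -> MPI_Wait: */
  int msg = rank, incoming = -1;
  int peer = (rank + 1) % size, src = (rank + size - 1) % size, tag = 0;
  MPI_Request sreq, rreq;
  MPI_Isend(&msg, 1, MPI_INT, peer, tag, MPI_COMM_WORLD, &sreq);
  MPI_Irecv(&incoming, 1, MPI_INT, src, tag, MPI_COMM_WORLD, &rreq);
  MPI_Wait(&sreq, MPI_STATUS_IGNORE);
  MPI_Wait(&rreq, MPI_STATUS_IGNORE);
  /* crecv -> a blocking MPI_Recv would replace the irecv+wait pair. */

  if (rank == 0) printf("global sum = %f\n", global);
  MPI_Finalize();
  return 0;
}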

Thread Topic: Question

Error compiling FFTW3X_CDFT Wrapper on Intel Parallel Studio XE Cluster Ed


Hi,

OS: SLES11 SP4

kernel: 3.0.101-97

I followed the instructions in:

   https://software.intel.com/en-us/node/522284#566B1CCD-F68B-4E33-BAB2-082...

Command:

   $ make libintel64 interface=ilp64

ERROR: ar: error while loading shared libraries: libbfd-2.24.0.20140403-3.so: cannot open shared object file: No such file or directory
make[1]: *** [wraplib] Error 127

If I create a symbolic link in the OS library directory like:

   $ ln -s /usr/lib64/libbfd-2.25.0.so /usr/lib/64/libbfd-2.24.0.20140403-3.so

It works, but then a segmentation violation appears:

...

ar: creating ../../lib/intel64/libfftw3x_cdft_ilp64.a
make[1]: *** [wraplib] Segment Violation
make[1]: Leaving directory `/mnt/data/applications/Compilers/intel/compilers_and_libraries_2017.3.191/linux/mkl/interfaces/fftw3x_cdft'
make: *** [libintel64] Error 2

 

Any idea?

 

Thanks in advance.


Using Intel MPI in parallel ANSYS Fluent with AMD processors


I set up and successfully used this tutorial (https://goo.gl/Ww6bkM) for clustering 2 machines to run ANSYS Fluent 17.2 in parallel mode. Machine 1 (node1) is Windows Server 2012 R2 64-bit and machine 2 (node2) is Windows 8 64-bit. Both of them use an Intel Core i3 3.10 GHz and 8 GB of memory. The tutorial uses Intel MPI for parallelization.

Now I have set it up for 2 machines with the same OSes as above, but with quad-core AMD Opteron 2378 2.4 GHz processors and 16 GB of memory.
When I run parallel processing according to the tutorial, this error appears:
https://goo.gl/p3LaLf

Is it possible that Intel MPI isn't compatible with these AMD processors for parallel runs, or has something else caused this? What should I do?
Thanks.


Error in compiling FFTW3 with the Intel compiler


Dear all,

I'm trying to build FFTW3 with the Intel compiler, following the guide on the FFTW website. I configure FFTW3 as

./configure CC=icc F77=ifort MPICC=mpiicc --enable-mpi

However, an error is reported:

checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for ocamlbuild... no
checking for mpicc... /home/loam/intel/compilers_and_libraries_2017.4.196/linux/mpi/intel64/bin/mpiicc
checking for MPI_Init... no
checking for MPI_Init in -lmpi... no
checking for MPI_Init in -lmpich... no
configure: error: could not find mpi library for --enable-mpi

I also tried adding

LDFLAGS=-L/home/loam/intel/compilers_and_libraries_2017.4.196/linux/mpi/intel64/lib

CPPFLAGS=-I/home/loam/intel/compilers_and_libraries_2017.4.196/linux/mpi/intel64/include

but it still didn't work.

Could anyone give some advice? Thank you!
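
One way to narrow this down (a minimal sketch, not taken from the FFTW documentation) is to compile and run with mpiicc directly the kind of test program that configure's MPI_Init check uses; if this fails to build or run, the problem is in the MPI environment rather than in FFTW's configure flags:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  printf("rank %d: MPI links and runs\n", rank);
  MPI_Finalize();
  return 0;
}

Assuming this is saved as mpi_check.c (hypothetical name), building it with "mpiicc mpi_check.c -o mpi_check" and comparing against the failing MPI_Init test recorded in config.log usually shows whether the Intel compiler and MPI environment scripts were sourced in the shell that ran configure.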

 

Thread Topic: How-To

MPI_Scatterv/ Gatherv using C++ with "large" 2D matrices throws MPI errors


I implemented some `MPI_Scatterv` and `MPI_Gatherv` routines for a parallel matrix-matrix multiplication. Everything works fine for small matrix sizes up to N = 180; if I exceed this size, e.g. N = 184, MPI throws some errors while using `MPI_Scatterv`.

For the 2D scatter I used some constructions with MPI_Type_create_subarray and MPI_Type_create_resized. Explanations of these constructions can be found in this question http://stackoverflow.com/questions/9269399/sending-blocks-of-2d-array-in-c-using-mpi.

The minimal example code I wrote fills a matrix A with some values, scatters it to the local processes, and writes the rank number of each process into the local copy of the scattered A. After that the local copies are gathered back to the master process.

    #include "mpi.h"

    #define N 184 // grid size
    #define procN 2  // size of process grid

    int main(int argc, char **argv) {
        double* gA = nullptr; // pointer to array
        int rank, size;       // rank of current process and no. of processes

        // mpi initialization
        MPI_Init(&argc, &argv);
    	MPI_Comm_size(MPI_COMM_WORLD, &size);
    	MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // force to use correct number of processes
        if (size != procN * procN) {
    		if (rank == 0) fprintf(stderr,"%s: Only works with np = %d.\n", argv[0], procN *  procN);
            MPI_Abort(MPI_COMM_WORLD,1);
        }

        // allocate and print global A at master process
        if (rank == 0) {
            gA = new double[N * N];
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    gA[j * N + i] = j * N + i;
    			}
            }

            printf("A is:\n");
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    printf("%f ", gA[j * N + i]);
    			}
                printf("\n");
            }
        }

        // create local A on every process which we'll process
        double* lA = new double[N / procN * N / procN];

        // create a datatype to describe the subarrays of the gA array
        int sizes[2]    = {N, N}; // gA size
        int subsizes[2] = {N / procN, N / procN}; // lA size
        int starts[2]   = {0,0}; // where this one starts
        MPI_Datatype type, subarrtype;
        MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_DOUBLE, &type);
        MPI_Type_create_resized(type, 0, N / procN * sizeof(double), &subarrtype);
        MPI_Type_commit(&subarrtype);

        // compute number of send blocks
        // compute distance between the send blocks
        int sendcounts[procN * procN];
        int displs[procN * procN];

        if (rank == 0) {
            for (int i = 0; i < procN * procN; i++) {
                sendcounts[i] = 1;
            }
            int disp = 0;
            for (int i = 0; i < procN; i++) {
                for (int j = 0; j < procN; j++) {
                    displs[i * procN + j] = disp;
                    disp += 1;
                }
                disp += ((N / procN) - 1) * procN;
            }
        }

        // scatter global A to all processes
        MPI_Scatterv(gA, sendcounts, displs, subarrtype, lA,
                     N*N/(procN*procN), MPI_DOUBLE,
                     0, MPI_COMM_WORLD);

        // print local A's on every process
        for (int p = 0; p < size; p++) {
        	if (rank == p) {
        		printf("la on rank %d:\n", rank);
                for (int i = 0; i < N / procN; i++) {
                    for (int j = 0; j < N / procN; j++) {
                        printf("%f ", lA[j * N / procN + i]);
                    }
                    printf("\n");
                }
            }
        	MPI_Barrier(MPI_COMM_WORLD);
        }
        MPI_Barrier(MPI_COMM_WORLD);

        // write new values in local A's
        for (int i = 0; i < N / procN; i++) {
            for (int j = 0; j < N / procN; j++) {
                lA[j * N / procN + i] = rank;
            }
        }

        // gather all back to master process
        MPI_Gatherv(lA, N*N/(procN*procN), MPI_DOUBLE,
                    gA, sendcounts, displs, subarrtype,
                    0, MPI_COMM_WORLD);

        // print processed global A of process 0
        if (rank == 0) {
            printf("Processed gA is:\n");
            for (int i = 0; i < N; i++) {
                for (int j = 0; j < N; j++) {
                    printf("%f ", gA[j * N + i]);
                }
                printf("\n");
            }
        }

        MPI_Type_free(&subarrtype);

        if (rank == 0) {
            delete[] gA; // arrays allocated with new[] must be released with delete[]
        }

        delete[] lA;

        MPI_Finalize();

        return 0;
    }

It can be compiled and run using

mpicxx -std=c++11 -o test test.cpp
mpirun -np 4 ./test

For small N=4,...,180 everything goes fine    

    A is:
    0.000000 6.000000 12.000000 18.000000 24.000000 30.000000
    1.000000 7.000000 13.000000 19.000000 25.000000 31.000000
    2.000000 8.000000 14.000000 20.000000 26.000000 32.000000
    3.000000 9.000000 15.000000 21.000000 27.000000 33.000000
    4.000000 10.000000 16.000000 22.000000 28.000000 34.000000
    5.000000 11.000000 17.000000 23.000000 29.000000 35.000000
    la on rank 0:
    0.000000 6.000000 12.000000
    1.000000 7.000000 13.000000
    2.000000 8.000000 14.000000
    la on rank 1:
    3.000000 9.000000 15.000000
    4.000000 10.000000 16.000000
    5.000000 11.000000 17.000000
    la on rank 2:
    18.000000 24.000000 30.000000
    19.000000 25.000000 31.000000
    20.000000 26.000000 32.000000
    la on rank 3:
    21.000000 27.000000 33.000000
    22.000000 28.000000 34.000000
    23.000000 29.000000 35.000000
    Processed gA is:
    0.000000 0.000000 0.000000 2.000000 2.000000 2.000000
    0.000000 0.000000 0.000000 2.000000 2.000000 2.000000
    0.000000 0.000000 0.000000 2.000000 2.000000 2.000000
    1.000000 1.000000 1.000000 3.000000 3.000000 3.000000
    1.000000 1.000000 1.000000 3.000000 3.000000 3.000000
    1.000000 1.000000 1.000000 3.000000 3.000000 3.000000 

Here you see the errors when I use N = 184:

    

    Fatal error in PMPI_Scatterv: Other MPI error, error stack:
    PMPI_Scatterv(655)..............: MPI_Scatterv(sbuf=(nil), scnts=0x7ffee066bad0, displs=0x7ffee066bae0, dtype=USER<resized>, rbuf=0xe9e590, rcount=8464, MPI_DOUBLE, root=0, MPI_COMM_WORLD) failed
    MPIR_Scatterv_impl(205).........: fail failed
    I_MPIR_Scatterv_intra(265)......: Failure during collective
    I_MPIR_Scatterv_intra(259)......: fail failed
    MPIR_Scatterv(141)..............: fail failed
    MPIC_Recv(418)..................: fail failed
    MPIC_Wait(269)..................: fail failed
    PMPIDI_CH3I_Progress(623).......: fail failed
    pkt_RTS_handler(317)............: fail failed
    do_cts(662).....................: fail failed
    MPID_nem_lmt_dcp_start_recv(288): fail failed
    dcp_recv(154)...................: Internal MPI error!  cannot read from remote process
    Fatal error in PMPI_Scatterv: Other MPI error, error stack:
    PMPI_Scatterv(655)..............: MPI_Scatterv(sbuf=(nil), scnts=0x7ffef0de9b50, displs=0x7ffef0de9b60, dtype=USER<resized>, rbuf=0x21a7610, rcount=8464, MPI_DOUBLE, root=0, MPI_COMM_WORLD) failed
    MPIR_Scatterv_impl(205).........: fail failed
    I_MPIR_Scatterv_intra(265)......: Failure during collective
    I_MPIR_Scatterv_intra(259)......: fail failed
    MPIR_Scatterv(141)..............: fail failed
    MPIC_Recv(418)..................: fail failed
    MPIC_Wait(269)..................: fail failed
    PMPIDI_CH3I_Progress(623).......: fail failed
    pkt_RTS_handler(317)............: fail failed
    do_cts(662).....................: fail failed
    MPID_nem_lmt_dcp_start_recv(288): fail failed
    dcp_recv(154)...................: Internal MPI error!  cannot read from remote process

I found some information about an issue with MPI_Bcast hanging on large user-defined datatypes, see (https://software.intel.com/en-us/articles/intel-mpi-library-2017-known-issue-mpi-bcast-hang-on-large-user-defined-datatypes), but I'm not sure if it's the same for Scatterv and Gatherv. I'm using Intel MPI Library 2017 Update 2 for Linux.

I hope someone knows a solution to this problem.

Thread Topic: Question

How to build a connection between two servers with InfiniBand and use Intel MPI?


I'm sorry, but I can't find any detailed information about using Intel MPI to connect two servers over InfiniBand.

I want to know the procedure; is there a URL covering this topic?

Please reply soon, thank you a lot!

Thread Topic: Help Me

Resetting credentials for MPI project


Hello,

I am trying to run my first MPI project and have a problem with credentials. Long story short, I entered credentials and then realised I hadn't actually registered. Now, every time I start the project, the prompt says: Credentials for <username> rejected connecting to <PCname>.

I wonder how I can change the credentials I've entered, and whether I should have registered with a specific name and password depending on the PC name, Windows username, etc.?

Thank you.

Thread Topic: Help Me

Sending sub-arrays of matrix to different processors using mpi_scatterv


I want to scatter a matrix from the root to the other processors using scatterv. I am creating a communicator topology using mpi_cart_create. As an example I have the code below in Fortran:

PROGRAM SendRecv
USE mpi
IMPLICIT none
integer, PARAMETER :: m = 4, n = 4
integer, DIMENSION(m,n) :: a, b,h
integer :: i,j,count
integer,allocatable, dimension(:,:):: loc   ! local piece of global 2d array
INTEGER :: istatus(MPI_STATUS_SIZE),ierr
integer, dimension(2) :: sizes, subsizes, starts
INTEGER :: ista,iend,jsta,jend,ilen,jlen
INTEGER :: iprocs, jprocs, nprocs
integer,allocatable,dimension(:):: rcounts, displs
INTEGER :: rcounts0,displs0
integer, PARAMETER :: ROOT = 0
integer :: dims(2),coords(2)
logical :: periods(2)
data  periods/2*.false./
integer :: status(MPI_STATUS_SIZE)
integer :: comm2d,source,myrank
integer :: newtype, resizedtype
integer :: comsize,charsize
integer(kind=MPI_ADDRESS_KIND) :: extent, begin

CALL MPI_INIT(ierr)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
CALL MPI_COMM_RANK(MPI_COMM_WORLD, myrank, ierr)
! Get a new communicator for a decomposition of the domain.
dims(1) = 0
dims(2) = 0
CALL MPI_DIMS_CREATE(nprocs,2,dims,ierr)
if (myrank.EQ.Root) then
   print *,nprocs,'processors have been arranged into',dims(1),'X',dims(2),'grid'
endif
CALL MPI_CART_CREATE(MPI_COMM_WORLD,2,dims,periods,.true., &
                  comm2d,ierr)
!   Get my position in this communicator
CALL MPI_COMM_RANK(comm2d,myrank,ierr)
! Get the decomposition
CALL fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend)
! print *,ista,jsta,iend,jend
ilen = iend - ista + 1
jlen = jend - jsta + 1

CALL MPI_Cart_get(comm2d,2,dims,periods,coords,ierr)
iprocs = dims(1)
jprocs = dims(2)
! define the global matrix
if (myrank==ROOT) then
   count = 0
    do j = 1,n
       do i = 1,m
          a(i,j) = count
          count = count+1
       enddo
    enddo
    print *, 'global matrix is: '
    do 90 i=1,m
       do 80 j = 1,n
           write(*,70)a(i,j)
    70     format(2x,I5,$)
    80     continue
           print *, ''
  90    continue
endif
call MPI_Barrier(MPI_COMM_WORLD, ierr)

starts   = [0,0]
sizes    = [m, n]
subsizes = [ilen, jlen]
call MPI_Type_create_subarray(2, sizes, subsizes, starts,        &
                               MPI_ORDER_FORTRAN, MPI_INTEGER,  &
                               newtype, ierr)
call MPI_Type_size(MPI_INTEGER, charsize, ierr)
begin  = 0
extent = charsize
call MPI_Type_create_resized(newtype, begin, extent, resizedtype, ierr)
call MPI_Type_commit(resizedtype, ierr)

! get counts and displacmeents
allocate(rcounts(nprocs),displs(nprocs))
rcounts0 = 1
displs0 = (ista-1) + (jsta-1)*m
CALL MPI_Allgather(rcounts0,1,MPI_INT,rcounts,1,MPI_INT,MPI_COMM_WORLD,IERR)
CALL MPI_Allgather(displs0,1,MPI_INT,displs,1,MPI_INT,MPI_COMM_WORLD,IERR)
CALL MPI_Barrier(MPI_COMM_WORLD, ierr)

! scatter data
allocate(loc(ilen,jlen))
call MPI_Scatterv(a,rcounts,displs,resizedtype,    &
                 loc,ilen*jlen,MPI_INTEGER, &
                  ROOT,MPI_COMM_WORLD,ierr)
! print each processor matrix
do source = 0,nprocs-1
   if (myrank.eq.source) then
       print *,'myrank:',source
       do i=1,ilen
           do j = 1,jlen
              write(*,701)loc(i,j)
701               format(2x,I5,$)
           enddo
       print *, ''
       enddo
    endif
       call MPI_Barrier(MPI_COMM_WORLD, ierr)
enddo

call MPI_Type_free(newtype,ierr)
call MPI_Type_free(resizedtype,ierr)
deallocate(rcounts,displs)
deallocate(loc)

CALL MPI_FINALIZE(ierr)

contains

subroutine fnd2ddecomp(comm2d,m,n,ista,iend,jsta,jend)
integer   comm2d
integer   m,n,ista,jsta,iend,jend
integer   dims(2),coords(2),ierr
logical   periods(2)
! Get (i,j) position of a processor from Cartesian topology.
CALL MPI_Cart_get(comm2d,2,dims,periods,coords,ierr)
! Decomposition in first (ie. X) direction
CALL MPE_DECOMP1D(m,dims(1),coords(1),ista,iend)
! Decomposition in second (ie. Y) direction
CALL MPE_DECOMP1D(n,dims(2),coords(2),jsta,jend)
end subroutine fnd2ddecomp

SUBROUTINE MPE_DECOMP1D(n,numprocs,myid,s,e)
integer n,numprocs,myid,s,e,nlocal,deficit
nlocal  = n / numprocs
s       = myid * nlocal + 1
deficit = mod(n,numprocs)
s       = s + min(myid,deficit)
! Give one more slice to processors
if (myid .lt. deficit) then
    nlocal = nlocal + 1
endif
e = s + nlocal - 1
if (e .gt. n .or. myid .eq. numprocs-1) e = n
end subroutine MPE_DECOMP1D

END program SendRecv

I am generating a 4x4 matrix, and using scatterv I am sending blocks of the matrix to the other processors. The code works fine for 4, 2 and 16 processors, but throws an error for three processors. What modifications do I have to make so that it works for any given number of processors?

 

Global matrix in Root:

[ 0      4      8     12
  1      5      9     13
  2      6     10     14
  3      7     11     15 ]

For 4 processors, each processor gets:

Rank =0 : [0 4
          1 5]
Rank =1 : [8 12
          9 13]
Rank =2 : [2 6
          3 7]
Rank =3 : [10 14
          11 15]

The code works for 2 and 16 processors; in fact it works whenever the sub-arrays are the same size. It fails for 3 processors. For 3 processors I am expecting:

Rank =0 : [0 4 8 12
           1 5 9 13]
Rank =1 : [2 6 10 14]
Rank =2 : [3 7 11 15]

But I am getting the following error:

Fatal error in PMPI_Scatterv: Message truncated, error stack:
PMPI_Scatterv(671)................: MPI_Scatterv(sbuf=0x6b58c0, scnts=0xf95d90, displs=0xfafbe0, dtype=USER<resized>, rbuf=0xfafc00, rcount=4, MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Scatterv_impl(211)...........:
I_MPIR_Scatterv_intra(278)........: Failure during collective
I_MPIR_Scatterv_intra(272)........:
MPIR_Scatterv(147)................:
MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 6 truncated; 32 bytes received but buffer size is 16
Fatal error in PMPI_Scatterv: Message truncated, error stack:
PMPI_Scatterv(671)................: MPI_Scatterv(sbuf=0x6b58c0, scnts=0x240bda0, displs=0x240be60, dtype=USER<resized>, rbuf=0x240be80, rcount=4, MPI_INTEGER, root=0, MPI_COMM_WORLD) failed
MPIR_Scatterv_impl(211)...........:
I_MPIR_Scatterv_intra(278)........: Failure during collective
I_MPIR_Scatterv_intra(272)........:
MPIR_Scatterv(147)................:
MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 6 truncated; 32 bytes received but buffer size is 16
forrtl: error (69): process interrupted (SIGINT)
Image              PC                Routine            Line        Source
a.out              0000000000479165  Unknown               Unknown  Unknown
a.out              0000000000476D87  Unknown               Unknown  Unknown
a.out              000000000044B7C4  Unknown               Unknown  Unknown
a.out              000000000044B5D6  Unknown               Unknown  Unknown
a.out              000000000042DB76  Unknown               Unknown  Unknown
a.out              00000000004053DE  Unknown               Unknown  Unknown
libpthread.so.0    00007F2327456790  Unknown               Unknown  Unknown
libc.so.6          00007F2326EFE2F7  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B899E8  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327C94E39  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327C94B32  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B6E44A  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B6DD5D  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B6DBDC  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B6DB0C  Unknown               Unknown  Unknown
libmpi.so.12       00007F2327B6F932  Unknown               Unknown  Unknown
libmpifort.so.12   00007F2328294B1C  Unknown               Unknown  Unknown
a.out              000000000040488B  Unknown               Unknown  Unknown
a.out              000000000040385E  Unknown               Unknown  Unknown
libc.so.6          00007F2326E4DD5D  Unknown               Unknown  Unknown
a.out              0000000000403769  Unknown               Unknown  Unknown

What am I missing? What modifications do I have to make to get it to work?

Thread Topic: Question

MPI on two machines with different choices of I_MPI_FABRICS


I am trying to run a simple "hello world" MPI program using two machines.

Here is the MPI program:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  MPI_Init(NULL, NULL);

  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  printf("Hello world from processor %s, rank %d out of %d processors\n",
         processor_name, world_rank, world_size);

  MPI_Finalize();
}

A few notes on my setup:

  • I am using Mellanox 3.18-2 OFED distribution, as directed by the Xeon Phi "MPSS user's guide"
  • For simplicity, I'm just trying to run MPI between two PC machines (no Xeon Phi involved)
  • Program is compiled using:
    mpiicpc helloMPI.cc -o helloMPI.XEON
  • Both machines are running bare-metal installations of CentOS 7 (i.e. no VMs)

I get different errors depending on some of the I_MPI_* environment variables.

The only configuration that works without any issues is:

export I_MPI_FABRICS=tcp
export I_MPI_MIC=1   # For some reason, I need this, even though I'm not running on a Xeon Phi
mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 ~/helloMPI.XEON

However, if I attempt to use other settings for I_MPI_FABRICS, then I get errors and it doesn't work.
For example, for the "dapl" fabric choice (which is my eventual desired setting):

export I_MPI_FABRICS=dapl
export I_MPI_MIC=1
mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 ~/helloMPI.XEON

 

host: 192.168.1.111
host: 192.168.1.222

==================================================================================================
mpiexec options:
----------------
  Base path: /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/
  Launcher: ssh
  Debug level: 1
  Enable X: -1

  Global environment:
  -------------------
    I_MPI_PERHOST=allcores
    LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib
    MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl
    MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man
    XDG_SESSION_ID=150
    SPARK_HOME=/home/james/clone/benchmark_spark/spark
    HOSTNAME=xen2
    SELINUX_ROLE_REQUESTED=
    INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses
    IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp
    TERM=xterm-256color
    SHELL=/bin/zsh
    I_MPI_FABRICS=dapl
    HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1
    HISTSIZE=10000
    I_MPI_MIC=1
    KVM=/home/james/phi/modules/linux/custom/kvm
    SSH_CLIENT=10.70.2.94 46738 22
    LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4
    BENCH=/home/james/clone/benchmark_spark
    SELINUX_USE_CURRENT_RANGE=
    COI=/home/james/phi/src/mpss/mpss-coi-3.8.1
    MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic
    SSH_TTY=/dev/pts/0
    VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468
    ZSH=/home/james/.oh-my-zsh
    QT_GRAPHICSSYSTEM_CHECKED=1
    MPSS=/home/james/phi/src/mpss/mpss-3.8.1
    USER=james
    C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot
    LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:
    I_MPI_MPIRUN=mpirun
    MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic
    CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
    PAGER=less
    MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m
    _PHI_ROOT=/home/james/phi
    LSCOLORS=Gxfxcxdxbxegedabagacad
    _INTEL_SOURCE_ME=yes
    NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
    MAIL=/var/spool/mail/james
    PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin
    FZF_COMPLETION_TRIGGER=##
    TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb
    I_MPI_HYDRA_DEBUG=on
    PHI=/home/james/phi
    PWD=/home/james
    JAVA_HOME=/home/james/java/jdk1.8.0_111
    EDITOR=vim
    HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop
    KERN=/home/james/phi/modules/linux
    LANG=en_CA.UTF-8
    NODE_PATH=/home/james/.jsctags/lib/jsctags/:
    SELINUX_LEVEL_REQUESTED=
    DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal
    HISTCONTROL=ignoredups
    MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1
    C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root
    SHLVL=2
    HOME=/home/james
    GOROOT=/home/james/golang
    I_MPI_DEBUG=6
    PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python
    LESS=-R
    LOGNAME=james
    CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar
    SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22
    LC_CTYPE=en_CA.UTF-8
    GOPATH=/home/james/gopath
    LESSOPEN=||/usr/bin/lesspipe.sh %s
    _PHI_MPSS_SRC=/home/james/phi/src/mpss
    CMAKE_PREFIX_PATH=/home/james/phi/build:
    XDG_RUNTIME_DIR=/run/user/1000
    I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi
    _=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra

  Hydra internal environment:
  ---------------------------
    MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1
    GFORTRAN_UNBUFFERED_PRECONNECTED=y
    I_MPI_HYDRA_UUID=653e0000-4054-4f5d-3551-050001dec0a8
    DAPL_NETWORK_PROCESS_NUM=2

  Intel(R) MPI Library specific variables:
  ----------------------------------------
    I_MPI_PERHOST=allcores
    I_MPI_FABRICS=dapl
    I_MPI_MIC=1
    I_MPI_MPIRUN=mpirun
    I_MPI_HYDRA_DEBUG=on
    I_MPI_DEBUG=6
    I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi
    I_MPI_HYDRA_UUID=653e0000-4054-4f5d-3551-050001dec0a8


    Proxy information:
    *********************
      [1] proxy: 192.168.1.111 (1 cores)
      Exec list: /home/james/helloMPI.XEON (1 processes);

      [2] proxy: 192.168.1.222 (1 cores)
      Exec list: /home/james/helloMPI.XEON (1 processes);


==================================================================================================

[mpiexec@xen2] Timeout set to -1 (-1 means infinite)
[mpiexec@xen2] Got a control port string of 192.168.1.222:35852

Proxy launch args: pmi_proxy --control-port 192.168.1.222:35852 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 855354664 --usize -2 --proxy-id

Arguments being passed to proxy 0:
--version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname 192.168.1.111 --global-core-map 0,1,2 --pmi-id-map 0,0 --global-process-count 2 --auto-cleanup 1 --pmi-kvsname kvs_15973_0 --pmi-process-mapping (vector,(0,2,1)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpoint off --ckpoint-num -1 --global-inherited-env 75 'I_MPI_PERHOST=allcores''LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib''MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl''MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man''XDG_SESSION_ID=150''SPARK_HOME=/home/james/clone/benchmark_spark/spark''HOSTNAME=xen2''SELINUX_ROLE_REQUESTED=''INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses''IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp''TERM=xterm-256color''SHELL=/bin/zsh''I_MPI_FABRICS=dapl''HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1''HISTSIZE=10000''I_MPI_MIC=1''KVM=/home/james/phi/modules/linux/custom/kvm''SSH_CLIENT=10.70.2.94 46738 
22''LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4''BENCH=/home/james/clone/benchmark_spark''SELINUX_USE_CURRENT_RANGE=''COI=/home/james/phi/src/mpss/mpss-coi-3.8.1''MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''SSH_TTY=/dev/pts/0''VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468''ZSH=/home/james/.oh-my-zsh''QT_GRAPHICSSYSTEM_CHECKED=1''MPSS=/home/james/phi/src/mpss/mpss-3.8.1''USER=james''C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot''LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:''I_MPI_MPIRUN=mpirun''MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/com
pilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include''PAGER=less''MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m''_PHI_ROOT=/home/james/phi''LSCOLORS=Gxfxcxdxbxegedabagacad''_INTEL_SOURCE_ME=yes''NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N''MAIL=/var/spool/mail/james''PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin''FZF_COMPLETION_TRIGGER=##''TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb''I_MPI_HYDRA_DEBUG=on''PHI=/home/james/phi''PWD=/home/james''JAVA_HOME=/home/james/java/jdk1.8.0_111''EDITOR=vim''HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop''KERN=/home/james/phi/modules/linux''LANG=en_CA.UTF-8''NODE_PATH=/home/james/.jsctags/lib/jsctags/:''SELINUX_LEVEL_REQUESTED=''DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal''HISTCONTROL=ignoredups''MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1''C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root''SHLVL=2''HOME=/home/james''GOROOT=/home/james/golang''I_MPI_DEBUG=6''PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python''LESS=-R''LOGNAME=james''CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar''SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22''LC_CTYPE=en_CA.UTF-8''GOPATH=/home/james/gopath''LESSOPEN=||/usr/bin/lesspipe.sh %s''_PHI_MPSS_SRC=/home/james/phi/src/mpss''CMAKE_PREFIX_PATH=/home/james/phi/build:''XDG_RUNTIME_DIR=/run/user/1000''I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi''_=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra' --global-user-env 0 --global-system-env 4 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFFERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=653e0000-4054-4f5d-3551-050001dec0a8''DAPL_NETWORK_PROCESS_NUM=2' --proxy-core-count 1 --mpi-cmd-env mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 /home/james/helloMPI.XEON  --exec --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/james --exec-args 1 /home/james/helloMPI.XEON

Arguments being passed to proxy 1:
--version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname 192.168.1.222 --global-core-map 0,1,2 --pmi-id-map 0,1 --global-process-count 2 --auto-cleanup 1 --pmi-kvsname kvs_15973_0 --pmi-process-mapping (vector,(0,2,1)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpoint off --ckpoint-num -1 --global-inherited-env 75 'I_MPI_PERHOST=allcores''LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib''MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl''MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man''XDG_SESSION_ID=150''SPARK_HOME=/home/james/clone/benchmark_spark/spark''HOSTNAME=xen2''SELINUX_ROLE_REQUESTED=''INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses''IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp''TERM=xterm-256color''SHELL=/bin/zsh''I_MPI_FABRICS=dapl''HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1''HISTSIZE=10000''I_MPI_MIC=1''KVM=/home/james/phi/modules/linux/custom/kvm''SSH_CLIENT=10.70.2.94 46738 
22''LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4''BENCH=/home/james/clone/benchmark_spark''SELINUX_USE_CURRENT_RANGE=''COI=/home/james/phi/src/mpss/mpss-coi-3.8.1''MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''SSH_TTY=/dev/pts/0''VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468''ZSH=/home/james/.oh-my-zsh''QT_GRAPHICSSYSTEM_CHECKED=1''MPSS=/home/james/phi/src/mpss/mpss-3.8.1''USER=james''C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot''LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:''I_MPI_MPIRUN=mpirun''MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/com
pilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include''PAGER=less''MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m''_PHI_ROOT=/home/james/phi''LSCOLORS=Gxfxcxdxbxegedabagacad''_INTEL_SOURCE_ME=yes''NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N''MAIL=/var/spool/mail/james''PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin''FZF_COMPLETION_TRIGGER=##''TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb''I_MPI_HYDRA_DEBUG=on''PHI=/home/james/phi''PWD=/home/james''JAVA_HOME=/home/james/java/jdk1.8.0_111''EDITOR=vim''HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop''KERN=/home/james/phi/modules/linux''LANG=en_CA.UTF-8''NODE_PATH=/home/james/.jsctags/lib/jsctags/:''SELINUX_LEVEL_REQUESTED=''DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal''HISTCONTROL=ignoredups''MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1''C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root''SHLVL=2''HOME=/home/james''GOROOT=/home/james/golang''I_MPI_DEBUG=6''PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python''LESS=-R''LOGNAME=james''CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar''SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22''LC_CTYPE=en_CA.UTF-8''GOPATH=/home/james/gopath''LESSOPEN=||/usr/bin/lesspipe.sh %s''_PHI_MPSS_SRC=/home/james/phi/src/mpss''CMAKE_PREFIX_PATH=/home/james/phi/build:''XDG_RUNTIME_DIR=/run/user/1000''I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi''_=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra' --global-user-env 0 --global-system-env 4 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFFERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=653e0000-4054-4f5d-3551-050001dec0a8''DAPL_NETWORK_PROCESS_NUM=2' --proxy-core-count 1 --mpi-cmd-env mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 /home/james/helloMPI.XEON  --exec --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/james --exec-args 1 /home/james/helloMPI.XEON

[mpiexec@xen2] Launch arguments: /usr/bin/ssh -x -q 192.168.1.111 sh -c 'export I_MPI_ROOT="/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi" ; export PATH="/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/:${I_MPI_ROOT}/intel64/bin:${PATH}" ; exec "$0""$@"' pmi_proxy --control-port 192.168.1.222:35852 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 855354664 --usize -2 --proxy-id 0
[mpiexec@xen2] Launch arguments: pmi_proxy --control-port 192.168.1.222:35852 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 855354664 --usize -2 --proxy-id 1
[proxy:0:1@xen2] Start PMI_proxy 1
[proxy:0:1@xen2] got pmi command (from 9): init
pmi_version=1 pmi_subversion=1
[proxy:0:1@xen2] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:1@xen2] got pmi command (from 9): get_maxes

[proxy:0:1@xen2] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:1@xen2] got pmi command (from 9): barrier_in

[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[proxy:0:1@xen2] forwarding command (cmd=barrier_in) upstream
[proxy:0:0@xen1] Start PMI_proxy 0
[proxy:0:0@xen1] STDIN will be redirected to 1 fd(s): 9
[proxy:0:0@xen1] got pmi command (from 6): init
pmi_version=1 pmi_subversion=1
[proxy:0:0@xen1] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@xen1] got pmi command (from 6): get_maxes

[proxy:0:0@xen1] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] PMI response to fd 11 pid 6: cmd=barrier_out
[mpiexec@xen2] PMI response to fd 8 pid 6: cmd=barrier_out
[proxy:0:0@xen1] got pmi command (from 6): barrier_in

[proxy:0:0@xen1] forwarding command (cmd=barrier_in) upstream
[proxy:0:1@xen2] PMI response: cmd=barrier_out
[proxy:0:1@xen2] got pmi command (from 9): get_ranks2hosts

[proxy:0:1@xen2] PMI response: put_ranks2hosts 42 2
13 192.168.1.111 0, 13 192.168.1.222 1,
[proxy:0:0@xen1] PMI response: cmd=barrier_out
[proxy:0:1@xen2] got pmi command (from 9): get_appnum

[proxy:0:1@xen2] PMI response: cmd=appnum appnum=0
[proxy:0:0@xen1] got pmi command (from 6): get_ranks2hosts

[proxy:0:0@xen1] PMI response: put_ranks2hosts 42 2
13 192.168.1.111 0, 13 192.168.1.222 1,
[proxy:0:1@xen2] got pmi command (from 9): get_my_kvsname

[proxy:0:1@xen2] PMI response: cmd=my_kvsname kvsname=kvs_15973_0
[proxy:0:0@xen1] got pmi command (from 6): get_appnum

[proxy:0:0@xen1] PMI response: cmd=appnum appnum=0
[proxy:0:0@xen1] got pmi command (from 6): get_my_kvsname

[proxy:0:0@xen1] PMI response: cmd=my_kvsname kvsname=kvs_15973_0
[proxy:0:1@xen2] got pmi command (from 9): get_my_kvsname

[proxy:0:1@xen2] PMI response: cmd=my_kvsname kvsname=kvs_15973_0
[proxy:0:0@xen1] got pmi command (from 6): get_my_kvsname

[proxy:0:0@xen1] PMI response: cmd=my_kvsname kvsname=kvs_15973_0
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2  Build 20170125 (id: 16752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[proxy:0:1@xen2] got pmi command (from 9): barrier_in

[proxy:0:1@xen2] forwarding command (cmd=barrier_in) upstream
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] PMI response to fd 11 pid 6: cmd=barrier_out
[mpiexec@xen2] PMI response to fd 8 pid 6: cmd=barrier_out
[proxy:0:0@xen1] got pmi command (from 6): barrier_in

[proxy:0:0@xen1] forwarding command (cmd=barrier_in) upstream
[proxy:0:1@xen2] PMI response: cmd=barrier_out
[proxy:0:0@xen1] PMI response: cmd=barrier_out
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u
[0] MPI startup(): DAPL provider ofa-v2-mlx4_0-1u
[0] MPI startup(): dapl data transfer mode
[mpiexec@xen2] [pgid: -1] got PMI command: cmd=put kvsname=kvs_15973_0 key=P0-businesscard-0 value=rdma_port0#23708$rdma_host0#0A0000000000029EFE80000000000000E61D2DFFFE129B4000000004$arch_code#6$fabrics_list#dapl$
[proxy:0:0@xen1] got pmi command (from 6): put
kvsname=kvs_15973_0 key=P0-businesscard-0 value=rdma_port0#23708$rdma_host0#0A0000000000029EFE80000000000000E61D2DFFFE129B4000000004$arch_code#6$fabrics_list#dapl$
[proxy:0:0@xen1] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:0@xen1] got pmi command (from 6): barrier_in

[proxy:0:0@xen1] forwarding command (cmd=barrier_in) upstream
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[1] DAPL startup(): trying to open first DAPL provider from I_MPI_DAPL_PROVIDER_LIST: ofa-v2-mlx4_0-2u
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-2u
[1] DAPL startup(): trying to open secondary (2) DAPL provider from I_MPI_DAPL_PROVIDER_LIST: ofa-v2-mcm-2
[1] MPI startup(): secondary DAPL provider ofa-v2-mcm-2
[proxy:0:1@xen2] got pmi command (from 9): put
kvsname=kvs_15973_0 key=P1-businesscard-0 value=rdma_port0#15982$rdma_host0#0A00000000000327FE80000000000000E61D2DFFFE129B1100000004$rdma_port1#15982$rdma_host1#0A00000000000329FE80000000000000E61D2DFFFE129B1100010004$arch_code#8$fabrics_list#dapl$
[proxy:0:1@xen2] PMI response: cmd=put_result rc=0 msg=success
[1] MPI startup(): dapl data transfer mode
[proxy:0:1@xen2] got pmi command (from 9): barrier_in

[proxy:0:1@xen2] forwarding command (cmd=barrier_in) upstream
[mpiexec@xen2] [pgid: -1] got PMI command: cmd=put kvsname=kvs_15973_0 key=P1-businesscard-0 value=rdma_port0#15982$rdma_host0#0A00000000000327FE80000000000000E61D2DFFFE129B1100000004$rdma_port1#15982$rdma_host1#0A00000000000329FE80000000000000E61D2DFFFE129B1100010004$arch_code#8$fabrics_list#dapl$
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] PMI response to fd 11 pid 9: cmd=barrier_out
[mpiexec@xen2] PMI response to fd 8 pid 9: cmd=barrier_out
[proxy:0:1@xen2] PMI response: cmd=barrier_out
[proxy:0:0@xen1] PMI response: cmd=barrier_out
[proxy:0:0@xen1] got pmi command (from 6): get
kvsname=kvs_15973_0 key=P1-businesscard-0
[proxy:0:0@xen1] PMI response: cmd=get_result rc=0 msg=success value=rdma_port0#15982$rdma_host0#0A00000000000327FE80000000000000E61D2DFFFE129B1100000004$rdma_port1#15982$rdma_host1#0A00000000000329FE80000000000000E61D2DFFFE129B1100010004$arch_code#8$fabrics_list#dapl$
xen1:UCM:5c9c:194f1740: 21010 us(21010 us):  create_ah: ERR Success
xen1:UCM:5c9c:194f1740: 21019 us(9 us): UCM connect: snd ERR -> cm_lid 0 cm_qpn 327 r_psp 3e6e p_sz=24
libibverbs: GRH is mandatory For RoCE address handle
[0:192.168.1.111][../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_conn_rc.c:247] error(0x30000): ofa-v2-mlx4_0-1u: could not connect DAPL endpoints: DAT_INSUFFICIENT_RESOURCES()
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=abort exitcode=68729104
Fatal error in MPI_Init: Internal MPI error!, error stack:
MPIR_Init_thread(805).................: fail failed
MPID_Init(1831).......................: channel initialization failed
MPIDI_CH3_Init(147)...................: fail failed
dapl_rc_setup_all_connections_20(1434): generic failure with errno = 16
(unknown)(): Internal MPI error!
[proxy:0:0@xen1] got pmi command (from 6): abort
exitcode=68729104
[proxy:0:0@xen1] we don't understand this command abort; forwarding upstream

The "ofa" fabric also does not work, although it has a different error message:

export I_MPI_FABRICS=ofa
export I_MPI_MIC=1
mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 ~/helloMPI.XEON

host: 192.168.1.111
host: 192.168.1.222

==================================================================================================
mpiexec options:
----------------
  Base path: /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/
  Launcher: ssh
  Debug level: 1
  Enable X: -1

  Global environment:
  -------------------
    I_MPI_PERHOST=allcores
    LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib
    MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl
    MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man
    XDG_SESSION_ID=150
    SPARK_HOME=/home/james/clone/benchmark_spark/spark
    HOSTNAME=xen2
    SELINUX_ROLE_REQUESTED=
    INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses
    IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp
    TERM=xterm-256color
    SHELL=/bin/zsh
    I_MPI_FABRICS=ofa
    HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1
    HISTSIZE=10000
    I_MPI_MIC=1
    KVM=/home/james/phi/modules/linux/custom/kvm
    SSH_CLIENT=10.70.2.94 46738 22
    LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4
    BENCH=/home/james/clone/benchmark_spark
    SELINUX_USE_CURRENT_RANGE=
    COI=/home/james/phi/src/mpss/mpss-coi-3.8.1
    MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic
    SSH_TTY=/dev/pts/0
    VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468
    ZSH=/home/james/.oh-my-zsh
    QT_GRAPHICSSYSTEM_CHECKED=1
    MPSS=/home/james/phi/src/mpss/mpss-3.8.1
    USER=james
    C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot
    LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:
    I_MPI_MPIRUN=mpirun
    MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic
    CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include
    PAGER=less
    MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m
    _PHI_ROOT=/home/james/phi
    LSCOLORS=Gxfxcxdxbxegedabagacad
    _INTEL_SOURCE_ME=yes
    NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
    MAIL=/var/spool/mail/james
    PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin
    FZF_COMPLETION_TRIGGER=##
    TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb
    I_MPI_HYDRA_DEBUG=on
    PHI=/home/james/phi
    PWD=/home/james
    JAVA_HOME=/home/james/java/jdk1.8.0_111
    EDITOR=vim
    HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop
    KERN=/home/james/phi/modules/linux
    LANG=en_CA.UTF-8
    NODE_PATH=/home/james/.jsctags/lib/jsctags/:
    SELINUX_LEVEL_REQUESTED=
    DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal
    HISTCONTROL=ignoredups
    MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1
    C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root
    SHLVL=2
    HOME=/home/james
    GOROOT=/home/james/golang
    I_MPI_DEBUG=6
    PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python
    LESS=-R
    LOGNAME=james
    CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar
    SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22
    LC_CTYPE=en_CA.UTF-8
    GOPATH=/home/james/gopath
    LESSOPEN=||/usr/bin/lesspipe.sh %s
    _PHI_MPSS_SRC=/home/james/phi/src/mpss
    CMAKE_PREFIX_PATH=/home/james/phi/build:
    XDG_RUNTIME_DIR=/run/user/1000
    I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi
    _=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra

  Hydra internal environment:
  ---------------------------
    MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1
    GFORTRAN_UNBUFFERED_PRECONNECTED=y
    I_MPI_HYDRA_UUID=bc3e0000-f374-a462-3551-050001dec0a8
    DAPL_NETWORK_PROCESS_NUM=2

  Intel(R) MPI Library specific variables:
  ----------------------------------------
    I_MPI_PERHOST=allcores
    I_MPI_FABRICS=ofa
    I_MPI_MIC=1
    I_MPI_MPIRUN=mpirun
    I_MPI_HYDRA_DEBUG=on
    I_MPI_DEBUG=6
    I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi
    I_MPI_HYDRA_UUID=bc3e0000-f374-a462-3551-050001dec0a8


    Proxy information:
    *********************
      [1] proxy: 192.168.1.111 (1 cores)
      Exec list: /home/james/helloMPI.XEON (1 processes);

      [2] proxy: 192.168.1.222 (1 cores)
      Exec list: /home/james/helloMPI.XEON (1 processes);


==================================================================================================

[mpiexec@xen2] Timeout set to -1 (-1 means infinite)
[mpiexec@xen2] Got a control port string of 192.168.1.222:36282

Proxy launch args: pmi_proxy --control-port 192.168.1.222:36282 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 629081922 --usize -2 --proxy-id

Arguments being passed to proxy 0:
--version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname 192.168.1.111 --global-core-map 0,1,2 --pmi-id-map 0,0 --global-process-count 2 --auto-cleanup 1 --pmi-kvsname kvs_16060_0 --pmi-process-mapping (vector,(0,2,1)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpoint off --ckpoint-num -1 --global-inherited-env 75 'I_MPI_PERHOST=allcores''LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib''MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl''MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man''XDG_SESSION_ID=150''SPARK_HOME=/home/james/clone/benchmark_spark/spark''HOSTNAME=xen2''SELINUX_ROLE_REQUESTED=''INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses''IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp''TERM=xterm-256color''SHELL=/bin/zsh''I_MPI_FABRICS=ofa''HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1''HISTSIZE=10000''I_MPI_MIC=1''KVM=/home/james/phi/modules/linux/custom/kvm''SSH_CLIENT=10.70.2.94 46738 
22''LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4''BENCH=/home/james/clone/benchmark_spark''SELINUX_USE_CURRENT_RANGE=''COI=/home/james/phi/src/mpss/mpss-coi-3.8.1''MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''SSH_TTY=/dev/pts/0''VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468''ZSH=/home/james/.oh-my-zsh''QT_GRAPHICSSYSTEM_CHECKED=1''MPSS=/home/james/phi/src/mpss/mpss-3.8.1''USER=james''C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot''LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:''I_MPI_MPIRUN=mpirun''MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/com
pilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include''PAGER=less''MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m''_PHI_ROOT=/home/james/phi''LSCOLORS=Gxfxcxdxbxegedabagacad''_INTEL_SOURCE_ME=yes''NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N''MAIL=/var/spool/mail/james''PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin''FZF_COMPLETION_TRIGGER=##''TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb''I_MPI_HYDRA_DEBUG=on''PHI=/home/james/phi''PWD=/home/james''JAVA_HOME=/home/james/java/jdk1.8.0_111''EDITOR=vim''HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop''KERN=/home/james/phi/modules/linux''LANG=en_CA.UTF-8''NODE_PATH=/home/james/.jsctags/lib/jsctags/:''SELINUX_LEVEL_REQUESTED=''DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal''HISTCONTROL=ignoredups''MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1''C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root''SHLVL=2''HOME=/home/james''GOROOT=/home/james/golang''I_MPI_DEBUG=6''PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python''LESS=-R''LOGNAME=james''CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar''SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22''LC_CTYPE=en_CA.UTF-8''GOPATH=/home/james/gopath''LESSOPEN=||/usr/bin/lesspipe.sh %s''_PHI_MPSS_SRC=/home/james/phi/src/mpss''CMAKE_PREFIX_PATH=/home/james/phi/build:''XDG_RUNTIME_DIR=/run/user/1000''I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi''_=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra' --global-user-env 0 --global-system-env 4 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFFERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=bc3e0000-f374-a462-3551-050001dec0a8''DAPL_NETWORK_PROCESS_NUM=2' --proxy-core-count 1 --mpi-cmd-env mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 /home/james/helloMPI.XEON  --exec --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/james --exec-args 1 /home/james/helloMPI.XEON

Arguments being passed to proxy 1:
--version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname 192.168.1.222 --global-core-map 0,1,2 --pmi-id-map 0,1 --global-process-count 2 --auto-cleanup 1 --pmi-kvsname kvs_16060_0 --pmi-process-mapping (vector,(0,2,1)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpoint off --ckpoint-num -1 --global-inherited-env 75 'I_MPI_PERHOST=allcores''LD_LIBRARY_PATH=/home/james/phi/build/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/mic/coi/host-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib''MKLROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl''MANPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man:/home/james/.fzf/man''XDG_SESSION_ID=150''SPARK_HOME=/home/james/clone/benchmark_spark/spark''HOSTNAME=xen2''SELINUX_ROLE_REQUESTED=''INTEL_LICENSE_FILE=/opt/intel/compilers_and_libraries_2017.3.191/linux/licenses:/opt/intel/licenses:/home/james/intel/licenses''IPPROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp''TERM=xterm-256color''SHELL=/bin/zsh''I_MPI_FABRICS=ofa''HADOOP_HOME=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1''HISTSIZE=10000''I_MPI_MIC=1''KVM=/home/james/phi/modules/linux/custom/kvm''SSH_CLIENT=10.70.2.94 46738 
22''LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/intel64/gcc4.7:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/../tbb/lib/intel64_lin/gcc4.4''BENCH=/home/james/clone/benchmark_spark''SELINUX_USE_CURRENT_RANGE=''COI=/home/james/phi/src/mpss/mpss-coi-3.8.1''MIC_LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''SSH_TTY=/dev/pts/0''VTUNE_AMPLIFIER_XE_2017_DIR=/opt/intel/vtune_amplifier_xe_2017.0.2.478468''ZSH=/home/james/.oh-my-zsh''QT_GRAPHICSSYSTEM_CHECKED=1''MPSS=/home/james/phi/src/mpss/mpss-3.8.1''USER=james''C_BOOT=/home/james/phi/modules/linux/custom/kvm/centos_boot''LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:''I_MPI_MPIRUN=mpirun''MIC_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/mic/lib:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin_mic:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/lib/mic''CPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/ipp/include:/opt/intel/com
pilers_and_libraries_2017.3.191/linux/mkl/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb/include:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/include''PAGER=less''MAVEN_OPTS=-Xmx2g -XX:ReservedCodeCacheSize=512m''_PHI_ROOT=/home/james/phi''LSCOLORS=Gxfxcxdxbxegedabagacad''_INTEL_SOURCE_ME=yes''NLSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2017.3.191/linux/mkl/lib/intel64_lin/locale/%l_%t/%N''MAIL=/var/spool/mail/james''PATH=/opt/intel/vtune_amplifier_xe_2017.0.2.478468/bin64:/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/bin:/home/james/clone/benchmark_spark/spark/bin:/home/james/maven/apache-maven-3.3.9/bin:/home/james/java/jdk1.8.0_111/bin:/home/james/bin:/home/james/local/bin:/home/james/.fzf:/home/james/phi/src/python:/home/james/phi/src/sh:/home/james/phi/build/bin:/home/james/phi/modules/linux/custom/custom_scripts:/opt/intel/compilers_and_libraries_2017.3.191/linux/bin/intel64:/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin:/usr/local/bin:/usr/bin:/home/james/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/home/james/.rvm/bin:/home/james/gopath/bin:/home/james/.fzf/bin''FZF_COMPLETION_TRIGGER=##''TBBROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/tbb''I_MPI_HYDRA_DEBUG=on''PHI=/home/james/phi''PWD=/home/james''JAVA_HOME=/home/james/java/jdk1.8.0_111''EDITOR=vim''HADOOP_CONF_DIR=/home/james/clone/benchmark_spark/dist/hadoop-3.0.0-alpha1/etc/hadoop''KERN=/home/james/phi/modules/linux''LANG=en_CA.UTF-8''NODE_PATH=/home/james/.jsctags/lib/jsctags/:''SELINUX_LEVEL_REQUESTED=''DAALROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/daal''HISTCONTROL=ignoredups''MOD=/home/james/phi/src/mpss/mpss-modules-srpm-3.8.1''C_MNT=/home/james/phi/modules/linux/custom/kvm/centos_root''SHLVL=2''HOME=/home/james''GOROOT=/home/james/golang''I_MPI_DEBUG=6''PYTHONPATH=.:/home/james/python:/home/james/.vim/src/python:/home/james/phi/build/local/lib/python2.7/site-packages:/home/james/phi/src/python''LESS=-R''LOGNAME=james''CLASSPATH=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2017.3.191/linux/daal/lib/daal.jar''SSH_CONNECTION=10.70.2.94 46738 10.70.2.83 22''LC_CTYPE=en_CA.UTF-8''GOPATH=/home/james/gopath''LESSOPEN=||/usr/bin/lesspipe.sh %s''_PHI_MPSS_SRC=/home/james/phi/src/mpss''CMAKE_PREFIX_PATH=/home/james/phi/build:''XDG_RUNTIME_DIR=/run/user/1000''I_MPI_ROOT=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi''_=/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/mpiexec.hydra' --global-user-env 0 --global-system-env 4 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFFERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=bc3e0000-f374-a462-3551-050001dec0a8''DAPL_NETWORK_PROCESS_NUM=2' --proxy-core-count 1 --mpi-cmd-env mpirun -ppn 1 -host 192.168.1.111,192.168.1.222 -np 2 /home/james/helloMPI.XEON  --exec --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/james --exec-args 1 /home/james/helloMPI.XEON

[mpiexec@xen2] Launch arguments: /usr/bin/ssh -x -q 192.168.1.111 sh -c 'export I_MPI_ROOT="/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi" ; export PATH="/opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/:${I_MPI_ROOT}/intel64/bin:${PATH}" ; exec "$0""$@"' pmi_proxy --control-port 192.168.1.222:36282 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 629081922 --usize -2 --proxy-id 0
[mpiexec@xen2] Launch arguments: pmi_proxy --control-port 192.168.1.222:36282 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --enable-mic --i_mpi_base_path /opt/intel/compilers_and_libraries_2017.3.191/linux/mpi/intel64/bin/ --i_mpi_base_arch 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 629081922 --usize -2 --proxy-id 1
[proxy:0:1@xen2] Start PMI_proxy 1
[proxy:0:1@xen2] got pmi command (from 9): init
pmi_version=1 pmi_subversion=1
[proxy:0:1@xen2] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:1@xen2] got pmi command (from 9): get_maxes

[proxy:0:1@xen2] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:1@xen2] got pmi command (from 9): barrier_in

[proxy:0:1@xen2] forwarding command (cmd=barrier_in) upstream
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[proxy:0:0@xen1] Start PMI_proxy 0
[proxy:0:0@xen1] STDIN will be redirected to 1 fd(s): 9
[proxy:0:0@xen1] got pmi command (from 6): init
pmi_version=1 pmi_subversion=1
[proxy:0:0@xen1] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@xen1] got pmi command (from 6): get_maxes

[proxy:0:0@xen1] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] PMI response to fd 11 pid 6: cmd=barrier_out
[mpiexec@xen2] PMI response to fd 8 pid 6: cmd=barrier_out
[proxy:0:0@xen1] got pmi command (from 6): barrier_in

[proxy:0:0@xen1] forwarding command (cmd=barrier_in) upstream
[proxy:0:1@xen2] PMI response: cmd=barrier_out
[proxy:0:1@xen2] got pmi command (from 9): get_ranks2hosts

[proxy:0:1@xen2] PMI response: put_ranks2hosts 42 2
13 192.168.1.111 0, 13 192.168.1.222 1,
[proxy:0:0@xen1] PMI response: cmd=barrier_out
[proxy:0:1@xen2] got pmi command (from 9): get_appnum

[proxy:0:1@xen2] PMI response: cmd=appnum appnum=0
[proxy:0:0@xen1] got pmi command (from 6): get_ranks2hosts

[proxy:0:0@xen1] PMI response: put_ranks2hosts 42 2
13 192.168.1.111 0, 13 192.168.1.222 1,
[proxy:0:1@xen2] got pmi command (from 9): get_my_kvsname

[proxy:0:1@xen2] PMI response: cmd=my_kvsname kvsname=kvs_16060_0
[proxy:0:0@xen1] got pmi command (from 6): get_appnum

[proxy:0:0@xen1] PMI response: cmd=appnum appnum=0
[proxy:0:1@xen2] got pmi command (from 9): get_my_kvsname

[proxy:0:0@xen1] got pmi command (from 6): get_my_kvsname

[proxy:0:1@xen2] PMI response: cmd=my_kvsname kvsname=kvs_16060_0
[proxy:0:0@xen1] PMI response: cmd=my_kvsname kvsname=kvs_16060_0
[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 2  Build 20170125 (id: 16752)
[proxy:0:0@xen1] got pmi command (from 6): get_my_kvsname

[proxy:0:0@xen1] PMI response: cmd=my_kvsname kvsname=kvs_16060_0
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[proxy:0:1@xen2] got pmi command (from 9): barrier_in

[proxy:0:1@xen2] forwarding command (cmd=barrier_in) upstream
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec@xen2] PMI response to fd 11 pid 6: cmd=barrier_out
[mpiexec@xen2] PMI response to fd 8 pid 6: cmd=barrier_out
[proxy:0:0@xen1] got pmi command (from 6): barrier_in

[proxy:0:0@xen1] forwarding command (cmd=barrier_in) upstream
[proxy:0:1@xen2] PMI response: cmd=barrier_out
[proxy:0:0@xen1] PMI response: cmd=barrier_out
[1] MPI startup(): Found 1 IB devices
[1] MPI startup(): Open 0 IB device: mlx4_0
[proxy:0:1@xen2] got pmi command (from 9): put
kvsname=kvs_16060_0 key=OFA_Init_fail value=1
[proxy:0:1@xen2] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:1@xen2] forwarding command (cmd=put kvsname=kvs_16060_0 key=OFA_Init_fail value=1) upstream
[mpiexec@xen2] [pgid: 0] got PMI command: cmd=put kvsname=kvs_16060_0 key=OFA_Init_fail value=1
[1] MPI startup(): ofa fabric is not available and fallback fabric is not enabled

I don't have access to the Intel MPI source code, so unfortunately I cannot pin down exactly where these errors originate.
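
For reference, since the failure happens already inside MPI_Init, the contents of helloMPI.XEON hardly matter; a minimal sketch of such a hello-world test (shown here in Fortran purely as an illustration, not the exact source I am running) is:

program hello_mpi
  use mpi
  implicit none
  integer :: ierr, rank, nprocs
  call MPI_INIT(ierr)                               ! the run aborts here with the errors shown above
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  print *, 'Hello from rank ', rank, ' of ', nprocs
  call MPI_FINALIZE(ierr)
end program hello_mpi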

Let me know if there's any other information I can provide.
Cheers,
James

Thread Topic: 

Help Me

Error in script creating wrappers for PGI 16.9 using Intel MPI

$
0
0

Hello,

I am trying to create the wrappers for the PGI 16.9 compilers for Intel MPI (the 2016.3.210 installation).

I added a comment in a quite similar topic, but it does not seem to be getting updated.

 

I'm working on a node running CentOS 7.2

uname -a
Linux jaws.cluster 3.10.0-327.36.2.el7.x86_64 #1 SMP Mon Oct 10 23:08:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

 

I checked the versions of the software I am using:

# pgc++ --version
pgc++ 16.9-0 64-bit target on x86-64 Linux -tp haswell
The Portland Group - PGI Compilers and Tools
Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.

# mpirun --version
Intel(R) MPI Library for Linux* OS, Version 5.1.3 Build 20160120 (build id: 14053)
Copyright (C) 2003-2016, Intel Corporation. All rights reserved.

 

I followed the instructions in the file named README-intel-mpi-binding-kit.txt:

# cd cxx
# make MPI_INST=/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi CC=pgc++  NAME=pgc++ ARCH=intel64
mkdir -p intel64/lib && ar cr intel64/lib/libmpipgc++.a initcxx.o
gcc  -shared -Xlinker -x -Xlinker -soname=libmpipgc++.so.12 -o intel64/lib/libmpipgc++.so.12.0 initcxx.o
(cd intel64/lib && if [ ! -f libmpipgc++.so.12 ]; then ln -s libmpipgc++.so.12.0 libmpipgc++.so.12; fi)
(cd intel64/lib && if [ ! -f libmpipgc++.so ]; then ln -s libmpipgc++.so.12 libmpipgc++.so; fi)
mkdir -p intel64/bin
sed -e 's/g++//trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgc++/' -e 's/Docompchk\=yes/Docompchk\=no/' -e 's/mpigc$gver/mpipgc++/g' \
-e 's/rpath_opt\=.*/rpath_opt\=/' /trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/bin/mpigxx > intel64/bin/mpipgc++
sed: -e expression #1, char 8: unknown option to `s'
make: *** [makedriver] Error 1
# cd ../f77
# make MPI_INST=/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi CC=pgf77  NAME=pgf77 ARCH=intel64
mkdir -p intel64/bin
sed -e 's/g77//trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf77/' -e 's/Docompchk\=yes/Docompchk\=no/' \
-e 's/rpath_opt\=.*/rpath_opt\=/' /trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/bin/mpif77 > intel64/bin/mpipgf77
sed: -e expression #1, char 8: unknown option to `s'
make: *** [makedriver] Error 1
# cd ../f90
# make MPI_INST=/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi CC=pgf90  NAME=pgf90 ARCH=intel64
/trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf90  -c -fPIC -I/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/include mpi_constants.f90
/trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf90  -c -fPIC -I/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/include mpi_sizeofs.f90
/trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf90  -c -fPIC -I/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/include mpi_base.f90
/trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf90  -c -fPIC -I/trinity/shared/apps/cv-standard/intel/2016.3.210/compilers_and_libraries_2016.3.210/linux/mpi/intel64/include mpi.f90
mkdir -p intel64/lib && ar cr intel64/lib/libmpipgf90.a mpi.o mpi_base.o mpi_sizeofs.o mpi_constants.o
mkdir -p intel64/include/pgf90 && mv *.mod intel64/include/pgf90
/trinity/shared/apps/cv-standard/pgi/linux86-64/16.9/bin/pgf90  -shared -Xlinker -x -Xlinker -soname=libmpipgf90.so -o intel64/lib/libmpipgf90.so mpi.o mpi_base.o mpi_sizeofs.o mpi_constants.o -lrt -ldl
pgf90-Error-Unknown switch: -Xlinker
pgf90-Error-Unknown switch: -x
pgf90-Error-Unknown switch: -Xlinker
make: *** [makemod] Error 1

It works fine in the "c" directory.

Could you please tell me what is wrong with the other three (cxx, f77, and f90)?

Thank you in advance

Regards,

   Guy.

execvp error (Parallel running between 2 nodes)

$
0
0

Hi 

I'm trying to run a simple parallel code across 2 nodes (VS 2013 and the Intel cluster tools 2017). I have successfully run it in parallel across the cores within node 0. I created a host.txt file and then copied the main folder to the desktop of node 1; both folders are shared on their respective nodes. But when I run it across the 2 nodes, the error in the attached figure appears. The command I use is:

mpiexec -n 2 -ppn 1 -f host.txt c.exe

so that one process runs on node 0 and the other on node 1.

I attached the host file and a figure of the error. Thanks.

 

Attachments:
host.txt (text/plain, 24 bytes)
error.jpg (image/jpeg, 52.97 KB)


MPI_File_read_all MPI_File_write_all local size limit

$
0
0

Dear Intel support team,

I have a problem with the MPI_File_read_all and MPI_File_write_all subroutines. I have a Fortran code that needs to read a large binary file (~2 TB). The file contains a few 2D matrices; the largest matrix is ~0.5 TB. I read this file using the MPI I/O subroutines, roughly like this:

          call MPI_TYPE_CREATE_SUBARRAY(2,dim,loc_sizes,loc_starts,MPI_ORDER_FORTRAN,MPI_DOUBLE_PRECISION,my_subarray,ierr)
          call MPI_Type_commit(my_subarray,ierr)
          call MPI_File_set_view(filehandle, disp,MPI_DOUBLE_PRECISION,my_subarray, &
                         "native",MPI_INFO_NULL, ierr)

          call MPI_File_read_all(filehandle, float2d, loc_sizes(1)*loc_sizes(2),MPI_DOUBLE_PRECISION,status, ierr)

The problem occurs in the MPI_File_read_all call. The number of elements in each local submatrix, loc_sizes(1)*loc_sizes(2), multiplied by the element size (8 bytes for double precision), cannot be larger than the maximum 32-bit integer value, 2147483647 (~2 GB). In my case each submatrix is more than 10-20 GB. I tried using integer*8 instead of integer*4, but that did not help; I think the MPI subroutine converts it back to integer*4 internally. Is there any solution to this problem, as was done for example for MPI_File_set_view, where the displacement argument was changed from a plain integer to INTEGER(KIND=MPI_OFFSET_KIND), INTENT(IN) :: disp? The program works fine if the submatrix size is smaller than 2147483647 bytes.

Here is the error message that I got:

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
libifcore.so.5     00002ADA8C450876  for__signal_handl     Unknown  Unknown
libc-2.17.so       00002ADA928C8670  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91AAEB06  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91AAF780  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91AA3039  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91AA49E4  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91727370  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA919A1C00  Unknown               Unknown  Unknown
libmpi.so.12.0     00002ADA91971B90  Unknown               Unknown  Unknown
libmpi.so.12       00002ADA9193EFF8  MPI_Isend             Unknown  Unknown
libmpi.so.12.0     00002ADA91695A61  Unknown               Unknown  Unknown
libmpi.so.12       00002ADA916943B8  ADIOI_GEN_ReadStr     Unknown  Unknown
libmpi.so.12       00002ADA91A6DDF5  PMPI_File_read_al     Unknown  Unknown
libmpifort.so.12.  00002ADA912AB4CB  mpi_file_read_all     Unknown  Unknown
jorek_model199     000000000044E747  vacuum_response_m         519  vacuum_response.f90
jorek_model199     000000000044B770  vacuum_response_m         986  vacuum_response.f90
jorek_model199     000000000044A6F4  vacuum_response_m          90  vacuum_response.f90
jorek_model199     000000000041134E  MAIN__                    486  jorek2_main.f90
jorek_model199     000000000040C95E  Unknown               Unknown  Unknown
libc-2.17.so       00002ADA928B4B15  __libc_start_main     Unknown  Unknown
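
One workaround I have been considering, in case it is relevant: keep the same subarray file view, but pass a derived "column" datatype to MPI_File_read_all so that the count argument stays far below 2147483647 even when the local block is tens of GB. This is only a sketch and I have not verified it against this Intel MPI version:

          ! coltype is an additional integer handle declared next to my_subarray
          ! one column of the local block = loc_sizes(1) double precision values
          call MPI_TYPE_CONTIGUOUS(loc_sizes(1), MPI_DOUBLE_PRECISION, coltype, ierr)
          call MPI_Type_commit(coltype, ierr)
          ! same MPI_TYPE_CREATE_SUBARRAY / MPI_File_set_view as before, then
          ! read loc_sizes(2) columns instead of loc_sizes(1)*loc_sizes(2) scalars
          call MPI_File_read_all(filehandle, float2d, loc_sizes(2), coltype, status, ierr)
          call MPI_TYPE_FREE(coltype, ierr)

With this the count passed to MPI_File_read_all is only loc_sizes(2), but I do not know whether the library still overflows a 32-bit byte count internally, which is why I am asking here.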

 

Thank you in advance,

Mochalskyy Serhiy

 

Thread Topic: 

Bug Report

Intel MPI unable to use 'ofa' fabric with Mellanox OFED on ConnectX-4 Lx EN ethernet cards

$
0
0

I have a system where I am unable to get Intel MPI to use the 'ofa' fabric with Mellanox OFED over ConnectX-4 Lx EN ethernet cards, and I have exhausted every approach I know of. I'd appreciate any input that would help me get this working.

Relevant info:

  • Operating System is CentOS 7.3
  • Intel(R) MPI Library for Linux* OS, Version 2017 Update 3 Build 20170405 (id: 17193)
  • NIC is a single Mellanox ConnectX-4 Lx EN ethernet-only card (RoCE v1 and v2 supported) with 2x25Gb ports
    •  lspci | grep Mell
      05:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
      05:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
    •  ibv_devinfo
      hca_id:	mlx5_1
      	transport:			InfiniBand (0)
      	fw_ver:				14.18.2000
      	node_guid:			248a:0703:008d:6237
      	sys_image_guid:			248a:0703:008d:6236
      	vendor_id:			0x02c9
      	vendor_part_id:			4117
      	hw_ver:				0x0
      	board_id:			MT_2420110034
      	phys_port_cnt:			1
      	Device ports:
      		port:	1
      			state:			PORT_ACTIVE (4)
      			max_mtu:		4096 (5)
      			active_mtu:		1024 (3)
      			sm_lid:			0
      			port_lid:		0
      			port_lmc:		0x00
      			link_layer:		Ethernet
      
      hca_id:	mlx5_0
      	transport:			InfiniBand (0)
      	fw_ver:				14.18.2000
      	node_guid:			248a:0703:008d:6236
      	sys_image_guid:			248a:0703:008d:6236
      	vendor_id:			0x02c9
      	vendor_part_id:			4117
      	hw_ver:				0x0
      	board_id:			MT_2420110034
      	phys_port_cnt:			1
      	Device ports:
      		port:	1
      			state:			PORT_ACTIVE (4)
      			max_mtu:		4096 (5)
      			active_mtu:		1024 (3)
      			sm_lid:			0
      			port_lid:		0
      			port_lmc:		0x00
      			link_layer:		Ethernet
  • Mellanox OFED driver, output of 'ofed_info' is:
    • MLNX_OFED_LINUX-4.0-2.0.0.1 (OFED-4.0-2.0.0):
    • Verified RDMA is working via perftest utilities, e.g. ib_write_bw, ib_write_lat, etc.
  • Specified that the 'ofa' fabric should be used via the environment variable I_MPI_FABRICS=shm:ofa
    • Verified that the 'tcp' and 'dapl' fabrics work (I had to manually add RoCE entries to dat.conf to get dapl working)
    • $ env | grep I_MPI
      I_MPI_FABRICS=shm:ofa
      I_MPI_HYDRA_DEBUG=on
      I_MPI_DEBUG=6
      I_MPI_ROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mpi
  • Output when I try to run my code with 'ofa' specified as the fabric for Intel MPI:
    • mpirun -n 1 ./server
      host: sp1.muskrat.local
      
      ==================================================================================================
      mpiexec options:
      ----------------
        Base path: /opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin/
        Launcher: ssh
        Debug level: 1
        Enable X: -1
      
        Global environment:
        -------------------
          I_MPI_PERHOST=allcores
          LD_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/intel64/gcc4.1:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/lib/intel64:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib
          MKLROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mkl
          MANPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man
          I_MPI_DEBUG_HYDRA=0
          XDG_SESSION_ID=216
          HOSTNAME=sp1.muskrat.local
          SELINUX_ROLE_REQUESTED=
          IPPROOT=/opt/intel/psxe_runtime_2017.4.196/linux/ipp
          SHELL=/bin/bash
          TERM=xterm-256color
          HISTSIZE=1000
          I_MPI_FABRICS=shm:ofa
          SSH_CLIENT=192.168.1.2 33072 22
          LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/intel64/gcc4.1:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/lib/intel64:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin
          SELINUX_USE_CURRENT_RANGE=
          SSH_TTY=/dev/pts/1
          MIC_LD_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/mic:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic
          USER=jrhemst
          LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:
          I_MPI_MPIRUN=mpirun
          MIC_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/mic
          CPATH=/opt/intel/psxe_runtime_2017.4.196/linux/daal/include:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/include:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/include:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/include:
          NLSPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin/locale/%l_%t/%N:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin/locale/%l_%t/%N:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin/locale/%l_%t/%N
          PATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux/bin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/opt/puppetlabs/bin:/home/jrhemst/.local/bin:/home/jrhemst/bin
          MAIL=/var/spool/mail/jrhemst
          TBBROOT=/opt/intel/psxe_runtime_2017.4.196/linux/tbb
          PWD=/home/jrhemst/mpi_test
          I_MPI_HYDRA_DEBUG=on
          XMODIFIERS=@im=none
          EDITOR=vim
          LANG=en_US.UTF-8
          MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
          LOADEDMODULES=
          SELINUX_LEVEL_REQUESTED=
          DAALROOT=/opt/intel/psxe_runtime_2017.4.196/linux/daal
          HISTCONTROL=ignoredups
          HOME=/home/jrhemst
          SHLVL=2
          I_MPI_DEBUG=6
          LOGNAME=jrhemst
          SSH_CONNECTION=192.168.1.2 33072 192.168.1.113 22
          CLASSPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar:/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/daal.jar:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar
          MODULESHOME=/usr/share/Modules
          LESSOPEN=||/usr/bin/lesspipe.sh %s
          XDG_RUNTIME_DIR=/run/user/1338400006
          I_MPI_ROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mpi
          BASH_FUNC_module()=() {  eval `/usr/bin/modulecmd bash $*`
      }
          _=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin/mpiexec.hydra
      
        Hydra internal environment:
        ---------------------------
          MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1
          GFORTRAN_UNBUFFERED_PRECONNECTED=y
          I_MPI_HYDRA_UUID=7e590000-3bc9-185f-6852-05000171c0a8
          DAPL_NETWORK_PROCESS_NUM=1
      
        Intel(R) MPI Library specific variables:
        ----------------------------------------
          I_MPI_PERHOST=allcores
          I_MPI_DEBUG_HYDRA=0
          I_MPI_FABRICS=shm:ofa
          I_MPI_MPIRUN=mpirun
          I_MPI_HYDRA_DEBUG=on
          I_MPI_DEBUG=6
          I_MPI_ROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mpi
          I_MPI_HYDRA_UUID=7e590000-3bc9-185f-6852-05000171c0a8
      
      
          Proxy information:
          *********************
            [1] proxy: sp1.muskrat.local (16 cores)
            Exec list: ./server (1 processes);
      
      
      ==================================================================================================
      
      [mpiexec@sp1.muskrat.local] Timeout set to -1 (-1 means infinite)
      [mpiexec@sp1.muskrat.local] Got a control port string of sp1.muskrat.local:33538
      
      Proxy launch args: /opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin/pmi_proxy --control-port sp1.muskrat.local:33538 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 1079489000 --usize -2 --proxy-id
      
      Arguments being passed to proxy 0:
      --version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname sp1.muskrat.local --global-core-map 0,16,16 --pmi-id-map 0,0 --global-process-count 1 --auto-cleanup 1 --pmi-kvsname kvs_22910_0 --pmi-process-mapping (vector,(0,1,16)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpoint off --ckpoint-num -1 --global-inherited-env 49 'I_MPI_PERHOST=allcores''LD_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/intel64/gcc4.1:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/lib/intel64:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib''MKLROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mkl''MANPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/man:/usr/local/share/man:/usr/share/man:/opt/ibutils/share/man''I_MPI_DEBUG_HYDRA=0''XDG_SESSION_ID=216''HOSTNAME=sp1.muskrat.local''SELINUX_ROLE_REQUESTED=''IPPROOT=/opt/intel/psxe_runtime_2017.4.196/linux/ipp''SHELL=/bin/bash''TERM=xterm-256color''HISTSIZE=1000''I_MPI_FABRICS=shm:ofa''SSH_CLIENT=192.168.1.2 33072 
22''LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/intel64/gcc4.1:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/lib/intel64:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin''SELINUX_USE_CURRENT_RANGE=''SSH_TTY=/dev/pts/1''MIC_LD_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/mic:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/mic/lib:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic''USER=jrhemst''LS_COLORS=rs=0:di=38;5;27:ln=38;5;51:mh=44;38;5;15:pi=40;38;5;11:so=38;5;13:do=38;5;5:bd=48;5;232;38;5;11:cd=48;5;232;38;5;3:or=48;5;232;38;5;9:mi=05;48;5;232;38;5;15:su=48;5;196;38;5;15:sg=48;5;11;38;5;16:ca=48;5;196;38;5;226:tw=48;5;10;38;5;16:ow=48;5;10;38;5;21:st=48;5;21;38;5;15:ex=38;5;34:*.tar=38;5;9:*.tgz=38;5;9:*.arc=38;5;9:*.arj=38;5;9:*.taz=38;5;9:*.lha=38;5;9:*.lz4=38;5;9:*.lzh=38;5;9:*.lzma=38;5;9:*.tlz=38;5;9:*.txz=38;5;9:*.tzo=38;5;9:*.t7z=38;5;9:*.zip=38;5;9:*.z=38;5;9:*.Z=38;5;9:*.dz=38;5;9:*.gz=38;5;9:*.lrz=38;5;9:*.lz=38;5;9:*.lzo=38;5;9:*.xz=38;5;9:*.bz2=38;5;9:*.bz=38;5;9:*.tbz=38;5;9:*.tbz2=38;5;9:*.tz=38;5;9:*.deb=38;5;9:*.rpm=38;5;9:*.jar=38;5;9:*.war=38;5;9:*.ear=38;5;9:*.sar=38;5;9:*.rar=38;5;9:*.alz=38;5;9:*.ace=38;5;9:*.zoo=38;5;9:*.cpio=38;5;9:*.7z=38;5;9:*.rz=38;5;9:*.cab=38;5;9:*.jpg=38;5;13:*.jpeg=38;5;13:*.gif=38;5;13:*.bmp=38;5;13:*.pbm=38;5;13:*.pgm=38;5;13:*.ppm=38;5;13:*.tga=38;5;13:*.xbm=38;5;13:*.xpm=38;5;13:*.tif=38;5;13:*.tiff=38;5;13:*.png=38;5;13:*.svg=38;5;13:*.svgz=38;5;13:*.mng=38;5;13:*.pcx=38;5;13:*.mov=38;5;13:*.mpg=38;5;13:*.mpeg=38;5;13:*.m2v=38;5;13:*.mkv=38;5;13:*.webm=38;5;13:*.ogm=38;5;13:*.mp4=38;5;13:*.m4v=38;5;13:*.mp4v=38;5;13:*.vob=38;5;13:*.qt=38;5;13:*.nuv=38;5;13:*.wmv=38;5;13:*.asf=38;5;13:*.rm=38;5;13:*.rmvb=38;5;13:*.flc=38;5;13:*.avi=38;5;13:*.fli=38;5;13:*.flv=38;5;13:*.gl=38;5;13:*.dl=38;5;13:*.xcf=38;5;13:*.xwd=38;5;13:*.yuv=38;5;13:*.cgm=38;5;13:*.emf=38;5;13:*.axv=38;5;13:*.anx=38;5;13:*.ogv=38;5;13:*.ogx=38;5;13:*.aac=38;5;45:*.au=38;5;45:*.flac=38;5;45:*.mid=38;5;45:*.midi=38;5;45:*.mka=38;5;45:*.mp3=38;5;45:*.mpc=38;5;45:*.ogg=38;5;45:*.ra=38;5;45:*.wav=38;5;45:*.axa=38;5;45:*.oga=38;5;45:*.spx=38;5;45:*.xspf=38;5;45:''I_MPI_MPIRUN=mpirun''MIC_LIBRARY_PATH=/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin_mic:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/lib/mic''CPATH=/opt/intel/psxe_runtime_2017.4.196/linux/daal/include:/opt/intel/psxe_runtime_2017.4.196/linux/mkl/include:/opt/intel/psxe_runtime_2017.4.196/linux/tbb/include:/opt/intel/psxe_runtime_2017.4.196/linux/ipp/include:''NLSPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mkl/lib/intel64_lin/locale/%l_%t/%N:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin/locale/%l_%t/%N:/opt/intel/psxe_runtime_2017.4.196/linux/compiler/lib/intel64_lin/locale/%l_%t/%N''PATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux
/bin:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin:/opt/intel/psxe_runtime_2017.4.196/linux/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/opt/puppetlabs/bin:/home/jrhemst/.local/bin:/home/jrhemst/bin''MAIL=/var/spool/mail/jrhemst''TBBROOT=/opt/intel/psxe_runtime_2017.4.196/linux/tbb''PWD=/home/jrhemst/mpi_test''I_MPI_HYDRA_DEBUG=on''XMODIFIERS=@im=none''EDITOR=vim''LANG=en_US.UTF-8''MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles''LOADEDMODULES=''SELINUX_LEVEL_REQUESTED=''DAALROOT=/opt/intel/psxe_runtime_2017.4.196/linux/daal''HISTCONTROL=ignoredups''HOME=/home/jrhemst''SHLVL=2''I_MPI_DEBUG=6''LOGNAME=jrhemst''SSH_CONNECTION=192.168.1.2 33072 192.168.1.113 22''CLASSPATH=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar:/opt/intel/psxe_runtime_2017.4.196/linux/daal/lib/daal.jar:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar:/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/lib/mpi.jar''MODULESHOME=/usr/share/Modules''LESSOPEN=||/usr/bin/lesspipe.sh %s''XDG_RUNTIME_DIR=/run/user/1338400006''I_MPI_ROOT=/opt/intel/psxe_runtime_2017.4.196/linux/mpi''BASH_FUNC_module()=() {  eval `/usr/bin/modulecmd bash $*`
      }''_=/opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin/mpiexec.hydra' --global-user-env 0 --global-system-env 4 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFFERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=7e590000-3bc9-185f-6852-05000171c0a8''DAPL_NETWORK_PROCESS_NUM=1' --proxy-core-count 16 --mpi-cmd-env mpirun -n 1 ./server  --exec --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/jrhemst/mpi_test --exec-args 1 ./server
      
      [mpiexec@sp1.muskrat.local] Launch arguments: /opt/intel/psxe_runtime_2017.4.196/linux/mpi/intel64/bin/pmi_proxy --control-port sp1.muskrat.local:33538 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --rmk user --launcher ssh --demux poll --pgid 0 --enable-stdin 1 --retries 10 --control-code 1079489000 --usize -2 --proxy-id 0
      [proxy:0:0@sp1.muskrat.local] Start PMI_proxy 0
      [proxy:0:0@sp1.muskrat.local] STDIN will be redirected to 1 fd(s): 17
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): init
      pmi_version=1 pmi_subversion=1
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): get_maxes
      
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): barrier_in
      
      [proxy:0:0@sp1.muskrat.local] forwarding command (cmd=barrier_in) upstream
      [mpiexec@sp1.muskrat.local] [pgid: 0] got PMI command: cmd=barrier_in
      [mpiexec@sp1.muskrat.local] PMI response to fd 8 pid 12: cmd=barrier_out
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=barrier_out
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): get_ranks2hosts
      
      [proxy:0:0@sp1.muskrat.local] PMI response: put_ranks2hosts 26 1
      17 sp1.muskrat.local 0,
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): get_appnum
      
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=appnum appnum=0
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): get_my_kvsname
      
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=my_kvsname kvsname=kvs_22910_0
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): get_my_kvsname
      
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=my_kvsname kvsname=kvs_22910_0
      [0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 3  Build 20170405 (id: 17193)
      [0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
      [0] MPI startup(): Multi-threaded optimized library
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): barrier_in
      
      [proxy:0:0@sp1.muskrat.local] forwarding command (cmd=barrier_in) upstream
      [mpiexec@sp1.muskrat.local] [pgid: 0] got PMI command: cmd=barrier_in
      [mpiexec@sp1.muskrat.local] PMI response to fd 8 pid 12: cmd=barrier_out
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=barrier_out
      [0] MPI startup(): Found 2 IB devices
      [0] MPI startup(): Open 0 IB device: mlx5_1
      [0] MPI startup(): Open 1 IB device: mlx5_0
      [proxy:0:0@sp1.muskrat.local] got pmi command (from 12): put
      kvsname=kvs_22910_0 key=OFA_Init_fail value=1
      [proxy:0:0@sp1.muskrat.local] PMI response: cmd=put_result rc=0 msg=success
      [proxy:0:0@sp1.muskrat.local] forwarding command (cmd=put kvsname=kvs_22910_0 key=OFA_Init_fail value=1) upstream
      [mpiexec@sp1.muskrat.local] [pgid: 0] got PMI command: cmd=put kvsname=kvs_22910_0 key=OFA_Init_fail value=1
      [0] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
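Since the log shows both IB devices being opened and then OFA_Init_fail, here is a hedged sketch of the OFA-related variables documented in the Intel MPI reference that look relevant; whether they apply to a RoCE-only setup like this is exactly what I am unsure about (mlx5_0 is taken from the ibv_devinfo output above, and fallback is enabled only so the job does not abort while debugging):

  # hedged sketch only, not a working configuration
  export I_MPI_FABRICS=shm:ofa
  export I_MPI_OFA_ADAPTER_NAME=mlx5_0   # pin the ofa fabric to one adapter
  export I_MPI_FALLBACK=1                # fall back (e.g. to tcp) instead of aborting
  mpirun -n 1 ./server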
      

       

Thread Topic: 

Help Me