Channel: Clusters and HPC Technology

problem with intel mpi 2019


When I compile a test program with the latest beta of Intel MPI 2019, I receive the error below. Has anybody seen the same problem?

$ mpiicc -o test test.c
ld: warning: libfabric.so.1, needed by /common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so, not found (try using -rpath or -rpath-link)
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_strerror@FABRIC_1.0'
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_tostr@FABRIC_1.0'
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_fabric@FABRIC_1.1'
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_dupinfo@FABRIC_1.1'
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_getinfo@FABRIC_1.1'
/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/lib/release/libmpi.so: undefined reference to `fi_freeinfo@FABRIC_1.1'

$ type mpiicc
mpiicc is /common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/bin/mpiicc
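
A workaround I am experimenting with is to point the linker and loader at the libfabric that ships with the 2019 beta (a sketch; the intel64/libfabric/lib location under the install tree is my assumption):

# Assumed location of the libfabric bundled with the 2019 beta (adjust to the real install layout):
FABRIC_LIB=/common/intel/compilers_and_libraries_2018.1.163/linux/mpi_2019/intel64/libfabric/lib
export LD_LIBRARY_PATH=$FABRIC_LIB:$LD_LIBRARY_PATH       # ld also consults this when resolving libmpi.so's dependencies
mpiicc -o test test.c -Wl,-rpath-link,$FABRIC_LIB         # or pass the directory explicitly, as the warning suggests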

$ cat test.c
/*
    Copyright 2003-2017 Intel Corporation.  All Rights Reserved.

    The source code contained or described herein and all documents
    related to the source code ("Material") are owned by Intel Corporation
    or its suppliers or licensors.  Title to the Material remains with
    Intel Corporation or its suppliers and licensors.  The Material is
    protected by worldwide copyright and trade secret laws and treaty
    provisions.  No part of the Material may be used, copied, reproduced,
    modified, published, uploaded, posted, transmitted, distributed, or
    disclosed in any way without Intel's prior express written permission.

    No license under any patent, copyright, trade secret or other
    intellectual property right is granted to or conferred upon you by
    disclosure or delivery of the Materials, either expressly, by
    implication, inducement, estoppel or otherwise.  Any license under
    such intellectual property rights must be express and approved by
    Intel in writing.
*/
#include "mpi.h"
#include <stdio.h>
#include <string.h>

int
main (int argc, char *argv[])
{
    int i, rank, size, namelen;
    char name[MPI_MAX_PROCESSOR_NAME];
    MPI_Status stat;

    MPI_Init (&argc, &argv);

    MPI_Comm_size (MPI_COMM_WORLD, &size);
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name (name, &namelen);

    if (rank == 0) {

        printf ("Hello world: rank %d of %d running on %s\n", rank, size, name);

        for (i = 1; i < size; i++) {
            MPI_Recv (&rank, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
            MPI_Recv (&size, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
            MPI_Recv (&namelen, 1, MPI_INT, i, 1, MPI_COMM_WORLD, &stat);
            MPI_Recv (name, namelen + 1, MPI_CHAR, i, 1, MPI_COMM_WORLD, &stat);
            printf ("Hello world: rank %d of %d running on %s\n", rank, size, name);
        }

    } else {

        MPI_Send (&rank, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
        MPI_Send (&size, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
        MPI_Send (&namelen, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
        MPI_Send (name, namelen + 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD);

    }

    MPI_Finalize ();

    return (0);
}

 

 


How to get the exit code from mpiexec.hydra


When running a workload on multiple nodes with mpiexec.hydra, the entire run aborts when even one node fails or shuts down. I want to detect whether the failure is due to a node disconnection or something else. Trying to print the exit codes with "-print-all-exitcodes" doesn't seem to work.
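
What I have working so far is only the aggregate exit status of the launcher itself, checked from a wrapper script (a minimal bash sketch; the hostfile and application names are placeholders):

mpiexec.hydra -print-all-exitcodes -f hosts.txt -n 64 ./my_app
rc=$?
# A non-zero status covers node loss as well as application failure,
# so on its own it cannot tell the two apart.
echo "mpiexec.hydra exited with code $rc"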

Is there any other option?

Python MPI4PY ISSUE


Hi All,

I'm an HPC admin. I installed the mpi4py library (version 3.0) on our clusters from the .tar source and with pip2.7 (Python 2.7). Since then, a 256-core job (n=4, ppn=64) no longer runs across nodes, although plain Python code still runs. Users are also unable to run jobs on the cluster (VASP, mpi4py, MPI/Open MPI applications, etc.).

The error is given below:

[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error

[mpiexec@tyrone-node16] HYD_pmcd_pmiserv_send_signal (./pm/pmiserv/pmiserv_cb.c:184): assert (!closed) failed
[mpiexec@tyrone-node16] ui_cmd_cb (./pm/pmiserv/pmiserv_pmci.c:74): unable to send SIGUSR1 downstream
[mpiexec@tyrone-node16] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@tyrone-node16] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@tyrone-node16] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completions
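
One check I can run from a login node is to confirm which MPI library mpi4py was actually built against before retrying a small job (a sketch; MPI.get_vendor() comes from mpi4py, the host names are placeholders):

python2.7 -c "from mpi4py import MPI; print(MPI.get_vendor())"    # which MPI mpi4py was compiled against
mpirun -n 2 -hosts node01,node02 python2.7 -c "from mpi4py import MPI; print(MPI.COMM_WORLD.Get_rank())"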

 

 

 

Kindly help me.

Thanks in advance!

Rahul Akolkar

OPA driver for Skylake running Ubuntu 16


Hi

Can you please point me to the BKMs for getting the Intel Omni-Path Fabric (OPA) driver installed and set up on Ubuntu 16? I have a Skylake processor (Gold 6148F CPU @ 2.40GHz).

Thanks,

Dave 

ITAC error during trace collection


Hi all,

I am trying to collect tracing info for my Intel MPI job. Even for a relatively small number of processes (around 300), the run either hangs or I receive the following error message:

UCM connect: REQ RETRIES EXHAUSTED: 0x570 32c43 0xed -> 0x544 3f3a4 0xbbdd

How can I debug this error? 
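
For reference, this is roughly how I launch the trace collection, with the debug knobs I know of turned up (a sketch; the executable name and rank count are placeholders):

export I_MPI_DEBUG=5      # report fabric/provider and pinning decisions per rank
export VT_VERBOSE=3       # make the trace collector describe what it is doing
mpirun -trace -n 300 ./my_app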

Best,

Igor

Using Intel Trace analyzer with windows and VS2015


Hi,

I've been directed here from the Intel® Visual Fortran Compiler for Windows forum.

I'm trying to use Intel Trace analyzer with windows and VS2015 on my CFD code.

In my code, I have many F90 files but different modules and subroutines. Also, I'm linking with other libraries such as PETSc.

I tried adding "/trace" to the additional command-line options in the VS2015 GUI and compiling. However, after running my code, no *.stf file is generated.

I then tried the same thing in Cygwin, on another, smaller code, by adding -trace when compiling and linking. Similarly, no *.stf is generated.

I also tried to compile and link directly at the command prompt:

mpiifort /c /MT /Z7 /fpp /Ic:\wtay\Lib\petsc-3.8.3_win64_impi_vs2015_debug\include /trace   /o ex2f.obj ex2f.F

mpiifort /MT /trace /o ex2f.exe ex2f.obj /INCREMENTAL:NO /NOLOGO /qnoipo /LIBPATH:"C:\wtay\Lib\petsc-3.8.3_win64_impi_vs2015_debug\lib" Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib impi.lib impid.lib impicxx.lib impicxxd.lib libpetsc.lib libflapack.lib libfblas.lib kernel32.lib

Now I can get the *.stf file.

But is there some way to do it in VS2015? As mentioned, I have many F90 files and I hope I do not have to use the command line.

Thanks.

Pinning processes to specific cores?


I'm wondering if Intel MPI has the facility to allow me to pin processes, not only to specific nodes within my cluster, but to specific cores within those nodes. With Open MPI, I can set up a rankfile that will give me this fine-grained capability, that is, I can assign each MPI process rank to a specific node and a given core on that node (logical or physical).
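
For illustration, the kind of Open MPI rankfile I have in mind looks roughly like this (a sketch; host names, slots and the application are made up):

# Open MPI rankfile: pin rank 0 to core 0 of node01 and rank 1 to core 4 of node02.
cat > rankfile <<'EOF'
rank 0=node01 slot=0
rank 1=node02 slot=4
EOF
mpirun -np 2 --rankfile rankfile ./my_app     # Open MPI invocation, shown for comparison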

Granted, the rankfile idea from Open MPI is merely theoretical since the OS I have on the machines doesn't seem to abide by the assignments I make, but at least the possibility is there. I haven't found that level of control within the Intel MPI documentation. Any pointers or is this all just a pipe dream on my part?

Cross Platform MPI start failed


Hi Intel Engineers,

I met some problems when setting up a cross-platform MPI environment.

Following the Intel MPI 2018 developer guides for Linux and Windows, two machines were set up: one runs CentOS and the other runs Windows Server 2000.

The CentOS machine is 'mpihost1' and the Windows machine is 'iriphost1'.

SSH is configured correctly; 'ssh root@mpihost1' connects from Windows to Linux successfully.

However, when using the command 'mpiexec -d -bootstrap ssh -hostos linux -host mpihost1 -n 1 hostname', the error 'bash: pmi_proxy: command not found' occurred.

Are there any suggestions?

Thanks

zhongqi

 

 

Here is the debug info:

C:\Windows\system32>mpiexec -d -bootstrap ssh -hostos linux -host mpihost1 -n 1 hostname
host: mpihost1

==================================================================================================
mpiexec options:
----------------
  Base path: C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi\intel64\b
in\
  Launcher: ssh
  Debug level: 1
  Enable X: -1

  Global environment:
  -------------------
    ALLUSERSPROFILE=C:\ProgramData
    APPDATA=C:\Users\root\AppData\Roaming
    CLIENTNAME=D1301002443
    CommonProgramFiles=C:\Program Files\Common Files
    CommonProgramFiles(x86)=C:\Program Files (x86)\Common Files
    CommonProgramW6432=C:\Program Files\Common Files
    COMPUTERNAME=IRIPHOST1
    ComSpec=C:\Windows\system32\cmd.exe
    CYGWIN=tty
    FP_NO_HOST_CHECK=NO
    HOME=F:\zzq\home\
    HOMEDRIVE=C:
    HOMEPATH=\Users\root
    INTEL_LICENSE_FILE=C:\Program Files (x86)\Common Files\Intel\Licenses
    I_MPI_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi
    LOCALAPPDATA=C:\Users\root\AppData\Local
    LOGONSERVER=\\IRIPHOST1
    NUMBER_OF_PROCESSORS=4
    OS=Windows_NT
    Path=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi\intel64\bin;C
:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;E:\lxx
\scs;C:\Program Files\MySQL\MySQL Server 5.7\bin;C:\Program Files (x86)\Git\cmd;C:\Program Files (x86)\Gi
tExtensions\;F:\zzq\mpi\MinGW\msys\1.0\bin
    PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
    PROCESSOR_ARCHITECTURE=AMD64
    PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
    PROCESSOR_LEVEL=6
    PROCESSOR_REVISION=2a07
    ProgramData=C:\ProgramData
    ProgramFiles=C:\Program Files
    ProgramFiles(x86)=C:\Program Files (x86)
    ProgramW6432=C:\Program Files
    PROMPT=$P$G
    PSModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\
    PUBLIC=C:\Users\Public
    SESSIONNAME=RDP-Tcp#0
    SystemDrive=C:
    SystemRoot=C:\Windows
    TEMP=C:\Users\root\AppData\Local\Temp\2
    TMP=C:\Users\root\AppData\Local\Temp\2
    USERDOMAIN=IRIPHOST1
    USERNAME=root
    USERPROFILE=C:\Users\root
    windir=C:\Windows

  Hydra internal environment:
  ---------------------------
    MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1
    GFORTRAN_UNBUFFERED_PRECONNECTED=y
    I_MPI_HYDRA_UUID=af00a0000-d366f2d3-34f29834-8a985d8a-

  Intel(R) MPI Library specific variables:
  ----------------------------------------
    I_MPI_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi
    I_MPI_HYDRA_UUID=af00a0000-d366f2d3-34f29834-8a985d8a-

    Proxy information:
    *********************
      [1] proxy: mpihost1 (1 cores)
      Exec list: hostname (1 processes);

==================================================================================================

[mpiexec@iriphost1] Timeout set to -1 (-1 means infinite)
[mpiexec@iriphost1] Got a control port string of iriphost1:60519

Proxy launch args: C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi\int
el64\bin\pmi_proxy --control-port iriphost1:60519 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --r
mk user --launcher ssh --demux select --pgid 0 --enable-stdin 1 --retries 10 --control-code 9182 --usize
-2 --proxy-id

Arguments being passed to proxy 0:
--version 3.2 --iface-ip-env-name MPIR_CVAR_CH3_INTERFACE_HOSTNAME --hostname mpihost1 --global-core-map
0,1,1 --pmi-id-map 0,0 --global-process-count 1 --auto-cleanup 1 --pmi-kvsname kvs_2800_0 --pmi-process-m
apping (vector,(0,1,1)) --topolib ipl --ckpointlib blcr --ckpoint-prefix /tmp --ckpoint-preserve 1 --ckpo
int off --ckpoint-num -1 --global-inherited-env 41 'ALLUSERSPROFILE=C:\ProgramData''APPDATA=C:\Users\roo
t\AppData\Roaming''CLIENTNAME=D1301002443''CommonProgramFiles=C:\Program Files\Common Files''CommonPro
gramFiles(x86)=C:\Program Files (x86)\Common Files''CommonProgramW6432=C:\Program Files\Common Files''C
OMPUTERNAME=IRIPHOST1''ComSpec=C:\Windows\system32\cmd.exe''CYGWIN=tty''FP_NO_HOST_CHECK=NO''HOME=F:\
zzq\home\''HOMEDRIVE=C:''HOMEPATH=\Users\root''INTEL_LICENSE_FILE=C:\Program Files (x86)\Common Files\
Intel\Licenses''I_MPI_ROOT=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\window
s\mpi''LOCALAPPDATA=C:\Users\root\AppData\Local''LOGONSERVER=\\IRIPHOST1''NUMBER_OF_PROCESSORS=4''OS=
Windows_NT''Path=C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi\inte
l64\bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.
0\;E:\lxx\scs;C:\Program Files\MySQL\MySQL Server 5.7\bin;C:\Program Files (x86)\Git\cmd;C:\Program Files
 (x86)\GitExtensions\;F:\zzq\mpi\MinGW\msys\1.0\bin''PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF
;.WSH;.MSC''PROCESSOR_ARCHITECTURE=AMD64''PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 42 Stepping 7, Ge
nuineIntel''PROCESSOR_LEVEL=6''PROCESSOR_REVISION=2a07''ProgramData=C:\ProgramData''ProgramFiles=C:\P
rogram Files''ProgramFiles(x86)=C:\Program Files (x86)''ProgramW6432=C:\Program Files''PROMPT=$P$G''P
SModulePath=C:\Windows\system32\WindowsPowerShell\v1.0\Modules\''PUBLIC=C:\Users\Public''SESSIONNAME=RD
P-Tcp#0''SystemDrive=C:''SystemRoot=C:\Windows''TEMP=C:\Users\root\AppData\Local\Temp\2''TMP=C:\Users
\root\AppData\Local\Temp\2''USERDOMAIN=IRIPHOST1''USERNAME=root''USERPROFILE=C:\Users\root''windir=C:
\Windows' --global-user-env 0 --global-system-env 3 'MPIR_CVAR_NEMESIS_ENABLE_CKPOINT=1''GFORTRAN_UNBUFF
ERED_PRECONNECTED=y''I_MPI_HYDRA_UUID=af00a0000-d366f2d3-34f29834-8a985d8a-' --proxy-core-count 1 --mpi-
cmd-env C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018.1.156\windows\mpi\intel64\bin\mp
iexec.exe -d -bootstrap ssh -hostos linux -host mpihost1 -n 1 hostname  --exec --exec-appnum 0 --exec-pro
c-count 1 --exec-local-env 0 --exec-wdir C:\Windows\system32 --exec-args 1 hostname

[mpiexec@iriphost1] Launch arguments: F:\zzq\mpi\MinGW\msys\1.0\bin/ssh.exe -x -q mpihost1 pmi_proxy --co
ntrol-port iriphost1:60519 --debug --pmi-connect alltoall --pmi-aggregate -s 0 --rmk user --launcher ssh
--demux select --pgid 0 --enable-stdin 1 --retries 10 --control-code 9182 --usize -2 --proxy-id 0
[mpiexec@iriphost1] STDIN will be redirected to 1 fd(s): 4
bash: pmi_proxy: command not found
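
A quick check I can run from the Windows side is whether pmi_proxy is visible on the remote node's non-interactive PATH (a sketch; the mpivars.sh path on the Linux box is a guess and needs adjusting to the actual install):

ssh root@mpihost1 'which pmi_proxy || echo "pmi_proxy not in non-interactive PATH"'
# If it is missing, the Intel MPI environment probably has to be sourced for non-interactive shells, e.g.:
ssh root@mpihost1 'echo "source /opt/intel/compilers_and_libraries_2018/linux/mpi/intel64/bin/mpivars.sh" >> ~/.bashrc'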


Unable to generate trace file (*.stf)


Hello all,

I am trying to generate a trace file for performance profiling of my code. I am testing on Stampede2 with the following modules loaded:

Currently Loaded Modules:
  1) git/2.9.0       3) xalt/1.7.7   5) intel/17.0.4   7) python/2.7.13   9) hdf5/1.8.16  11) petsc/3.7
  2) autotools/1.1   4) TACC         6) impi/17.0.3    8) gsl/2.3        10) papi/5.5.1   12) itac/17.0.3

The ITAC help shows how to generate a trace file for a given simulation. Below is the build configuration used by the code:

# Whenever this version string changes, the application is configured
# and rebuilt from scratch
VERSION = stampede2-2017-10-03

CPP = cpp
FPP = cpp
CC  = mpicc
CXX = mpicxx
F77 = ifort
F90 = ifort

CPPFLAGS = -g -trace -D_XOPEN_SOURCE -D_XOPEN_SOURCE_EXTENDED
FPPFLAGS = -g -trace -traditional
CFLAGS   = -g -trace -traceback -debug all -xCORE-AVX2 -axMIC-AVX512 -align -std=gnu99
CXXFLAGS = -g -trace -traceback -debug all -xCORE-AVX2 -axMIC-AVX512 -align -std=gnu++11
F77FLAGS = -g -trace -traceback -debug all -xCORE-AVX2 -axMIC-AVX512 -align -pad -safe-cray-ptr
F90FLAGS = -g -trace -traceback -debug all -xCORE-AVX2 -axMIC-AVX512 -align -pad -safe-cray-ptr

LDFLAGS = -rdynamic -xCORE-AVX2 -axMIC-AVX512

C_LINE_DIRECTIVES = yes
F_LINE_DIRECTIVES = yes

VECTORISE                = yes
VECTORISE_ALIGNED_ARRAYS = no
VECTORISE_INLINE         = no

DEBUG = no
CPP_DEBUG_FLAGS = -DCARPET_DEBUG
FPP_DEBUG_FLAGS = -DCARPET_DEBUG
C_DEBUG_FLAGS   = -O0
CXX_DEBUG_FLAGS = -O0
F77_DEBUG_FLAGS = -O0 -check bounds -check format
F90_DEBUG_FLAGS = -O0 -check bounds -check format

OPTIMISE = yes
CPP_OPTIMISE_FLAGS = # -DCARPET_OPTIMISE -DNDEBUG
FPP_OPTIMISE_FLAGS = # -DCARPET_OPTIMISE -DNDEBUG
C_OPTIMISE_FLAGS   = -Ofast
CXX_OPTIMISE_FLAGS = -Ofast
F77_OPTIMISE_FLAGS = -Ofast
F90_OPTIMISE_FLAGS = -Ofast

CPP_NO_OPTIMISE_FLAGS  =
FPP_NO_OPTIMISE_FLAGS  =
C_NO_OPTIMISE_FLAGS    = -O0
CXX_NO_OPTIMISE_FLAGS  = -O0
CUCC_NO_OPTIMISE_FLAGS =
F77_NO_OPTIMISE_FLAGS  = -O0
F90_NO_OPTIMISE_FLAGS  = -O0

PROFILE = no
CPP_PROFILE_FLAGS =
FPP_PROFILE_FLAGS =
C_PROFILE_FLAGS   = -pg
CXX_PROFILE_FLAGS = -pg
F77_PROFILE_FLAGS = -pg
F90_PROFILE_FLAGS = -pg

OPENMP           = yes
CPP_OPENMP_FLAGS = -fopenmp
FPP_OPENMP_FLAGS = -fopenmp
C_OPENMP_FLAGS   = -fopenmp
CXX_OPENMP_FLAGS = -fopenmp
F77_OPENMP_FLAGS = -fopenmp
F90_OPENMP_FLAGS = -fopenmp

WARN           = yes
CPP_WARN_FLAGS =
FPP_WARN_FLAGS =
C_WARN_FLAGS   =
CXX_WARN_FLAGS =
F77_WARN_FLAGS =
F90_WARN_FLAGS =

BLAS_DIR  = NO_BUILD
BLAS_LIBS = -mkl

HWLOC_DIR        = NO_BUILD
HWLOC_EXTRA_LIBS = numa

LAPACK_DIR  = NO_BUILD
LAPACK_LIBS = -mkl

OPENBLAS_DIR  = NO_BUILD
OPENBLAS_LIBS = -mkl

HDF5_DIR = /opt/apps/intel17/hdf5/1.8.16/x86_64

BOOST_DIR = /opt/apps/intel17/boost/1.64

GSL_DIR = /opt/apps/intel17/gsl/2.3

FFTW3_DIR = NO_BUILD
FFTW3_INC_DIRS = /opt/intel/compilers_and_libraries_2017.4.196/linux/mkl/include/fftw
FFTW3_LIBS = -mkl

PAPI_DIR = /opt/apps/papi/5.5.1

PETSC_DIR = /home1/apps/intel17/impi17_0/petsc/3.7/knightslanding
PETSC_LAPACK_EXTRA_LIBS = -mkl

PTHREADS_DIR = NO_BUILD

I am using mpicc/mpicxx with the -trace flag so that ITAC can be used. I then submit the job with the scripts below. First, the run script is generated:

#! /bin/bash

echo "Preparing:"
set -x                          # Output commands
set -e                          # Abort on errors

cd @RUNDIR@-active

module unload mvapich2
module load impi/17.0.3
module list

echo "Checking:"
pwd
hostname
date

echo "Environment:"
#export I_MPI_FABRICS=shm:ofa
#export I_MPI_MIC=1
#export I_MPI_OFA_ADAPTER_NAME=mlx4_0
export CACTUS_NUM_PROCS=@NUM_PROCS@
export CACTUS_NUM_THREADS=@NUM_THREADS@
export CACTUS_SET_THREAD_BINDINGS=1
export CXX_MAX_TASKS=500
export GMON_OUT_PREFIX=gmon.out
export OMP_MAX_TASKS=500
export OMP_NUM_THREADS=@NUM_THREADS@
export OMP_STACKSIZE=8192       # kByte
export PTHREAD_MAX_TASKS=500
env | sort > SIMFACTORY/ENVIRONMENT
echo ${SLURM_NODELIST} > NODES

echo "Starting:"
export CACTUS_STARTTIME=$(date +%s)
export VT_PCTRACE=1
time ibrun -trace @EXECUTABLE@ -L 3 @PARFILE@

echo "Stopping:"
date

echo "Done."

As you can see above, I use ibrun -trace. Below is the submit script:

#! /bin/bash
#SBATCH -A @ALLOCATION@
#SBATCH -p @QUEUE@
#SBATCH -t @WALLTIME@
#SBATCH -N @NODES@ -n @NUM_PROCS@
#SBATCH @("@CHAINED_JOB_ID@" != "" ? "-d afterany:@CHAINED_JOB_ID@" : "")@
#SBATCH -J @SHORT_SIMULATION_NAME@
#SBATCH --mail-type=ALL
#SBATCH --mail-user=@EMAIL@
#SBATCH -o @RUNDIR@/@SIMULATION_NAME@.out
#SBATCH -e @RUNDIR@/@SIMULATION_NAME@.err
cd @SOURCEDIR@
@SIMFACTORY@ run @SIMULATION_NAME@ --machine=@MACHINE@ --restart-id=@RESTART_ID@ @FROM_RESTART_COMMAND@

I believe that is all I need to generate a trace file. I submitted several simple jobs to check, but no trace file appeared after the simulations. The simulations completed without problems, so I am stuck.

Does anyone have an idea about this? The code I would like to profile is the Einstein Toolkit.
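
As a sanity check, I may first try tracing a trivial program outside the full build (a sketch; it assumes the impi and itac modules are loaded and hello.c is a placeholder MPI hello-world):

mpicc -trace hello.c -o hello     # link the ITAC trace collector into a minimal program
ibrun ./hello                     # run it inside a small batch job
ls *.stf                          # an .stf file should appear in the working directory if tracing works at all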

What's the expected slowdown for -gdb on MPI app?


I'm encountering a repeatable memory error that goes away as I increase the number of processes. I suspect there is some static allocation or other memory limit being hit, and having more processes spreads the needed memory so that each process eventually fits within that limit. So I wanted to use GDB to track down where the memory error is cropping up in order to fix the code. (Overall memory use is only in the single-digit percentages of what's available when the code crashes.)

Without the '-gdb' option, I can run an instance of the code in just over 1 second. If I add the debugger flag, then after I type "run" at the (mpigdb) prompt, I wait and wait and wait. Looking at 'top' in another window, I see the mpiexec.hydra process pop up with 0.3% CPU every once in a while. For example,

[clay@XXX src]$ time mpiexec -n 2 graph500_reference_bfs 15

real    0m1.313s
user    0m2.255s
sys     0m0.345s
[clay@XXX src]$ mpiexec -gdb -n 2 graph500_reference_bfs 15
mpigdb: np = 2
mpigdb: attaching to 1988 graph500_reference_bfs qc-2.oda-internal.com
mpigdb: attaching to 1989 graph500_reference_bfs qc-2.oda-internal.com
[0,1] (mpigdb) run
[0,1]   Continuing.
^Cmpigdb: ending..
[mpiexec@XXX] Sending Ctrl-C to processes as requested
[mpiexec@XXX] Press Ctrl-C again to force abort
[clay@XXX src]$

Do I need to just be more patient? If the real problem test case takes almost 500 seconds to reach the error point, how patient do I need to be? Or is there something else I need to do differently to get things to execute in a timely manner? (I've tried to attach to one of the running processes, but that didn't work at all.)

I was hoping not to resort to the most common debugger, the 'printf' statement, if I could help it. And using a debugger would elevate my skills in the eyes of management. :-)

Thanks.

--clay 

Benchmark With Broadwell


Hi Team,

Need help to achieve the optimal result.

E5-2697 v4 @ 2.30 GHz base, 2.00 GHz AVX frequency; 36 cores per node (2 x 18), 16 FLOP/cycle with AVX2 FMA:

2.3 GHz * 36 cores * 16 FLOP/cycle = 1324.8 GFLOPS theoretical peak per node (base/TDP frequency)
2.0 GHz * 36 cores * 16 FLOP/cycle = 1152 GFLOPS theoretical peak per node (AVX frequency)

Intel(R) MPI Library for Linux* OS, Version 2017 Update 3 Build 20170405 (id: 17193)
Linux master.local 3.10.0-693.5.2.el7.x86_64
CentOS Linux release 7.4.1708 (Core)

Two Node Result 

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4      231168   192     8     9            7936.44            1.03770e+03

mpirun  -print-rank-map -np  72  -genv I_MPI_DEBUG 5 -genv I_MPI_FALLBACK_DEVICE 0 -genv I_MPI_FABRICS shm:dapl --machinefile $PBS_NODEFILE  /opt/apps/intel/mkl/benchmarks/mp_linpack/xhpl_intel64_static

Single node Performance
================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WR11C2R4      163200   192     6     6            4123.17            7.02820e+02

 

Need your support.

Thank You

Trace Collector with ILP64 MKL and MPI libraries


Hi,

Is it possible to use the Intel Trace Collector (on linux) with the ILP64 MKL and MPI libraries?  I see on the MPI page

          https://software.intel.com/en-us/mpi-developer-reference-linux-ilp64-sup...

the statement

"If you want to use the Intel® Trace Collector with the Intel MPI Library ILP64 executable files, you must use a special Intel Trace Collector library. If necessary, the mpiifort compiler wrapper will select the correct Intel Trace Collector library automatically."

I don't really understand whether this means (1) there are special Trace Collector libraries available, or (2) you somehow have to generate your own special library. I can find no information in the Trace Collector documentation itself concerning ILP64 support.

Thanks,

John

Intel MPI segmentation fault bug


Hi,

I have come across a bug in Intel MPI when testing in a Docker container with no NUMA support. It appears that the case of no NUMA support is not being handled correctly. More details below.

Thanks,

Jamil

    icc --version
    icc (ICC) 17.0.6 20171215

gcc --version
gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)

     uname -a
     Linux centos7dev 4.9.60-linuxkit-aufs #1 SMP Mon Nov 6 16:00:12 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux 

     bug.c

#include "mpi.h"

int main (int argc, char *argv[])
{
    /* The segmentation fault occurs inside MPI_Init itself; see the backtrace below. */
    MPI_Init (&argc, &argv);
    MPI_Finalize ();
    return 0;
}

I_MPI_CC=gcc mpicc -g bug.c -o bug

gdb ./bug

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b64f45 in __I_MPI___intel_sse2_strtok () from /opt/intel/compilers_and_libraries_2017.6.256/linux/mpi/intel64/lib/libmpifort.so.12
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-16.el7_4.2.x86_64 numactl-devel-2.0.9-6.el7_2.x86_64
(gdb) bt
#0  0x00007ffff7b64f45 in __I_MPI___intel_sse2_strtok () from /opt/intel/compilers_and_libraries_2017.6.256/linux/mpi/intel64/lib/libmpifort.so.12
#1  0x00007ffff70acab1 in MPID_nem_impi_create_numa_nodes_map () at ../../src/mpid/ch3/src/mpid_init.c:1355
#2  0x00007ffff70ad994 in MPID_Init (argc=0x1, argv=0x7ffff72a2268, requested=-148233624, provided=0x1, has_args=0x0, has_env=0x0)
    at ../../src/mpid/ch3/src/mpid_init.c:1733
#3  0x00007ffff7043ebb in MPIR_Init_thread (argc=0x1, argv=0x7ffff72a2268, required=-148233624, provided=0x1) at ../../src/mpi/init/initthread.c:717
#4  0x00007ffff70315bb in PMPI_Init (argc=0x1, argv=0x7ffff72a2268) at ../../src/mpi/init/init.c:253
#5  0x00000000004007e8 in main (argc=1, argv=0x7fffffffcd58) at bug.c:6

MPI stat


Hi

I want to generate a timing log of MPI functions. I am using "export I_MPI_STATS=20" to enable logging, but this captures timing info from only one node. How can I get the same information from all of the nodes used in the run?

Thanks

Biren

 

How to use Intel MPI to create system resources such as Opengl windows, system shared memory on Windows 10?


Our school project needs MPI and OpenGL, but in our attempts we failed to create an OpenGL window and system shared memory from within an Intel MPI process. Could anyone help us?

Our os is Windows 10.


Issue with MPI_ALLREDUCE with MPI_REAL16


Hello!

I am running a quad-precision code in MPI. However, when I perform MPI_ALLREDUCE with MPI_REAL16 as the datatype, the code gives a segmentation fault. How do I perform quad-precision reduction operations in MPI? Any advice would be greatly appreciated.

Regards

Suman Vajjala

PETSc 3.8 build: internal error: 0_76


I'm attempting a PETSc 3.8 build with Intel Parallel Studio 2017.0.5. The build fails without much information, but it appears to be an internal compiler error.

Some key output:

...
/home/cchang/Packages/petsc-3.8/src/vec/is/sf/impls/basic/sfbasic.c(528): (col. 1) remark: FetchAndInsert__blocktype_int_4_1 has been targeted for automatic cpu dispatch
": internal error: 0_76
compilation aborted for /home/cchang/Packages/petsc-3.8/src/vec/is/sf/impls/basic/sfbasic.c (code 4)
gmake[2]: *** [impi-intel/obj/src/vec/is/sf/impls/basic/sfbasic.o] Error 4

Could you tell me what this error 0_76 is? I can provide log files or environment info if these will help.

Thanks,

Chris

execvp error


Gentlemen, could you please help with an issue?

I'm using the Intel compiler (ifort 18.0.2) and Intel MPI 2018.2.199 in an attempt to run the WRF model on an HPE (formerly SGI) ICE X machine.

wrfoperador@dpns31:~> ifort -v
ifort version 18.0.2

wrfoperador@dpns31:~> mpirun -V
Intel(R) MPI Library for Linux* OS, Version 2018 Update 2 Build 20180125 (id: 18157)
Copyright 2003-2018 Intel Corporation.
wrfoperador@dpns31:~>

 

When I run the executable I receive the following message:

/opt/intel/intel_2018/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin/mpirun r1i1n0 12 /home/wrfoperador/wrf/wrf_metarea5/WPS/geogrid.exe
[proxy:0:0@dpns31] HYDU_create_process (../../utils/launch/launch.c:825): execvp error on file r1i1n0 (No such file or directory)
(the same message is printed once for each of the 12 processes)
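
It looks to me as if mpirun is treating the host name as the executable. The form I plan to try next is roughly this (a sketch; the -hosts/-n spellings are from the Intel MPI reference, paths unchanged):

/opt/intel/intel_2018/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin/mpirun \
    -n 12 -hosts r1i1n0 /home/wrfoperador/wrf/wrf_metarea5/WPS/geogrid.exe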

Could you help me to solve this problem?

 

Thanks for your attention. I'm looking forward to your reply.

Help With Very Slow Intel MPI Performance


All,

I'm hoping the Intel MPI gurus can help with this. Recently I've tried transitioning some code I help maintain (GEOS, a climate model) from using HPE MPT (2.17, in this case) to Intel MPI (18.0.1; 18.0.2 I'll test soon).  In both cases, the compiler (Intel 18.0.1) is the same, both running on the same set of Haswell nodes on an SGI/HPE cluster. The only difference is the MPI stack.

Now one part of the code (AGCM, the physics/dynamics part) is actually a little bit faster with Intel MPI than MPT, even on an SGI machine. That is nice. It's maybe 5-10% faster in some cases. Huzzah!

But another code (GSI, analysis of observation data) really, really, really does not like Intel MPI. This code displays two issues. First, after the code starts (both launch very fast), it eventually hits the point at which, we believe, the first collective occurs, and the whole code stalls as it...initializes buffers? Something with InfiniBand, maybe? We don't know. MPT slows a bit there too, but doesn't show the issue nearly as badly as Intel MPI. We had another spot like this in the AGCM where moving from a collective to an Isend/Recv/Wait paradigm really helped. This "stall" is annoying and, worse, it gets longer and longer as the number of cores increases. (We might have a reproducer for this one.)

But, that is minor really. A minute or so, compared to the overall performance. On 240 cores, MPT 2.17 runs this code in 15:03 (minutes:seconds), Intel MPI 18.0.1, 28:12. On 672 cores, MPT 2.17 runs the code in 12:02 and Intel MPI 18.0.2 in 21:47; doesn't scale well overall for either.

Using I_MPI_STATS, the code is seen to spend ~60% of its MPI time in Alltoallv (20% of wall time) at 240 cores; at 672 cores, Barrier starts to dominate, but Alltoallv is still 40% of MPI time and 23% of wall time. I've tried both I_MPI_ADJUST_ALLTOALLV options (1 and 2) and they do little at all (28:44 and 28:25 at 240 cores).
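
For reference, the tuning tried so far amounts to nothing more than this (a sketch of the environment settings; the stats level and executable name are placeholders):

export I_MPI_STATS=10                 # collect the per-collective statistics quoted above
export I_MPI_ADJUST_ALLTOALLV=1       # also tried 2; neither changed the runtime noticeably
mpirun -np 240 ./gsi.x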

I'm going to try and see if I can request/reserve a set of nodes for a long time to do an mpitune run, but since each run is ~30 minutes...mpitune will not be fun as it'd be 90 minutes for each option test.

Any ideas on what might be happening? Any advice for flags/environment variables to try? I understand that HPE MPT might/should work best on an SGI/HPE machine (like how Intel compilers seem to do best with Intel chips), but this seems a bit beyond the usual difference. I've requested MVAPICH2 be installed as well for another comparison.

Matt

Problem with NFS Over RDMA on OmniPath


I have been trying to set up NFS over RDMA on Omni-Path, following the instructions in the official documentation. IPoIB works fine, but I cannot get NFS over RDMA working. I have modified /etc/rdma/rdma.conf and added:

NFSoRDMA_LOAD=yes
NFSoRDMA_PORT=2050

I have also loaded the appropriate modules (sunrpc on the client, xprtrdma on the server). However, the NFS mount fails with "connection refused" when mounting over RDMA; note that it works fine if I do not specify rdma.

It appears that the port 2050 listener for NFSoRDMA never gets created: when I run rpcinfo from the client to examine the server, I see port 2049 for nfs but nothing on 2050.
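
For reference, the client-side commands I am using look roughly like this (a sketch; the server name and export path are placeholders):

# Mount over RDMA on the port configured above (this is what fails with "connection refused"):
mount -t nfs -o rdma,port=2050 nfsserver:/export /mnt/rdma
# Check what the server has actually registered:
rpcinfo -p nfsserver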

This is on CentOS 7.4. Any ideas/suggestions what may be wrong?
