Channel: Clusters and HPC Technology

Severe Memory Leak with 2019 Intel MPI


Both 2019 Intel MPI releases have a severe memory leak that goes away when I revert to the 2015 version (i.e. source /opt/intel/comp2015/impi/5.0.2.044/intel64/bin/mpivars.sh). I am attaching two valgrind outputs, lapw1.vg.285276 from 2019 Intel MPI and lapw1.vg.5451 from 2015 Intel MPI, which show it quite clearly.

For reference, the entries in the valgrind logs with "init_parallel_ (in /opt/Wien2k_18.1F/lapw1Q_mpi)" are the MPI initialization, so these are almost certainly not real leaks, as the initialization is done only once. The entries associated with the ScaLAPACK pdsygst call are probably the culprit.

If needed I can provide a package to reproduce this. It is part of a large code, so decomposing into a small test code is not feasible.
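
For anyone wanting to collect similar data, per-rank valgrind logs can be gathered with a command along these lines (a hedged example of the general pattern, not necessarily my exact invocation; %p expands to the process ID):

mpirun -n <nproc> valgrind --leak-check=full --log-file=lapw1.vg.%p /opt/Wien2k_18.1F/lapw1Q_mpi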


The parameter localroot is not recognized at start of run


Hi,

I use MPI to parallelize parts of my QuickWin project, built with the Fortran 2019 Cluster Edition. Earlier I was helped by Intel to manage my QuickWin graphics output by using the parameter localroot. This works fine with the 2017 Update 4 version. When I go to the 2019 version, localroot is not recognized. No change has been introduced in the command file starting the execution.

The error report, which shows the command file, is attached.

Best regards

Anders S

-perhost parameter forgotten after first iteration over all hosts


Dear developers,

the round-robin placement forgets the -perhost parameter once it has iterated over all hosts in the hostfile.
This was tested with Intel MPI 2019.1.

My hostfile looks like:

node551
node552

And when I start a small job, I get:

I_MPI_DEBUG=4 I_MPI_PIN_DOMAIN=core mpirun -f hostfile -n 8 -perhost 2  ./a.out
[0] MPI startup(): libfabric version: 1.7.0a1-impi
[0] MPI startup(): libfabric provider: verbs;ofi_rxm
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       377136   node551   {0,40}
[0] MPI startup(): 1       377137   node551   {1,41}
[0] MPI startup(): 2       151304   node552   {0,40}
[0] MPI startup(): 3       151305   node552   {1,41}
[0] MPI startup(): 4       377138   node551   {2,42}
[0] MPI startup(): 5       151306   node552   {2,42}
[0] MPI startup(): 6       377139   node551   {3,43}
[0] MPI startup(): 7       151307   node552   {3,43}

Ranks 0-3 are distributed as expected, but ranks 4-7 are spread across the hosts as if the -perhost parameter had been reset to 1.
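
For reference, a.out here is just a trivial placement checker; a program of roughly the following shape is enough to reproduce the observation (a minimal sketch, not my exact source, since the pinning report above comes from I_MPI_DEBUG anyway):

#include <stdio.h>
#include "mpi.h"

/* Minimal placement checker: each rank reports the node it landed on. */
int main(int argc, char *argv[]) {
   int rank, size, namelen;
   char name[MPI_MAX_PROCESSOR_NAME];

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);
   MPI_Get_processor_name(name, &namelen);
   printf("rank %d of %d on %s\n", rank, size, name);
   MPI_Finalize();
   return 0;
}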

Intel MPI not running on Windows 10 Dell 7920 workstation


I have to run a parallel job; after installing Intel MPI, the software package is usually used to run parallel processing jobs. I don't know much about configuring Intel MPI and need help troubleshooting and setting it up.

The error appears when the program starts to launch MPI communication.

pgCC binder for MPI


Hi

I'm trying to compile the binding libraries for the PGI C++ compiler. In the readme the following is stated:

II.2.2. C++ Binding

To create the Intel(R) MPI Library C++ binding library using the
PGI* C++ compiler, do the following steps:

1. Make sure that the PGI* C++ compiler (pgCC) is in your PATH.

2. Go to the directory cxx

3. Run the command

   # make MPI_INST=<MPI_path> CXX=<C++_compiler> NAME=<name> \
     [ARCH=<arch>] [MIC=<mic option>]

   with

   <MPI_path>        - installation directory of the Intel(R) MPI Library
   <C++_compiler>    - compiler to be used
   <name>            - base name for the libraries and compiler script
   <arch>            - set `intel64` or `mic` architecture, `intel64` is used by
                       default
   <mic option>      - compiler option to generate code for Intel(R) MIC
                        Architecture. Available only when ARCH=mic is set, `-mmic`
                       is used by default in such case

4. Copy the resulting <arch> directory to the Intel(R) MPI Library installation
   directory.

I am trying to compile with the following command:

make MPI_INST=/prog/Intel/studioxe2016/compilers_and_libraries_2016.3.210/linux/mpi CXX=pgCC NAME=pgCC

which gives this output:

pgCC  -c -fpic -I/prog/Intel/studioxe2016/compilers_and_libraries_2016.3.210/linux/mpi/intel64/include -Iinclude -Iinclude/intel64 -o initcxx.o initcxx.cxx
"include/intel64/mpichconf.h", line 1362: catastrophic error: cannot open
          source file "nopackage.h"
  #include "nopackage.h"
                        ^

1 catastrophic error detected in the compilation of "initcxx.cxx".
Compilation terminated.
make: *** [initcxx.o] Error 2

Does anybody have an idea where I can get this nopackage.h, or why this error occurs?

I have successfully compiled the bindings for both pgcc and pgf90 without any issues.

How can I CreateProcessAsUser via hydra_service?


Hi,

I create a windows_shared_memory in a user application and open it in another process that is launched by mpiexec.exe.

The problem is that the MPI process cannot open the windows_shared_memory (ERROR_FILE_NOT_FOUND).

I think the reason is that the user application runs under session 1, while the MPI process runs under session 0 (because hydra_service runs under session 0).
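
If the session mismatch really is the cause, my understanding is that named kernel objects can be made visible across sessions by creating them under the Global namespace (this normally requires the SeCreateGlobalPrivilege). The following is only a hedged Win32 sketch of that idea with a made-up object name, not my actual code:

#include <windows.h>
#include <stdio.h>

int main(void) {
   /* Creating the mapping under the "Global\" prefix makes the name
      visible from other sessions (e.g. session 0, where hydra_service
      lives), provided the creating account has SeCreateGlobalPrivilege.
      "Global\\MySharedMem" is a made-up name used for illustration. */
   HANDLE h = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                 0, 4096, "Global\\MySharedMem");
   if (h == NULL) {
      printf("CreateFileMapping failed: %lu\n", GetLastError());
      return 1;
   }
   void *p = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, 4096);
   if (p != NULL) {
      printf("mapped at %p\n", p);
      UnmapViewOfFile(p);
   }
   /* The MPI-launched process would then call
      OpenFileMappingA(FILE_MAP_ALL_ACCESS, FALSE, "Global\\MySharedMem")
      and MapViewOfFile on the returned handle. */
   CloseHandle(h);
   return 0;
}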

What can I do to make this work?

Best wishes.

MPI and Quantum Espresso


Dear experts,

I am having difficulty using MPI from Parallel Studio Cluster Edition 2016 in conjunction with Quantum ESPRESSO PWSCF v6.3.

I think the problems may be interrelated and have to do with MPI communicators. I compiled pw.x with the Intel compilers, Intel MPI, Intel ScaLAPACK and MKL, but without OpenMP.

I have been running pw.x with multiple processes quite successfully. However, when the number of processes is high enough that the space group has more than 7 processes, so that the subspace diagonalization no longer uses a serial algorithm, the program crashes abruptly at about the 10th iteration with the following errors:

Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffe0b27a6e8, comm_new=0x7ffe0b27a640) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3, remain_dims=0x7ffefaee7ce8, comm_new=0x7ffefaee7c40) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:

On the PW forum, I got this response:

'A careful look at the error message reveals that you are running out of space for MPI communicators, for which a fixed maximum number (16k) seems to be allowed. This hints at a problem somewhere: communicators are generated with MPI_Comm_split() and not properly cleared afterwards.'

But I don't know how to fix this.
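
If I understand the reply correctly, the pattern that avoids the problem is to free every derived communicator once it is no longer needed, roughly like this (a generic sketch, not the actual PWSCF source):

#include "mpi.h"

/* Derived communicators (from MPI_Cart_sub, MPI_Comm_split, ...) count
   against a fixed per-process limit and must be freed after use. */
void diagonalization_step(MPI_Comm cart_comm) {
   MPI_Comm row_comm;
   int remain_dims[2] = {0, 1};   /* keep only the second direction */

   MPI_Cart_sub(cart_comm, remain_dims, &row_comm);
   /* ... use row_comm for the subspace diagonalization ... */
   MPI_Comm_free(&row_comm);      /* without this, every iteration leaks
                                     one communicator until the ~16K
                                     per-process limit is reached */
}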

Please kindly advise,

Many thanks

Alex Durie

PhD student

IRECV/SSEND crashes for Intel MPI Library 2019


Hi,

I noticed that one of our MPI codes began crashing after installing Intel Parallel Studio XE 2019 (Intel MPI Library 2019 Update 1) on Windows. I tracked the issue down to a combination of SSEND/IRECV when the transferred data reaches a certain size. Test code exhibiting the crash is attached. The code does not crash when using Intel Parallel Studio XE 2018 (Intel MPI Library 2018 Update 3).

In particular, the 2019 library exhibits a crash when the double precision (square) matrix being transferred has a dimension of around 360-365, i.e. in the vicinity of 135K total elements. The crash occurs for both the 4-byte and 8-byte MPI interfaces. My compile and launch commands are

mpiifort -fpp -DMPI_MPI_INTEGER_TYPE=4 -DMPI_SYS_INTEGER_TYPE=4 test.F90
mpiexec -n 2 ./test.exe

for the 4-byte interface and

mpiifort -ilp64 -i8 -fpp -DMPI_MPI_INTEGER_TYPE=8 -DMPI_SYS_INTEGER_TYPE=8 test.F90
mpiexec -n 2 ./test.exe

for the 8-byte interface. 
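
For reference, the communication pattern in the attached test is essentially the following (a rough C rendering of the same SSEND/IRECV handshake; the attached test.F90 is the authoritative reproducer, and N = 365 is just one value in the failing range):

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

#define N 365   /* one dimension in the failing range (~135K elements) */

int main(int argc, char *argv[]) {
   int rank;
   double *a = calloc((size_t)N * N, sizeof(double));
   MPI_Request req;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   if (rank == 0) {
      /* the sender uses a synchronous send ... */
      MPI_Ssend(a, N * N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
   } else if (rank == 1) {
      /* ... and the receiver posts a non-blocking receive */
      MPI_Irecv(a, N * N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
      MPI_Wait(&req, MPI_STATUS_IGNORE);
      printf("received %d x %d matrix\n", N, N);
   }
   MPI_Finalize();
   free(a);
   return 0;
}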

 

Any help or suggested workaround is much appreciated.

 

Thanks,

John

 

Attachment: test.F90 (3.29 KB)

mpiexec -hosts differences in MPICH and Intel MPI


What are the differences in how MPICH and Intel MPI handle mpiexec -hosts? It seems that Intel MPI doesn't recognize the <host>:<number of processes> syntax.

dapl async_event QP


Hello

I am facing the following errors with intel/2018.2 and intelmpi/2018.2, using mpiexec to submit my cluster simulations.

dapl async_event CQ (0x1750ff0) ERR 0
dapl_evd_cq_async_error_callback (0x169ada0, 0x16cf460, 0x2ab4fecb9d30, 0x1750ff0)
dapl async_event QP (0x1fdacc0) Event 1

After this point my runs terminate. Any assistance with resolving this error would be much appreciated.

Alexandra

Execution error using the Educator Intel Parallel Studio XE Cluster Development tools for Linux UBUNTU 18.04


I just installed the Educator Intel Parallel Studio XE Cluster Development tools for Linux on Ubuntu 18.04. It compiles and runs C, C++ and Fortran files fine, but when I use the Intel MPI library I get the error free(): invalid next size (fast). Most documented occurrences of this error are due to memory allocation bugs, but that can't be the case here since I don't explicitly allocate memory. The source code I tested with is the MPI hello world program:

#include <stdio.h>
#include <stdlib.h>

#include "mpi.h"

int main( int argc, char *argv[]) {
   int rank;

   MPI_Init( &argc, &argv);
   MPI_Comm_rank( MPI_COMM_WORLD, &rank);

   printf("rank:%d Hello World.\n", rank);
   MPI_Finalize();
   return 0;
}

It returns exit code 134 with the error "free(): invalid next size (fast)\nAborted (core dumped)". Using strace as follows:

strace mpirun -n 1 ./hello_mpi

I get the following error while it tries to start mpiexec.hydra:

stat("/opt/intel//compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpiexec.hydra", {st_mode=S_IFREG|0755, st_size=1887795, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6851924a10) = 7499
wait4(-1, free(): invalid next size (fast)
[{WIFSIGNALED(s) && WTERMSIG(s) == SIGABRT && WCOREDUMP(s)}], 0, NULL) = 7499
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=7499, si_uid=1000, si_status=SIGABRT, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 7499
write(2, "Aborted (core dumped)\n", 22Aborted (core dumped)
) = 22
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f6851924a10) = 7504

Running ldd on the executable I get

ldd hello_mpi
    linux-vdso.so.1 (0x00007ffdffdfe000)
    libmpi.so.12 => /opt/intel//compilers_and_libraries_2019.1.144/linux/mpi/intel64/lib/release/libmpi.so.12 (0x00007fbcc44d9000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fbcc40e8000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fbcc3ee0000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fbcc3cdc000)
    libfabric.so.1 => /opt/intel//compilers_and_libraries_2019.1.144/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00007fbcc3aa3000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fbcc388b000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbcc7968000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fbcc366c000)

My PATH looks good

which mpicc
/opt/intel//compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpicc

which mpirun
/opt/intel//compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpirun

I tried installing the suite both with and without IA-32 support but I get the same error regardless.

I have run this with a source-compiled MPICH 3.3 (using the GNU 8.2.0 compilers) and it works great, so I think there is some issue in your Hydra implementation.

Thanks for any support.

 

--Mike

How to add a new node to an installed Intel Parallel Studio cluster


Hello, I am running Intel parallel_studio_xe_2019_update1_cluster_edition on Linux with my student license, and I have finished a cluster installation with a specific nodes file. Now my cluster is running, and I need to add a node to it without affecting the running nodes (I mean no job paused). Can you help me with that?

Can I install and make parallel studio xe cluster edition available on a cluster?


Hi,

I am using a free student version on an HPC Linux cluster at an academic institute. Now other users from this institute also want to use the Parallel Studio XE package for their academic research.

I see the license price at https://softwarestore.intel.com/SuiteSelection/ParallelStudio, but we use Parallel Studio (compilers and libraries) only to study the software and do research work; there is no need for any support. In this case, can I install the package and make it available to other users on the cluster?

Thanks in advance.

 

Best regards,

Dr. Hong Li

Integration problem between Torque 4 and Intel(R) MPI Library for Linux* OS, Version 2019 Update 1


Hi!

I have successfully compiled and linked a program with Intel MPI. If I run it interactively or in the background, it runs very fast and without any problems on our new server (ProLiant DL580 Gen10, 1 node with 4 processors of 18 cores each, 72 cores total, hyperthreading disabled). If I try to submit it via Torque (version 4), strange things happen, for example:

1) if I submit 2 jobs asking for 8 cores each, they are both fine

2) if I submit a third job (8 cores), it is 4 times slower because its 8 processes run on only two cores!

3) if I submit a fourth job, it runs properly, but if I qdel all four jobs, all of them disappear from qstat -a yet the fourth keeps running!

From previous discussions I noticed in this forum, I have the feeling it is an integration problem between Intel MPI and Torque, so I did the following:

 export I_MPI_PIN=off
 export I_MPI_PIN_DOMAIN=socket

To run the program, I used the following mpirun call:

/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpirun -d -rmk pbs -bootstrap pbsdsh .................

I have checked and PBS_ENVIRONMENT is properly set to PBS_BATCH

Also, the Torque configuration is apparently correct; the file

/var/lib/torque/server_priv/nodes contains the following line:

dscfbeta1.units.it np=72 num_node_boards=1

This is a severe problem for me: the machine is shared, so we do need a scheduler like Torque (PBS) to run jobs compiled and linked against Intel MPI. Any help or suggestion is welcome!

thank you in advance

Mauro

Conda impi_rt=2019.1 doesn't substitute I_MPI_ROOT in bin/mpivars.sh


I am not sure where to report this bug, but it forces me to stick with intelpython 2018.0.3. The steps to reproduce are:

conda config --add channels intel
conda create -n test impi_rt=2019.1

You will find that I_MPI_ROOT is not substituted correctly in /path/to/envs/test/bin/mpivars.sh.

Or is conda no longer the supported way to install the Intel Performance Libraries? If so, what is the most future-proof way? Or, if it is still the best way, where should I report this bug? Thanks.


Integer overflow for MPI_COMM_WORLD ref-counting in MPI_Iprobe


Calling MPI_Iprobe 2^31 times results in the following error:

Abort(201962501) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Iprobe: Invalid communicator, error stack:
PMPI_Iprobe(123): MPI_Iprobe(src=MPI_ANY_SOURCE, tag=MPI_ANY_TAG, MPI_COMM_WORLD, flag=0x7ffd925056c0, status=0x7ffd92505694) failed
PMPI_Iprobe(90).: Invalid communicator

On our system, it takes about 10 minutes to perform this number of calls in a loop.
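
The reproducer is simply a tight loop of this shape (a minimal sketch):

#include <stdio.h>
#include "mpi.h"

/* Repeated MPI_Iprobe calls should be neutral with respect to the
   reference count of MPI_COMM_WORLD; after 2^31 calls the communicator
   is instead reported as invalid. */
int main(int argc, char *argv[]) {
   int flag;
   long long i;
   MPI_Status status;

   MPI_Init(&argc, &argv);
   for (i = 0; i < (1LL << 31); i++) {
      MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
   }
   MPI_Finalize();
   printf("done\n");
   return 0;
}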

The affected version is Intel MPI 2019.1.144 (based on MPICH 3.3).
 

The expected behavior is that MPI_Iprobe is neutral with respect to the reference count of the provided communicator. Especially for MPI_COMM_WORLD, the reference count is superfluous.

Bad Termination Error Exit Code 4


Hi,

I have a binary that was compiled on Haswell using Intel 16.0 and Intel MPI 5.1.1. It runs fine on Haswell, but when I try to run it on Skylake nodes, it crashes right away with this error:

==================================================================================

=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES

=   PID 99283 RUNNING AT iforge127

=   EXIT CODE: 4

=   CLEANING UP REMAINING PROCESSES

=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES

I understand the issue may be with the application, but I would like to know how to debug and resolve it. Thank you for the help.

Regards,

Intel MPI with Distributed Ansys Mechanical


Can anyone share a success story of running distributed Ansys (Mechanical) with Intel MPI on Windows 10 between two PCs?

Long story short: I can launch a distributed analysis on a single PC with Intel MPI, but I can't launch a distributed analysis between two PCs, whereas IBM MPI can.

Here is what I have done so far (and I hope I can get some guidance from you):

Hardware: two Dell workstations, same CPU, RAM, and everything else.

OS: Windows 10

Intel MPI Library: 2017 Update 3

After installing the Intel MPI library, setting up the environment variables, and caching the password on each machine, I ran the test "mpiexec -n 4 -ppn 2 -machine machines.txt test" and got the following output, which indicates that Intel MPI communicates successfully between the two PCs:

Hello world: rank 0 of 4 running on node1
Hello world: rank 1 of 4 running on node2
Hello world: rank 2 of 4 running on node1
Hello world: rank 3 of 4 running on node2

I did the same test on each PC with the command "ansys192 -np 2 -mpitest", and both PCs show "MPI Test has completed successfully!"

However, when I run the distributed test "ansys192 -machine machines.txt -mpitest", it looks like Ansys still treats it as a single-PC test, as the output below shows:

Mechanical APDL execution Command: mpiexec -np 2 -genvlist ANS_USER_PATH,ANSWAIT,ANSYS_SYSDIR,ANSYS_SYSDIR32,ANSYS192_DIR,ANSYSLI_RESERVE_ID,ANSYSLI_USAGE,AWP_LOCALE192,AWP_ROOT192,CADOE_DOCDIR192,CADOE_LIBDIR192,LSTC_LICENSE,P_SCHEMA,PATH,I_MPI_COLL_INTRANODE,I_MPI_AUTH_METHOD  -localroot "C:\Program Files\ANSYS Inc\v192\ANSYS\bin\winx64\MPITESTINTELMPI.EXE"  -machine machines.txt -mpitest

I appreciate all your feedback, Thank you! 

How should I edit the machines.LINUX file for my cluster?


Hello everybody:

I am a new cluster user. Recently I updated to Intel Composer XE 2013 to compile Fortran, and the Readme.txt says I need a machines.LINUX file to make sure I can use every node to run Fortran programs.

How should I edit the machines.LINUX file correctly? I have found some examples, e.g.:

BASH: cluster_prereq_is_remote_dir_mounted(): compute-11-37 <- /opt/intel -> compute-12-26
BASH: cluster_prereq_is_remote_dir_mounted(): compute-11-37 <- /opt/intel -> compute-12-27
BASH: cluster_prereq_is_remote_dir_mounted(): compute-11-37 <- /opt/intel -> compute-12-28
...

or

clusternode01

clusternode02

clusternode03

...

 

Which format is correct? I am very confused about this; please help me. Thanks so much!

MPI Crashing


Hello,

I recently upgraded my OS to Ubuntu 18.04 and have had problems since.

I have now reformatted my desktop, installed a fresh copy of Ubuntu 18.04, and installed the Intel C++ compiler and MPI Library 2019 version 2.

When I run my codes, after a couple of hours and thousands of time steps I get the following error message:

 

Abort(873060101) on node 15 (rank 15 in comm 0): Fatal error in PMPI_Recv: Invalid communicator, error stack:
PMPI_Recv(171): MPI_Recv(buf=0x4b46a00, count=36912, MPI_DOUBLE, src=14, tag=25, MPI_COMM_WORLD, status=0x1) failed
PMPI_Recv(103): Invalid communicator
[cli_15]: readline failed

 

My code used to run fine on Ubuntu 16.04 (with an older version of Intel's compiler and MPI), and it also runs well on various big clusters.

My code uses Isend for sending information and Recv for receiving. Throughout my code I only use the MPI_COMM_WORLD communicator and never create a new one.
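
The exchange pattern is essentially the following (a stripped-down sketch, not my actual code):

#include "mpi.h"

/* Sketch of the exchange: non-blocking send, blocking receive,
   everything on MPI_COMM_WORLD, no derived communicators. */
void exchange(double *sendbuf, double *recvbuf, int count,
              int dest, int src, int tag) {
   MPI_Request req;

   MPI_Isend(sendbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD, &req);
   MPI_Recv(recvbuf, count, MPI_DOUBLE, src, tag, MPI_COMM_WORLD,
            MPI_STATUS_IGNORE);
   MPI_Wait(&req, MPI_STATUS_IGNORE);
}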

Can you please help me find out what's wrong?

 

Thank you,

 

Elad

 
