Hi,
I am unable to launch a simple MPI application with more than 80 processes on a single host using Intel MPI 2018 Update 1 and PBS Pro as the job scheduler.
The job is submitted with a script containing:
#PBS -l select=81:ncpus=1

mpiexec.hydra -n 81 -ppn 1 ./a.out
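For reference, the complete submission script is roughly the sketch below; the mpivars.sh path and the cd into $PBS_O_WORKDIR are assumptions about our environment, not part of the lines quoted above:

#!/bin/bash
#PBS -l select=81:ncpus=1
# assumed install path for Intel MPI 2018 Update 1
source /opt/intel/impi/2018.1/bin64/mpivars.sh
cd $PBS_O_WORKDIR
mpiexec.hydra -n 81 -ppn 1 ./a.out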
In the call to MPI_Init, the following error is raised on rank 80:
[cli_80]: write_line error; fd=255 buf=:cmd=init pmi_version=1 pmi_subversion=1 :
system msg for write_line failure : Bad file descriptor
[cli_80]: Unable to write to PMI_fd
[cli_80]: write_line error; fd=255 buf=:cmd=barrier_in :
system msg for write_line failure : Bad file descriptor
[cli_80]: write_line error; fd=255 buf=:cmd=get_ranks2hosts :
system msg for write_line failure : Bad file descriptor
[cli_80]: expecting cmd="put_ranks2hosts", got cmd=""
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(805): fail failed
MPID_Init(1743)......: channel initialization failed
MPID_Init(2144)......: PMI_Init returned -1
[cli_80]: write_line error; fd=255 buf=:cmd=abort exitcode=68204815 :
system msg for write_line failure : Bad file descriptor
I looked into the issue more closely by running the application under strace. The output for rank 80 shows that the process tries to write to and read from file descriptor 255, which bash reserves for internal use:
<snip>
uname({sys="Linux", node="uvtk", ...}) = 0
sched_getaffinity(0, 128, { 0, 0, 0, 0, 80000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 128
write(255, "cmd=init pmi_version=1 pmi_subve"..., 40) = -1 EBADF (Bad file descriptor)
write(2, "[cli_80]: ", 10) = 10
write(2, "write_line error; fd=255 buf=:cm"..., 72) = 72
write(2, "system msg for write_line failur"..., 56) = 56
write(2, "[cli_80]: ", 10) = 10
write(2, "Unable to write to PMI_fd\n", 26) = 26
uname({sys="Linux", node="uvtk", ...}) = 0
write(255, "cmd=barrier_in\n", 15) = -1 EBADF (Bad file descriptor)
write(2, "[cli_80]: ", 10) = 10
write(2, "write_line error; fd=255 buf=:cm"..., 47) = 47
write(2, "system msg for write_line failur"..., 56) = 56
write(255, "cmd=get_ranks2hosts\n", 20) = -1 EBADF (Bad file descriptor)
write(2, "[cli_80]: ", 10) = 10
write(2, "write_line error; fd=255 buf=:cm"..., 52) = 52
write(2, "system msg for write_line failur"..., 56) = 56
read(255, 0x7fffe00d6320, 1023) = -1 EBADF (Bad file descriptor)
write(2, "[cli_80]: ", 10) = 10
write(2, "expecting cmd=\"put_ranks2hosts\","..., 44) = 44
write(2, "Fatal error in MPI_Init: Other M"..., 187) = 187
write(255, "cmd=abort exitcode=68204815\n", 28) = -1 EBADF (Bad file descriptor)
write(2, "[cli_80]: ", 10) = 10
write(2, "write_line error; fd=255 buf=:cm"..., 60) = 60
write(2, "system msg for write_line failur"..., 56) = 56
exit_group(68204815) = ?
All other ranks communicate with pmi_proxy via a valid file descriptor. For example:
<snip>
uname({sys="Linux", node="uvtk", ...}) = 0
sched_getaffinity(0, 128, { 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }) = 128
write(16, "cmd=init pmi_version=1 pmi_subve"..., 40) = 40
read(16, "cmd=response_to_init pmi_version"..., 1023) = 57
write(16, "cmd=get_maxes\n", 14) = 14
read(16, "cmd=maxes kvsname_max=256 keylen"..., 1023) = 56
uname({sys="Linux", node="uvtk", ...}) = 0
write(16, "cmd=barrier_in\n", 15) = 15
read(16, <unfinished ...>
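To confirm the collision on our side, I used a small diagnostic sketch like the one below. It assumes the Hydra proxy passes the PMI socket number to each rank via the PMI_FD environment variable (as in MPICH-derived simple PMI); it only checks whether that descriptor is actually open before MPI_Init is called, and is not part of the real application:

/* Diagnostic sketch: verify the descriptor advertised in PMI_FD is open.
 * Assumes the MPICH/Hydra simple-PMI wire-up used by Intel MPI. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    const char *pmi_fd = getenv("PMI_FD");
    if (pmi_fd != NULL) {
        int fd = atoi(pmi_fd);
        /* fcntl(F_GETFD) fails with EBADF if the descriptor is not open */
        if (fcntl(fd, F_GETFD) == -1)
            fprintf(stderr, "PMI_FD=%d is not a valid open descriptor\n", fd);
        else
            fprintf(stderr, "PMI_FD=%d looks valid\n", fd);
    } else {
        fprintf(stderr, "PMI_FD is not set in the environment\n");
    }

    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}

On rank 80 this reports the descriptor as invalid, consistent with the EBADF errors in the strace output above.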
Is it possible to specify the range of file descriptors that the MPI processes may use, or is there another way to work around this behavior?
Regards,
Pieter