Hello, Admin!
I'm now using the Intel Cluster Studio toolkit, and I'm trying to run a hybrid (MPI + OpenMP) program on 25 compute nodes. I compile the program with -mt_mpi -openmp and set the environment variables I_MPI_DOMAIN=omp and OMP_NUM_THREADS=2, so that each MPI process runs 2 OpenMP threads. The program runs without errors on up to 14 compute nodes, but with more than 14 nodes I get the following error output:
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)......................:
MPID_Init(195).............................: channel initialization failed
MPIDI_CH3_Init(106)........................:
MPID_nem_tcp_post_init(344)................:
MPID_nem_newtcp_module_connpoll(3099)......:
recv_id_or_tmpvc_info_success_handler(1328): read from socket failed - No error
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3099):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3113):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3113):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3113):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(659)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(106)..................:
MPID_nem_tcp_post_init(337)..........:
MPID_nem_newtcp_module_connpoll(3113):
gen_read_fail_handler(1194)..........: read from socket failed - The specified network name is no longer available.
job aborted:
rank: node: exit code[: error message]
0: HPC-01: 1: process 0 exited without calling finalize
1: HPC-02: 123
2: HPC-03: 1: process 2 exited without calling finalize
3: HPC-04: 1: process 3 exited without calling finalize
4: HPC-05: 1: process 4 exited without calling finalize
5: HPC-06: 1: process 5 exited without calling finalize
6: HPC-07: 1: process 6 exited without calling finalize
7: HPC-08: 1: process 7 exited without calling finalize
8: HPC-09: 1: process 8 exited without calling finalize
9: HPC-10: 1: process 9 exited without calling finalize
10: HPC-11: 1: process 10 exited without calling finalize
11: HPC-12: 1: process 11 exited without calling finalize
12: HPC-13: 1: process 12 exited without calling finalize
13: HPC-14: 1: process 13 exited without calling finalize
14: HPC-16: 1: process 14 exited without calling finalize
15: HPC-17: 1: process 15 exited without calling finalize
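The abort summary above follows the "rank: node: exit code[: error message]" layout, so it can be tabulated with a short script to spot any rank whose exit code differs from the rest. This is only a minimal sketch in Python, assuming the summary has been pasted into a string. The names parse_abort_summary and odd_ones_out are hypothetical helpers, not part of Intel MPI or any other tool.

```python
import re
from collections import Counter

# One summary row: "rank: node: exit code[: error message]"
ROW = re.compile(r"^(\d+): ([^:\s]+): (\d+)(?:: (.*))?$")


def parse_abort_summary(text):
    """Return one dict per rank row; header and unrelated lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rank, node, code, message = match.groups()
            rows.append({
                "rank": int(rank),
                "node": node,
                "exit_code": int(code),
                "message": message,  # None when the row has no message
            })
    return rows


def odd_ones_out(rows):
    """Return the rows whose exit code differs from the most common one."""
    if not rows:
        return []
    common_code = Counter(r["exit_code"] for r in rows).most_common(1)[0][0]
    return [r for r in rows if r["exit_code"] != common_code]


# A few rows copied from the summary above.
sample = """job aborted:
rank: node: exit code[: error message]
0: HPC-01: 1: process 0 exited without calling finalize
1: HPC-02: 123
2: HPC-03: 1: process 2 exited without calling finalize
"""

for row in odd_ones_out(parse_abort_summary(sample)):
    print(row["rank"], row["node"], row["exit_code"])
```

Run against the full summary, a script like this would single out rank 1 on HPC-02, the only row with exit code 123 and no "exited without calling finalize" message, which may be a reasonable node to check first.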