Hi,
I have the following problem:
I have two nodes and the following config file:
-n 1 -host node0 myapp
-n 1 -host node1 myapp
This works fine. However, if I change the order of the lines in the config file to:
-n 1 -host node1 myapp
-n 1 -host node0 myapp
It fails with the error:
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(658)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(104)..................:
MPID_nem_tcp_post_init(344)..........:
MPID_nem_newtcp_module_connpoll(3102):
gen_cnting_fail_handler(1816)........: connect failed - The semaphore timeout period has expired. (errno 121)

job aborted:
rank: node: exit code[: error message]
0: node1: 1: process 0 exited without calling finalize
1: node0: 123
What could be the reason for this? Any ideas?