Hi,
Below is a simple reproduction case for the issue we're facing:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    int rank;
    MPI_Group group;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_group(MPI_COMM_WORLD, &group);

    if (rank == 0) {
        printf("rank 0: about to send\n");
        MPI_Ssend(NULL, 0, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0: send completed\n");
    } else {
        MPI_Request req[2];
        int which;

        MPI_Isend(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitany(2, req, &which, MPI_STATUS_IGNORE);

        if (which == 0) {
            printf("rank 1: send succeeded; cancelling receive request\n");
            MPI_Cancel(&req[1]);
            MPI_Wait(&req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: receive succeeded; cancelling send request\n");
            MPI_Cancel(&req[0]);
            MPI_Wait(&req[0], MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}
This program outputs the following, after which it hangs indefinitely:
rank 0: about to send
rank 1: send succeeded; cancelling receive request
I understand that this is caused by the "eager completion" of MPI_Isend() on rank 1. I also understand that the behaviour of a program that initiates an unmatched operation is undefined. However, I don't believe that is the case here, since I do eventually call MPI_Cancel() on the request. If that were not enough, wouldn't it imply that a program that simply does MPI_Isend(...); MPI_Cancel(...); MPI_Wait(...); is also incorrect?
I also noticed that changing the MPI_Isend() into MPI_Issend() makes the program work as expected:
rank 0: about to send
rank 0: send completed
rank 1: receive succeeded; cancelling send request
So, to keep it short, my questions are:
- Is the initial (MPI_Isend()) version of my program an incorrect MPI program, whose behaviour is undefined?
- If so, then could you please explain why and point me to the relevant section of the MPI standard or any other resources that would clarify these matters for me?
- Is the MPI_Issend() version of my program also incorrect?
- If MPI_Issend() still doesn't make the program correct, can I at least be sure that, with the Intel implementation, it will always work as expected? Or is it just a coincidence that it does?
Many thanks to anyone willing to help me with this!
- Adrian