
mmap() + MPI one-sided communication fails when DAPL UD enabled


Hi!

I used a trick to read a page located on a remote machine's disk: each machine mmap()s the whole file and creates an MPI one-sided communication window on the mapped region.

It works fine, but it prints the following error messages when I enable DAPL UD by setting I_MPI_DAPL_UD=1:

XXX001:UCM:1d1a:84d2ab40: 271380 us(271380 us):  DAPL ERR reg_mr Cannot allocate memory

[0:XXX001] rtc_register failed 196608 [0] error(0x30000):  unknown error

 

Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

internal ABORT - process 0

XXX002:UCM:31e2:27bacb40: 263683 us(263683 us):  DAPL ERR reg_mr Cannot allocate memory

[1:XXX002] rtc_register failed 196608 [1] error(0x30000):  unknown error

 

Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

 

I have attached the code and the script that I used.

 

#include <stdio.h>
#include <mpi.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
#include <iostream>

#define READ_SIZE (64 * 1024)

int main( int argc, char **argv ) {

  MPI_Init( &argc, &argv );

  int rank, size;
  MPI_Comm_rank( MPI_COMM_WORLD, &rank );
  MPI_Comm_size( MPI_COMM_WORLD, &size );

  // Each rank mmap()s its own local file (the files are small, only about 14 MB each).
  int file_num = rank + 1;
  char fname[256];
  snprintf( fname, 256, "/mnt/sdb1/test_data/%d", file_num );
  int fd = open( fname, O_RDONLY );

  struct stat st;
  fstat( fd, &st );
  size_t len = st.st_size;
  void *faddr = mmap( NULL, len, PROT_READ, MAP_PRIVATE, fd, 0 );

  // Expose the whole mapped file through an MPI one-sided window.
  MPI_Win win;
  MPI_Win_create( faddr, len, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win );

  MPI_Barrier( MPI_COMM_WORLD );

  // Read the first READ_SIZE bytes of the next rank's file via MPI_Get.
  int next = (rank + 1) % size;
  char values[READ_SIZE + 1];
  MPI_Win_lock( MPI_LOCK_SHARED, next, 0, win );
  MPI_Get( values, READ_SIZE, MPI_CHAR, next, 0, READ_SIZE, MPI_CHAR, win );
  MPI_Win_unlock( next, win );

  if( rank == 0 ) {
    values[READ_SIZE] = '\0';   // terminate the buffer so it can be printed as a string
    std::cout << values << std::endl;
    std::cout << std::flush;
  }

  MPI_Win_free( &win );
  munmap( faddr, len );
  close( fd );

  MPI_Finalize();

  return 0;
}
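
For completeness, I compile it roughly like this (the Intel MPI C++ wrapper and the source/binary names here are just what I use; adjust them to your setup):

# build the test program with the Intel MPI C++ compiler wrapper (assumed file names)
mpiicpc -o test2 test2.cpp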

 

and I ran the above program with the following script:

 

#!/bin/bash

export I_MPI_FABRICS=dapl
export I_MPI_DAPL_UD=1

mpiexec.hydra -genvall -machinefile ~/machines -n 2 -ppn 1 ${PWD}/test2
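
For reference, the same launch works fine for me when DAPL UD is left disabled, i.e. the script above with the I_MPI_DAPL_UD export removed (everything else unchanged):

#!/bin/bash

export I_MPI_FABRICS=dapl
# I_MPI_DAPL_UD is not set, so UD stays at its default (disabled) and the program runs cleanly

mpiexec.hydra -genvall -machinefile ~/machines -n 2 -ppn 1 ${PWD}/test2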

 

 

Experimental environment:

OS: CentOS 6.4 (Final)
CPU: 2 × Intel® Xeon® E5-2450 @ 2.10 GHz (8 physical cores each)
RAM: 32 GB per node
Network: InfiniBand, Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
Mellanox InfiniBand driver: MLNX_OFED_LINUX-3.1-1.1.0.1 (OFED-3.1-1.1.0): 3.19.0

 

Thanks,

