Channel: Clusters and HPC Technology

Where are the other 4 cores?


 

Hi, Intel support guys,

I am running tests on our Skylake computers. I am surprised to see that 4 cores per package are gone. Where are they?

Our computer system information is below:

Processor: Intel Xeon Gold 6148 CPU @ 2.40GHz (2 processors)

Installed memory: 384 GB

System type: 64-bit operating system, x64-based processor

OS: Windows Server 2016 Standard

Please see the following output, which shows that 4 cores per package are gone. Where are these 8 cores in total?

I am looking forward to hearing from you.

Thanks in advance

Best regards,

Dingjun

Computer Modelling Group Ltd.

Calgary, AB, Canada

 

 

VECTOR_SIMD_OPENMP_TEST
OMP: Info #211: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #209: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
OMP: Info #156: KMP_AFFINITY: 32 available OS procs
OMP: Info #158: KMP_AFFINITY: Nonuniform topology
OMP: Info #179: KMP_AFFINITY: 2 packages x 20 cores/pkg x 1 threads/core (32 total cores)
OMP: Info #213: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 4 maps to package 0 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 5 maps to package 0 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 6 maps to package 0 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 7 maps to package 0 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 8 maps to package 0 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 9 maps to package 0 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 10 maps to package 0 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 11 maps to package 0 core 17
OMP: Info #171: KMP_AFFINITY: OS proc 12 maps to package 0 core 18
OMP: Info #171: KMP_AFFINITY: OS proc 13 maps to package 0 core 19
OMP: Info #171: KMP_AFFINITY: OS proc 14 maps to package 0 core 20
OMP: Info #171: KMP_AFFINITY: OS proc 15 maps to package 0 core 24
OMP: Info #171: KMP_AFFINITY: OS proc 16 maps to package 0 core 25
OMP: Info #171: KMP_AFFINITY: OS proc 17 maps to package 0 core 26
OMP: Info #171: KMP_AFFINITY: OS proc 18 maps to package 0 core 27
OMP: Info #171: KMP_AFFINITY: OS proc 19 maps to package 0 core 28
OMP: Info #171: KMP_AFFINITY: OS proc 20 maps to package 1 core 0
OMP: Info #171: KMP_AFFINITY: OS proc 21 maps to package 1 core 1
OMP: Info #171: KMP_AFFINITY: OS proc 22 maps to package 1 core 2
OMP: Info #171: KMP_AFFINITY: OS proc 23 maps to package 1 core 3
OMP: Info #171: KMP_AFFINITY: OS proc 24 maps to package 1 core 4
OMP: Info #171: KMP_AFFINITY: OS proc 25 maps to package 1 core 8
OMP: Info #171: KMP_AFFINITY: OS proc 26 maps to package 1 core 9
OMP: Info #171: KMP_AFFINITY: OS proc 27 maps to package 1 core 10
OMP: Info #171: KMP_AFFINITY: OS proc 28 maps to package 1 core 11
OMP: Info #171: KMP_AFFINITY: OS proc 29 maps to package 1 core 12
OMP: Info #171: KMP_AFFINITY: OS proc 30 maps to package 1 core 16
OMP: Info #171: KMP_AFFINITY: OS proc 31 maps to package 1 core 17
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 4004 thread 0 bound to OS proc set {0}
  The number of processors available =       32
  The number of threads available    =       20
  HELLO from process        0
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8956 thread 1 bound to OS proc set {1}
  HELLO from process        1
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8820 thread 2 bound to OS proc set {2}
  HELLO from process        2
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9292 thread 3 bound to OS proc set {3}
  HELLO from process        3
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9752 thread 4 bound to OS proc set {4}
  HELLO from process        4
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 3776 thread 5 bound to OS proc set {5}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8464 thread 6 bound to OS proc set {6}
  HELLO from process        5
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 1416 thread 7 bound to OS proc set {7}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 3868 thread 8 bound to OS proc set {8}
  HELLO from process        6
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 7396 thread 9 bound to OS proc set {9}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9772 thread 10 bound to OS proc set {10}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9280 thread 11 bound to OS proc set {11}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9948 thread 12 bound to OS proc set {12}
  HELLO from process        7
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8712 thread 13 bound to OS proc set {13}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 6092 thread 14 bound to OS proc set {14}
  HELLO from process       11
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8532 thread 15 bound to OS proc set {15}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9892 thread 16 bound to OS proc set {16}
  HELLO from process       12
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 10640 thread 17 bound to OS proc set {17}
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 9060 thread 18 bound to OS proc set {18}
  HELLO from process       14
OMP: Info #249: KMP_AFFINITY: pid 10068 tid 8908 thread 19 bound to OS proc set {19}
  HELLO from process       18
  HELLO from process       16
  HELLO from process       19
  HELLO from process       13
  HELLO from process        8
  HELLO from process       17
  HELLO from process       15
  HELLO from process       10
  HELLO from process        9
matrix multiplication completed
  Elapsed wall clock time 2 =    133.379
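For reference, the two summary lines above ("processors available" and "threads available") presumably come from standard OpenMP runtime queries such as omp_get_num_procs and omp_get_max_threads. The sketch below is not the original VECTOR_SIMD_OPENMP_TEST, just an assumed minimal probe that prints the same quantities:

! Minimal OpenMP probe (assumed example, not the original test); build with ifort -qopenmp.
program probe_topology
   use omp_lib
   implicit none
   integer :: tid

   ! OS processors visible to this process
   ! (what appears above as "The number of processors available").
   print *, 'The number of processors available = ', omp_get_num_procs()
   ! Default OpenMP team size (what appears above as "threads available").
   print *, 'The number of threads available    = ', omp_get_max_threads()

!$omp parallel private(tid)
   tid = omp_get_thread_num()
   print *, 'HELLO from thread ', tid
!$omp end parallel
end program probe_topology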


Announcing the Intel® Parallel Studio XE 2019 Beta Program


Join the Intel® Parallel Studio XE 2019 Beta Program today and, for a limited time, get early access to new features and an open invitation to tell us what you really think.

We want YOU to tell us what to improve so we can create high-quality software tools that meet your development needs.

Sign Up Now >

Top New Features in Intel® Parallel Studio XE 2019 Beta

  • Scale and perform on the path to exascale. Enable greater scalability and improve latency with the latest Intel® MPI Library.
  • Get better answers with less overhead. Focus more fully on useful data, CPU utilization of physical cores, and more using new data-selection support from Intel® VTune Amplifier’s Application Performance Snapshot.
  • Visualize parallelism. Interactively build, validate, and visualize algorithms using Intel® Advisor’s Flow Graph Analyzer.
  • Stay up-to-date with the latest standards:
    • Expanded C++17 and Fortran 2018 support
    • Full OpenMP* 4.5 and expanded OpenMP 5.0 support
    • Python* 3.6 and 2.7

New Features in Intel® MPI Library

  • Updated architecture to streamline fabric utilization through libfabric.
  • Implemented support for Intel® Omni-Path Architecture Multiple Endpoints (Multi-EP).
  • Cleaned up directory structure.
  • New format for MPI tuner.
  • Added impi_info utility as a technical preview feature.
  • Updated Hydra process manager.

New Features in Intel® Cluster Checker

  • Simplified execution of Intel® Cluster Checker with a single command.
  • New ‘-X’ option to get details of data collected and analysis test.
  • New feature to compare two snapshots of a cluster state to identify changes.
  • New option to refresh any missing or old data before analysis.
  • Added auto-node discovery when using SLURM.

To learn more, visit the Intel® Parallel Studio XE 2019 Beta page.

Then sign up to get started.

IMPI run error

Dear All,
    I compiled the VASP package with IMPI successfully, but when I run the program it stops with some MPI errors, listed below. Could anybody tell me how to fix this? Thanks!

Xiang YE

[0] MPI startup(): Intel(R) MPI Library, Version 2017 Update 4  Build 20170817 (id: 17752)
[0] MPI startup(): Copyright (C) 2003-2017 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[3] MPI startup(): Found 1 IB devices
[5] MPI startup(): Found 1 IB devices
[24] MPI startup(): Found 1 IB devices
[15] MPI startup(): Found 1 IB devices
[25] MPI startup(): Found 1 IB devices
[14] MPI startup(): Found 1 IB devices
[26] MPI startup(): Found 1 IB devices
[9] MPI startup(): Found 1 IB devices
[16] MPI startup(): Found 1 IB devices
[27] MPI startup(): Found 1 IB devices
[7] MPI startup(): Found 1 IB devices
[1] MPI startup(): Found 1 IB devices
[17] MPI startup(): Found 1 IB devices
[18] MPI startup(): Found 1 IB devices
[8] MPI startup(): Found 1 IB devices
[10] MPI startup(): Found 1 IB devices
[19] MPI startup(): Found 1 IB devices
[20] MPI startup(): Found 1 IB devices
[0] MPI startup(): Found 1 IB devices
[11] MPI startup(): Found 1 IB devices
[21] MPI startup(): Found 1 IB devices
[2] MPI startup(): Found 1 IB devices
[22] MPI startup(): Found 1 IB devices
[13] MPI startup(): Found 1 IB devices
[12] MPI startup(): Found 1 IB devices
[23] MPI startup(): Found 1 IB devices
[4] MPI startup(): Found 1 IB devices
[6] MPI startup(): Found 1 IB devices
[0] MPI startup(): Open 0 IB device: mlx5_0
[11] MPI startup(): Open 0 IB device: mlx5_0
[20] MPI startup(): Open 0 IB device: mlx5_0
[2] MPI startup(): Open 0 IB device: mlx5_0
[21] MPI startup(): Open 0 IB device: mlx5_0
[9] MPI startup(): Open 0 IB device: mlx5_0
[3] MPI startup(): Open 0 IB device: mlx5_0
[22] MPI startup(): Open 0 IB device: mlx5_0
[25] MPI startup(): Open 0 IB device: mlx5_0
[1] MPI startup(): Open 0 IB device: mlx5_0
[23] MPI startup(): Open 0 IB device: mlx5_0
[13] MPI startup(): Open 0 IB device: mlx5_0
[4] MPI startup(): Open 0 IB device: mlx5_0
[10] MPI startup(): Open 0 IB device: mlx5_0
[15] MPI startup(): Open 0 IB device: mlx5_0
[26] MPI startup(): Open 0 IB device: mlx5_0
[14] MPI startup(): Open 0 IB device: mlx5_0
[16] MPI startup(): Open 0 IB device: mlx5_0
[27] MPI startup(): Open 0 IB device: mlx5_0
[19] MPI startup(): Open 0 IB device: mlx5_0
[8] MPI startup(): Open 0 IB device: mlx5_0
[18] MPI startup(): Open 0 IB device: mlx5_0
[6] MPI startup(): Open 0 IB device: mlx5_0
[24] MPI startup(): Open 0 IB device: mlx5_0
[7] MPI startup(): Open 0 IB device: mlx5_0
[12] MPI startup(): Open 0 IB device: mlx5_0
[17] MPI startup(): Open 0 IB device: mlx5_0
[5] MPI startup(): Open 0 IB device: mlx5_0
[0] MPI startup(): Start 1 ports per adapter
[20] MPI startup(): Start 1 ports per adapter
[11] MPI startup(): Start 1 ports per adapter
[9] MPI startup(): Start 1 ports per adapter
[3] MPI startup(): Start 1 ports per adapter
[21] MPI startup(): Start 1 ports per adapter
[2] MPI startup(): Start 1 ports per adapter
[1] MPI startup(): Start 1 ports per adapter
[25] MPI startup(): Start 1 ports per adapter
[22] MPI startup(): Start 1 ports per adapter
[23] MPI startup(): Start 1 ports per adapter
[4] MPI startup(): Start 1 ports per adapter
[10] MPI startup(): Start 1 ports per adapter
[15] MPI startup(): Start 1 ports per adapter
[13] MPI startup(): Start 1 ports per adapter
[26] MPI startup(): Start 1 ports per adapter
[14] MPI startup(): Start 1 ports per adapter
[27] MPI startup(): Start 1 ports per adapter
[16] MPI startup(): Start 1 ports per adapter
[12] MPI startup(): Start 1 ports per adapter
[18] MPI startup(): Start 1 ports per adapter
[24] MPI startup(): Start 1 ports per adapter
[6] MPI startup(): Start 1 ports per adapter
[19] MPI startup(): Start 1 ports per adapter
[8] MPI startup(): Start 1 ports per adapter
[5] MPI startup(): Start 1 ports per adapter
[17] MPI startup(): Start 1 ports per adapter
[7] MPI startup(): Start 1 ports per adapter
[11] MPID_nem_ofacm_init(): Init
[0] MPID_nem_ofacm_init(): Init
[20] MPID_nem_ofacm_init(): Init
[9] MPID_nem_ofacm_init(): Init
[3] MPID_nem_ofacm_init(): Init
[21] MPID_nem_ofacm_init(): Init
[2] MPID_nem_ofacm_init(): Init
[1] MPID_nem_ofacm_init(): Init
[22] MPID_nem_ofacm_init(): Init
[25] MPID_nem_ofacm_init(): Init
[23] MPID_nem_ofacm_init(): Init
[11] MPI startup(): ofa data transfer mode
[0] MPI startup(): ofa data transfer mode
[20] MPI startup(): ofa data transfer mode
[4] MPID_nem_ofacm_init(): Init
[10] MPID_nem_ofacm_init(): Init
[26] MPID_nem_ofacm_init(): Init
[14] MPID_nem_ofacm_init(): Init
[15] MPID_nem_ofacm_init(): Init
[13] MPID_nem_ofacm_init(): Init
[27] MPID_nem_ofacm_init(): Init
[16] MPID_nem_ofacm_init(): Init
[12] MPID_nem_ofacm_init(): Init
[9] MPI startup(): ofa data transfer mode
[18] MPID_nem_ofacm_init(): Init
[3] MPI startup(): ofa data transfer mode
[8] MPID_nem_ofacm_init(): Init
[24] MPID_nem_ofacm_init(): Init
[19] MPID_nem_ofacm_init(): Init
[5] MPID_nem_ofacm_init(): Init
[21] MPI startup(): ofa data transfer mode
[17] MPID_nem_ofacm_init(): Init
[1] MPI startup(): ofa data transfer mode
[7] MPID_nem_ofacm_init(): Init
[2] MPI startup(): ofa data transfer mode
[6] MPID_nem_ofacm_init(): Init
[22] MPI startup(): ofa data transfer mode
[25] MPI startup(): ofa data transfer mode
[23] MPI startup(): ofa data transfer mode
[10] MPI startup(): ofa data transfer mode
[14] MPI startup(): ofa data transfer mode
[15] MPI startup(): ofa data transfer mode
[26] MPI startup(): ofa data transfer mode
[4] MPI startup(): ofa data transfer mode
[13] MPI startup(): ofa data transfer mode
[27] MPI startup(): ofa data transfer mode
[16] MPI startup(): ofa data transfer mode
[12] MPI startup(): ofa data transfer mode
[18] MPI startup(): ofa data transfer mode
[24] MPI startup(): ofa data transfer mode
[19] MPI startup(): ofa data transfer mode
[8] MPI startup(): ofa data transfer mode
[5] MPI startup(): ofa data transfer mode
[6] MPI startup(): ofa data transfer mode
[7] MPI startup(): ofa data transfer mode
[17] MPI startup(): ofa data transfer mode
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(805): fail failed
MPID_Init(1866)......: fail failed
MPIR_Comm_commit(711): fail failed
(unknown)(): Other MPI error

Problem compiling with Intel MPI 2018.2 and ifort 15.0.3


Hi Everyone,

I just installed the newest version of the Intel MPI Library (2018.2.199); previously I was using Open MPI. I am using ifort 15.0.3.
I am trying to compile the following test program:

program main
    use mpi_f08
    implicit none
    integer :: rank, size, len
    character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version

    call MPI_INIT()
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, size)
    call MPI_GET_LIBRARY_VERSION(version, len)

    print *, "rank:", rank
    print *, "size:",size
    print *, "version: "//version
    print *, ' No Errors'

    call MPI_FINALIZE()
end

When I use Open MPI it works fine. However, I am getting the following errors with Intel MPI:

% mpiifort test_F08.f90 
test_F08.f90(2): error #7012: The module file cannot be read.  Its format requires a more recent F90 compiler.   [MPI_F08]
    use mpi_f08
--------^
test_F08.f90(8): error #6404: This name does not have a type, and must have an explicit type.   [MPI_COMM_WORLD]
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank)
-----------------------^
test_F08.f90(5): error #6279: A specification expression object must be a dummy argument, a COMMON block object, or an object accessible through host or use association.   [MPI_MAX_LIBRARY_VERSION_STRING]
    character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version
------------------^
test_F08.f90(5): error #6591: An automatic object is invalid in a main program.   [VERSION]
    character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version
-----------------------------------------------------^
compilation aborted for test_F08.f90 (code 1)

So, do I have to use the same version of ifort that was used to build the Intel MPI modules? That is not listed in the requirements for using Intel MPI.
Why does Intel MPI not create new module files using the Fortran compiler available on the system?
Is there anything I can do to use Intel MPI with my compiler?
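For comparison, a variant of the same test that only uses the older mpi module (which is generally less tied to the compiler version the library's modules were built with) is sketched below; whether it sidesteps the module-version error here is an assumption, not a verified fix.

! Sketch using the older "mpi" module: integer handles and explicit ierror arguments.
program main
    use mpi
    implicit none
    integer :: rank, nprocs, reslen, ierr
    character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: version

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
    call MPI_GET_LIBRARY_VERSION(version, reslen, ierr)

    print *, "rank:", rank
    print *, "size:", nprocs
    print *, "version: "//trim(version)

    call MPI_FINALIZE(ierr)
end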

Thanks for your help,

Hector

Support for NVIDIA GPUDirect RDMA?


Does Intel MPI support GPUDirect RDMA with NVIDIA drivers and CUDA Toolkit 9.x installed?

Is there any documentation on which drivers to install and which fabric-selection environment variables to set?

Thanks

Ron

Interested in buying a "used" cluster edition compiler for Linux


My understanding is that Intel allows one to sell and transfer a license to someone else. I am a small-scale open-source developer and can't afford the price of the latest Cluster Edition Linux compilers. Send me a note if you have an older version you wouldn't mind transferring to me. For my needs, anything 2015 or newer would suffice.

adding further compute nodes


Hi,

Is there a need to re-install Intel Parallel Studio in the case where I have added further compute nodes to my cluster? There are two InfiniBand islands; ibstat shows:

CA 'mlx4_0', CA type: MT4099 and CA 'mlx4_1', CA type: MT26428. The latest compute nodes are attached to the MT4099 adapter.

These provider errors are only present in the 'newer node' context:

[2] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
[10] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
[12] MPI startup(): dapl fabric is not available and fallback fabric is not enabled
node009:UCM:2d97:570fa700: 1249 us(1249 us):  open_hca: device mlx4_0 not found
node009:UCM:2d9f:1626f700: 1262 us(1262 us):  open_hca: device mlx4_0 not found
node009:UCM:2da1:7f214700: 1102 us(1102 us):  open_hca: device mlx4_0 not found

 

Regards

Gert

Attachment: ib_provider.txt (5.87 KB)

Visual Studio project settings to instrument for Trace Analyzer


Hello,

I'm just getting started with Intel MPI and am trying to understand how to use Trace Analyzer. My understanding is that linking with vt.lib and running an MPI application is sufficient to cause a *.stf file to be emitted. I have a simple Hello World MPI application. After linking with vt.lib and running through mpiexec, I see no .stf output.

There's not much more information to add. The setup could not be simpler. What am I missing?

Jeff


Issue with MPI_Sendrecv


Hello,

I am experiencing issues while using MPI_Sendrecv on multiple machines. In the code I am sending a vector around a ring in parallel: each process sends data to the subsequent process and receives data from the preceding process. Surprisingly, the output of the first execution of the SEND_DATA routine is correct, while the output of the second execution is incorrect. The code and the output are below.

PROGRAM SENDRECV_REPROD
USE MPI
USE ISO_FORTRAN_ENV,ONLY: INT32
IMPLICIT NONE
INTEGER(KIND=INT32) :: STATUS(MPI_STATUS_SIZE) 
INTEGER(KIND=INT32) :: RANK,NUM_PROCS,IERR

CALL MPI_INIT(IERR)
CALL MPI_COMM_RANK(MPI_COMM_WORLD,RANK,IERR)
CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NUM_PROCS,IERR)

CALL SEND_DATA(RANK,NUM_PROCS)
CALL SEND_DATA(RANK,NUM_PROCS)

CALL MPI_BARRIER(MPI_COMM_WORLD,IERR)  
CALL MPI_FINALIZE(IERR)

END PROGRAM

SUBROUTINE SEND_DATA(RANK,NUM_PROCS)
USE ISO_FORTRAN_ENV,ONLY: INT32,REAL64
USE MPI
IMPLICIT NONE
INTEGER(KIND=INT32),INTENT(IN) :: RANK
INTEGER(KIND=INT32),INTENT(IN) :: NUM_PROCS
INTEGER(KIND=INT32) :: IERR,ALLOC_ERROR
INTEGER(KIND=INT32) :: VEC_SIZE,I_RANK,RANK_DESTIN,RANK_SOURCE,TAG_SEND,TAG_RECV
REAL(KIND=REAL64), ALLOCATABLE :: COMM_BUFFER(:),VEC1(:)
INTEGER(KIND=INT32) :: MPI_COMM_STATUS(MPI_STATUS_SIZE) 



! Allocate communication arrays.

VEC_SIZE = 374454
ALLOCATE(COMM_BUFFER(VEC_SIZE),STAT=ALLOC_ERROR)
ALLOCATE(VEC1(VEC_SIZE),STAT=ALLOC_ERROR)



! Define destination and source ranks for sending and receiving messages.

RANK_DESTIN = MOD((RANK+1),NUM_PROCS)
RANK_SOURCE = MOD((RANK+NUM_PROCS-1),NUM_PROCS)

TAG_SEND = RANK+1
TAG_RECV = RANK
IF (RANK==0) TAG_RECV=NUM_PROCS

VEC1=RANK
COMM_BUFFER=0.0_REAL64
        
    
CALL MPI_BARRIER(MPI_COMM_WORLD,IERR)
        
DO I_RANK=1,NUM_PROCS
    IF (RANK==I_RANK-1) WRITE(*,*) 'R',RANK, VEC1(1),'B', COMM_BUFFER(1)
ENDDO

CALL MPI_SENDRECV(VEC1(1),VEC_SIZE,MPI_DOUBLE_PRECISION,RANK_DESTIN,TAG_SEND,COMM_BUFFER(1),&
                    VEC_SIZE,MPI_DOUBLE_PRECISION,RANK_SOURCE,TAG_RECV,MPI_COMM_WORLD,MPI_COMM_STATUS,IERR)
        
DO I_RANK=1,NUM_PROCS
    IF (RANK==I_RANK-1) WRITE(*,*) 'R' ,  RANK , VEC1(1),'A', COMM_BUFFER(1)
ENDDO



END SUBROUTINE SEND_DATA 

Output of four processes run on four machines:

 R           0  0.000000000000000E+000 B  0.000000000000000E+000
 R           1   1.00000000000000      B  0.000000000000000E+000
 R           2   2.00000000000000      B  0.000000000000000E+000
 R           3   3.00000000000000      B  0.000000000000000E+000
 R           0  0.000000000000000E+000 A   3.00000000000000
 R           1   1.00000000000000      A  0.000000000000000E+000
 R           2   2.00000000000000      A   1.00000000000000
 R           3   3.00000000000000      A   2.00000000000000
 R           0  0.000000000000000E+000 B  0.000000000000000E+000
 R           1   1.00000000000000      B  0.000000000000000E+000
 R           2   2.00000000000000      B  0.000000000000000E+000
 R           3   3.00000000000000      B  0.000000000000000E+000
 R           0  0.000000000000000E+000 A   2.00000000000000
 R           1   1.00000000000000      A   3.00000000000000
 R           2   2.00000000000000      A  0.000000000000000E+000
 R           3   3.00000000000000      A   1.00000000000000

 

As you can see, the output of the first SEND_DATA execution is different from the second. The results are correct if I run the reproducer on a single machine with multiple processes. I am compiling the code with mpiifort for the Intel(R) MPI Library 2017 Update 3 for Linux* (ifort version 17.0.4)

and running with mpirun version Intel(R) MPI Library for Linux* OS, Version 2017 Update 3 Build 20170405.

Do you have any idea what could be the source of this issue?

Thank you,
Piotr

mpirun: unexpected disconnect completion event


 

Hi,

I've been running on 5 (distributed memory) nodes (each has 20 processors) by using mpirun -n 5 -ppn 1 -hosts nd1,nd2,nd3,nd4,nd5.

Sometimes it works, sometimes it gives inaccurate results, and sometimes it crashes with the error:

"[0:nd1] unexpected disconnect completion event from [35:nd2] Fatal error in PMPI_Comm_dup: Internal MPI error!, error stack ...". 

Any suggestions to fix this communication error while running on multiple nodes with MPI (2017 Update 2)?

I already set the stack size to unlimited in my .rc file. I tested this with two different applications (one is the well-known distributed-memory solver MUMPS) and I have the same issue with both. This is not a very memory-demanding job. mpirun works perfectly on 1 node; this only happens on multiple nodes (even 2).

Thanks

 

Support for the mpi_f08 Fortran module in MPI 2019.0.045 beta


Hi!

I am testing Intel Parallel Studio 2019.0.045 beta for Windows. The Intel MPI library that comes with it does not provide the Fortran module mpi_f08, whereas the Linux version does. Why is this module not supported on Windows?

Are you planning to support the mpi_f08 module for Windows in the future?
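For context, the kind of minimal probe that fails to compile when no mpi_f08 module is shipped might look like the sketch below (an assumed example, not the exact test used):

program f08_probe
    use mpi_f08          ! compilation fails right here if the module is not provided
    implicit none
    integer :: rank
    call MPI_Init()
    call MPI_Comm_rank(MPI_COMM_WORLD, rank)
    print *, 'mpi_f08 is available, rank = ', rank
    call MPI_Finalize()
end program f08_probe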

Thanks for your help,

Hector

Trace Collector + Fortran 2008


Hi,

I have observed that when trying to trace the following program with mpiexec -trace, everything works fine as long as I stick with "use mpi". If I change that to "use mpi_f08", I do not get a trace file.
The reason I'm interested in using mpi_f08 is that I have an application to trace that uses the MPI shared-memory model, and it seems that the call to

MPI_Comm_split_type

that is used below is only possible with the mpi_f08 module, right?

Any hints on why I cannot trace that program when using "use mpi_f08"?

Some extra Info:

$ mpiifort -o shm shm.f90
$ mpiifort --version
ifort (IFORT) 18.0.2 20180210

$ mpiexec -trace -np 4 shm

 

 

program nicks_program

   ! use mpi_f08
   use mpi 

   implicit none

   integer :: wrank, wsize, sm_rank, sm_size, ierr, send
   type(MPI_COMM) :: MPI_COMM_SHARED 

   call MPI_Init(ierr)
   call MPI_comm_rank(MPI_COMM_WORLD, wrank, ierr)
   call MPI_comm_size(MPI_COMM_WORLD, wsize, ierr)

   ! call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, MPI_COMM_SHARED, ierr)
   send = wrank


   call MPI_Bcast( send, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr )
   ! call MPI_Bcast( send, 1, MPI_INTEGER, 0, MPI_COMM_SHARED, ierr )

   write(*,*) 'send = ', send
   write(*,*) 'ierr = ', ierr

   call MPI_Finalize(ierr)
end

 

 

 

IMPI w/ Slurm


I'm working at a site configured with IMPI (2016.4.072) / Slurm (17.11.4).  The MpiDefault is none.

When I run my MPICH2 code (defaulting to --mpi=none)

     srun -N 2 -n 4 -l -vv ...

I get (trimming out duplicate error messages from other ranks)

0: PMII_singinit: execv failed: No such file or directory

0: [unset]:   This singleton init program attempted to access some feature

0: [unset]:   for which process manager support was required, e.g. spawn or universe_size.

0: [unset]:   But the necessary mpiexec is not in your path.

0: [unset]: write_line error; fd=-1 buf=:cmd=get kvsname=singinit_kvs_18014_0 key=P2-hostname

0: :

0: system msg for write_line failure : Bad file descriptor

0: [unset]: write_line error; fd=-1 buf=:cmd=get kvsname=singinit_kvs_18014_0 key=P3-hostname

0: :

0: system msg for write_line failure : Bad file descriptor

0: 2018-05-25 09:00:14  2: MPI startup(): Multi-threaded optimized library

0: 2018-05-25 09:00:14  2: DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx4_0-1u

0: 2018-05-25 09:00:14  2: MPI startup(): DAPL provider ofa-v2-mlx4_0-1u

0: 2018-05-25 09:00:14  2: MPI startup(): shm and dapl data transfer modes

0: [unset]: write_line error; fd=-1 buf=:cmd=get kvsname=singinit_kvs_18417_0 key=P1-businesscard-0

0: :

0: system msg for write_line failure : Bad file descriptor

0: [unset]: write_line error; fd=-1 buf=:cmd=get kvsname=foobar key=foobar

0: :

0: system msg for write_line failure : Bad file descriptor

0: [unset]: write_line error; fd=-1 buf=:cmd=get kvsname=singinit_kvs_18417_0 key=P1-businesscard-0

0: :

0: system msg for write_line failure : Bad file descriptor

0: Fatal error in PMPI_Init_thread: Other MPI error, error stack:

0: MPIR_Init_thread(784).................:

0: MPID_Init(1332).......................: channel initialization failed

0: MPIDI_CH3_Init(141)...................:

0: dapl_rc_setup_all_connections_20(1388): generic failure with errno = 872614415

0: getConnInfoKVS(849)...................: PMI_KVS_Get failed

 

If I run the same code with

 

   srun --mpi=pmi2 ...

 

it works fine.

 

A couple of questions/comments:

1. In neither case do I set I_MPI_PMI_LIBRARY, which I thought I needed to -- how else does IMPI find the Slurm PMI?  This might be why --mpi=none is failing, but for the moment, I can't set the variable because I can't find libpmi[1,2,x].so.

2. I would think that since none is the default, it should work.  Under what conditions would none fail, but pmi2 work?  Is it because IMPI supports pmi2?

3. If I do need to set I_MPI_PMI_LIBRARY, why does pmi2 still work without setting I_MPI_PMI_LIBRARY?  Or do I not need to set it when using IMPI?

4. I'm still trying to understand a bit more of the correlation between libpmi.so and mpi_*.so.  libpmi.so is the Slurm PMI library, correct?  And mpi_* are the Slurm plug-in libraries (e.g. mpi_none, mpi_pmi2, etc.).  How do these libraries fit together?

 

Thanks,

Raymond

Cannot use MPI 2019.0.045 beta


Hi there!

I am unable to use the Intel MPI library from Parallel Studio XE 2019 beta on Windows.

I am trying to compile the following code with mpicc:
http://people.sc.fsu.edu/~jburkardt/c_src/hello_mpi/hello_mpi.c

It seems to compile ok.

However, when I run it with mpiexec there is no output.
mpiexec -n 1 hello_mpi.exe

I don't have this problem with Intel MPI 2018.

Thanks for your help,

Hector

Error when opening command prompt with Intel compiler


After installing IPS XE 2019 Beta, I am experiencing a problem with IPS XE 2017 Update 2 when running the "Intel 64 Visual Studio 2015 environment". I get the error "The application was unable to start correctly (0xc0000005). Click OK to close the application." in a window titled "fi_info.exe - Application Error". The corresponding command prompt window I am opening is titled "Intel(R) MPI Library 2019 Pre-Release (Beta) for Windows* Target Build Environment for Intel(R) 64 applications", and the same text also appears in the command prompt window. After I click OK, the command prompt window contains the following:

Intel(R) MPI Library 2019 Pre-Release (Beta) for Windows* Target Build Environment for Intel(R) 64 applications
Copyright 2007-2018 Intel Corporation.

Intel(R) MPI Library 2017 Update 2 for Windows* Target Build Environment for Intel(R) 64 applications
Copyright (C) 2007-2017 Intel Corporation. All rights reserved.

Copyright (C) 1985-2017 Intel Corporation. All rights reserved.
Intel(R) Compiler 17.0 Update 2 (package 187)

Why would MPI 2019 be used in IPS XE 2017?

The same happens when I try running "Intel 64 Visual Studio 2015 environment" in IPS XE 2019 Beta instead of in IPS XE 2017 Update 2.

Can I ignore the problem? I am not using MPI in my applications.


MPI_File_get_size, max file limit on Windows 10


It seems the maximum file size returned is limited to a 4-byte unsigned integer, even though MPI_Offset is 8 bytes. The attached program fails in MPI_File_get_size for files larger than 4 GB. Is there a way around this?

This is for Windows 10, mpicc.bat for the Intel(R) MPI Library 2018 Update 2 for Windows*
Copyright 2007-2018 Intel Corporation.

Microsoft (R) C/C++ Optimizing Compiler Version 19.14.26430 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

Attached program shows the issue.

Attachment: write.c (792 bytes)
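The attached write.c is the actual reproducer and is not shown here. As a rough illustration of the check it performs, a Fortran analogue might look like the sketch below (the file name is a placeholder):

program check_file_size
    use mpi
    implicit none
    integer :: fh, ierr
    ! MPI_Offset corresponds to an 8-byte integer kind in Fortran as well.
    integer(kind=MPI_OFFSET_KIND) :: fsize

    call MPI_INIT(ierr)
    ! 'bigfile.dat' is a placeholder name; any file larger than 4 GB exposes the issue.
    call MPI_FILE_OPEN(MPI_COMM_WORLD, 'bigfile.dat', MPI_MODE_RDONLY, &
                       MPI_INFO_NULL, fh, ierr)
    call MPI_FILE_GET_SIZE(fh, fsize, ierr)
    print *, 'reported size (bytes) = ', fsize
    call MPI_FILE_CLOSE(fh, ierr)
    call MPI_FINALIZE(ierr)
end program check_file_size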

MPI_Comm_spawn with a large number of children hangs at MPI_Init


Hello,

I have a Fortran 90 MPI program running on a Linux cluster, with the intel/2018.0.2 and intelmpi/2018.0.2 compilers, which uses MPI_COMM_SPAWN to spawn one child process of a C++ MPI program per parent process. The idea is that the parent processes are mapped evenly across the nodes; each of them spawns a child, waits for a blocking send/recv from it to signal completion, and then goes on to work with the output of the child.

Here is the call I use to spawn the children:

call MPI_COMM_SPAWN('MUSIC', argv, 1, info, 0, &
        MPI_COMM_SELF, MPI_COMM_CHILD, MPI_ERRCODES_IGNORE, ierr)

So maxprocs=1 process is spawned by each parent, using its own communicator, concurrently by all the parent processes (or whenever they reach this call).
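For context, the completion handshake described above amounts to the child signalling its parent over the spawn intercommunicator. The sketch below is purely illustrative of that handshake (the real child is the C++ MUSIC code, and the tag and message contents here are assumptions):

! Illustrative child-side handshake only; not the actual MUSIC child program.
program child_signal
    use mpi
    implicit none
    integer :: parent_comm, done, ierr

    call MPI_INIT(ierr)
    ! A spawned child retrieves the intercommunicator to the parent that spawned it.
    call MPI_COMM_GET_PARENT(parent_comm, ierr)
    done = 1
    ! Signal completion to parent rank 0; the parent posts a matching blocking MPI_RECV
    ! on the intercommunicator returned by MPI_COMM_SPAWN (MPI_COMM_CHILD above).
    call MPI_SEND(done, 1, MPI_INTEGER, 0, 0, parent_comm, ierr)
    call MPI_FINALIZE(ierr)
end program child_signal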

I have tested the code and it works for 8 processes (8 parents + 8 children = 16 total) spread over 2 nodes. I'm now trying to scale up to 128 processes spread over 32 nodes, but all of the child processes are hanging (I think) at MPI_Init(). I can run top on the nodes and see that they (the correct number of them) are running, so they have been spawned, but they aren't progressing through the program.

Here is the tail of stdout with I_MPI_DEBUG=10:

[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx5_0-1u
[0] DAPL startup(): trying to open DAPL provider from I_MPI_DAPL_PROVIDER: ofa-v2-mlx5_0-1u
[0] I_MPI_dlopen_dat(): trying to load default dat library: libdat2.so.2
[0] I_MPI_dlopen_dat(): trying to load default dat library: libdat2.so.2
[0] MPI startup(): DAPL provider ofa-v2-mlx5_0-1u
[0] MPI startup(): DAPL provider ofa-v2-mlx5_0-1u
[0] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): shm and dapl data transfer modes
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[0] MPI startup(): DAPL provider ofa-v2-mlx5_0-1u
[0] MPI startup(): DAPL provider ofa-v2-mlx5_0-1u
[0] MPI startup(): shm and dapl data transfer modes
[0] MPI startup(): shm and dapl data transfer modes
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): User set DAPL collective mask = 0000
[0] MPID_nem_init_dapl_coll_fns(): Effective DAPL collective mask = 0000

 

This is what suggests to me that the children are hanging at either startup or MPI_Init(), since these are some, but not all, of the "MPI startup():" messages they should produce on a successful startup. By inspection of the successful startups, after the above there should be some messages about the cores on each node and then:

[0] MPI startup(): I_MPI_INFO_CACHE3=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,\
1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_CACHES=3
[0] MPI startup(): I_MPI_INFO_CACHE_SHARE=2,2,64
[0] MPI startup(): I_MPI_INFO_CACHE_SIZE=32768,1048576,28835840
[0] MPI startup(): I_MPI_INFO_CORE=0,1,2,3,4,8,9,10,11,12,16,17,18,19,20,24,25,26,27,28,0,1,2,3,4,8,9,10,11,12,16\
,17,18,19,20,24,25,26,27,28,0,1,2,3,4,8,9,10,11,12,16,17,18,19,20,24,25,26,27,28,0,1,2,3,4,8,9,10,11,12,16,17,18,\
19,20,24,25,26,27,28
[0] MPI startup(): I_MPI_INFO_C_NAME=Unknown
[0] MPI startup(): I_MPI_INFO_DESC=1342177280
[0] MPI startup(): I_MPI_INFO_FLGB=-744488965
[0] MPI startup(): I_MPI_INFO_FLGC=2147417079
[0] MPI startup(): I_MPI_INFO_FLGCEXT=8
[0] MPI startup(): I_MPI_INFO_FLGD=-1075053569
[0] MPI startup(): I_MPI_INFO_FLGDEXT=201326592
[0] MPI startup(): I_MPI_INFO_LCPU=80
[0] MPI startup(): I_MPI_INFO_MODE=775
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=mlx5_0:0
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_INFO_PACK=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,\
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_SIGN=329300
[0] MPI startup(): I_MPI_INFO_STATE=0
[0] MPI startup(): I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,\
0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_VEND=1
[0] MPI startup(): I_MPI_PIN_INFO=x0,1,2,3,4,5,6,7,8,9,40,41,42,43,44,45,46,47,48,49
[0] MPI startup(): I_MPI_PIN_MAPPING=4:0 0,1 10,2 20,3 30

 

which are the last messages produced by the successful startup of the parent processes (and similarly by the children in the 8-process case).

There is another thread https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technolog... where, on Windows, there was trouble with a large number of child processes, and they had some success switching impi.dll to the debug version, although they were observing an outright crash rather than a hang.

Any help or suggestions on how to debug this are greatly appreciated.

 

 

Fatal error in PMPI_Type_size: Invalid datatype, error stack:


I am trying to use the -trace flag to get an .stf output file for Trace Analyzer. I run my job using this script:

#!/bin/bash -l
#PBS -l nodes=2:ppn=40,walltime=00:10:00
#PBS -N GranularGas
#PBS -o granularjob.out -e granularjob.err

export MPIRUN=/apps/intel/ComposerXE2018/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin/mpirun
export CODEPATH=${HOME}/GranularGas/1.1_parallel_GranularGas/build
source /apps/intel/ComposerXE2018/itac/2018.2.020/intel64/bin/itacvars.sh

cd ${CODEPATH}
${MPIRUN} -trace ${CODEPATH}/GranularGas

After submitting my job, I get the following error:

Fatal error in PMPI_Type_size: Invalid datatype, error stack:
PMPI_Type_size(131): MPI_Type_size(INVALID DATATYPE) failed
PMPI_Type_size(76).: Invalid datatype

and I get a ".prot" file. Where does this error come from? How can I fix it?

For reference, I am using Intel compiler 18.0.2 and Intel MPI 20180125.
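As background, (P)MPI_Type_size only accepts a valid datatype handle; passing an invalid handle (for example MPI_DATATYPE_NULL or an already freed type) is the usual cause of this error class. The sketch below is illustrative only and is not taken from the GranularGas code:

program type_size_demo
    use mpi
    implicit none
    integer :: tsize, ierr

    call MPI_INIT(ierr)
    ! A valid predefined datatype handle; an invalid handle here would trigger
    ! the "Invalid datatype" error reported by (P)MPI_Type_size.
    call MPI_TYPE_SIZE(MPI_DOUBLE_PRECISION, tsize, ierr)
    print *, 'size of MPI_DOUBLE_PRECISION (bytes) = ', tsize
    call MPI_FINALIZE(ierr)
end program type_size_demo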

Compile Error with 2018.3.222 version mpiicc


Not sure if this is the right place to ask. Forgive me if I should ask someplace else.

I tried to compile the EPCC OpenMP/MPI benchmark with the Intel tools version 2018.3.222, and it failed with this error:

(The source code can be downloaded from here: https://www.epcc.ed.ac.uk/research/computing/performance-characterisatio...)

mpiicc -qopenmp -O3  -o mixedModeBenchmark parallelEnvironment.o benchmarkSetup.o output.o pt_to_pt_pingpong.o pt_to_pt_pingping.o pt_to_pt_multiPingpong.o pt_to_pt_multiPingping.o pt_to_pt_haloexchange.o collective_barrier.o collective_broadcast.o collective_scatterGather.o collective_reduction.o collective_alltoall.o mixedModeBenchmarkDriver.o 

benchmarkSetup.o:(.bss+0x0): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

output.o:(.bss+0x0): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

pt_to_pt_pingpong.o:(.bss+0xd0): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

pt_to_pt_pingping.o:(.bss+0xb4): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

 

....

collective_alltoall.o:(.bss+0x60): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

mixedModeBenchmarkDriver.o:(.bss+0x0): multiple definition of `myThreadID'

parallelEnvironment.o:(.bss+0xc): first defined here

make: *** [mixedModeBenchmark] Error 1

 

 

However, I can compile it with version 15.0.3.187 without error.

What is the reason for the error with the 2018.3.222 version? Thanks a lot.

Zero-sized .stf file generated from ITAC


I am actually trying to use Intel Trace Analyzer and Collector (ITAC) to profile my MPI code written in Fortran.

The code does execute MPI_Init at the start and MPI_Finalize at the end.

 

Following the thread of [ https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technolog... ] and

http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2rc1-u... ],

I have learned that either setting LD_PRELOAD prior to the run or adding the '-itac' option when compiling and linking works in my case.

 

The problem is that, although I was able to obtain .stf and .prot files, the size of the .stf file is zero, so I am unable to open it with ITAC.

In addition, I have also noticed that when I try to run the executable compiled with the '-itac' option, it works only in the single-node case and hangs in the multi-node case.

 

For your information, I'm using "Intel® Parallel Studio XE 2018 Update 3" for the compilers, mvapich2.2.2-qlc-intel18 for the MPI library (which is compatible with ITAC according to the check_compatibility.c test), and CentOS 6.4 for the OS.

 

Any helpful comment will be deeply appreciated.
