Communication and Synchronization

Global-view model

reduction directive

The reduction construct performs a reduction operation among nodes.

[F] !$xmp reduction ( reduction-kind : variable [, variable ]... ) [on nodes-ref | template-ref] [async ( async-id )]
[C] #pragma xmp reduction ( reduction-kind : variable [, variable ]... ) [on nodes-ref | template-ref] [async ( async-id )]

  • Each variable in the list must either not be aligned or be replicated among the nodes of the node set specified by the on clause.
  • reduction-kind must be one of the following operators:

    [F] +, *, -, .and., .or., .eqv., .neqv., max, min, iand, ior, ieor
    [C] +, *, -, &, |, ^, &&, ||, max, min
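
To make the semantics concrete, the following plain-C sketch emulates sequentially what a reduction with kind + or max leaves in the variable on each node. It is not XMP code: the node count NNODES and the helper names reduce_sum and reduce_max are assumptions made for illustration, and each array slot stands for one node's private copy of the variable.

```c
#include <stddef.h>

#define NNODES 4  /* hypothetical node count, chosen for illustration */

/* Emulates "reduction (+:v)": every node's copy ends up holding the sum
   of all copies. v[k] stands for the copy of the variable on node k+1. */
void reduce_sum(int v[NNODES])
{
    int total = 0;
    for (size_t k = 0; k < NNODES; k++)
        total += v[k];
    for (size_t k = 0; k < NNODES; k++)
        v[k] = total;
}

/* Emulates "reduction (max:v)" in the same way. */
void reduce_max(int v[NNODES])
{
    int best = v[0];
    for (size_t k = 1; k < NNODES; k++)
        if (v[k] > best)
            best = v[k];
    for (size_t k = 0; k < NNODES; k++)
        v[k] = best;
}
```

In a real XMP program the single line #pragma xmp reduction (+:v) replaces both loops, and the runtime performs the communication.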

bcast directive

The bcast construct performs broadcast communication from a specified node.

[F] !$xmp bcast ( variable [, variable]... ) [from nodes-ref | template-ref] [on nodes-ref | template-ref] [async ( async-id )]
[C] #pragma xmp bcast ( variable [, variable]... ) [from nodes-ref | template-ref] [on nodes-ref | template-ref] [async ( async-id )]

  • The from clause specifies the source node. If it is omitted, node p(1) is the source.
  • The on clause specifies the destination nodes. If it is omitted, the data are sent to all nodes.
  • The destination node set specified in the on clause must include the source node specified in the from clause.
  • Example 1 : The local variable a is broadcast to all nodes.

    #pragma xmp bcast (a)

  • Example 2 : The local variable a held by node p(5) is broadcast to nodes p(5) through p(7).

    #pragma xmp bcast (a) from p(5) on p(5:7)
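
The effect of the from and on clauses can be emulated sequentially in plain C. This is only a sketch under stated assumptions, not XMP code: NNODES and bcast_emulated are hypothetical names, and a[k] stands for the copy of a held by node p(k+1).

```c
#define NNODES 8  /* hypothetical node count */

/* Emulates "bcast (a) from p(src) on p(lo:hi)" with 1-based node indices.
   Returns 0 on success, -1 if the range is invalid or the source node is
   not a member of the destination node set. */
int bcast_emulated(int a[NNODES], int src, int lo, int hi)
{
    if (lo < 1 || hi > NNODES || lo > hi)
        return -1;
    if (src < lo || src > hi)
        return -1;
    for (int k = lo; k <= hi; k++)
        a[k - 1] = a[src - 1];   /* destination copy receives the source value */
    return 0;
}
```

Calling bcast_emulated(a, 5, 5, 7) mirrors Example 2: the copies on p(5), p(6) and p(7) end up equal to the value on p(5), and the other nodes are untouched.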

barrier directive

The barrier construct specifies an explicit barrier at the point at which the construct appears.

[F] !$xmp barrier [on nodes-ref | template-ref]
[C] #pragma xmp barrier [on nodes-ref | template-ref]

  • If the on clause is omitted, barrier synchronization occurs for all execution nodes.
  • Example 1 : Every node waits at the barrier until all nodes have finished func_a(); only then does any node call func_b().

    func_a();
    #pragma xmp barrier
    func_b();

  • Example 2 : Only the nodes that own indices 10 through 20 of template t take part in the barrier synchronization.

    #pragma xmp barrier on t(10:20)

shadow directive

The shadow directive allocates the shadow area for a distributed array.

[F] !$xmp shadow array-name ( shadow-width [, shadow-width]... )
[C] #pragma xmp shadow array-name ( shadow-width [, shadow-width]... )

where shadow-width must be one of:
   int-expr
   int-expr : int-expr
   *

  • shadow-width is the width of the shadow area, that is, the area a node uses to reference elements owned by neighboring nodes.
  • int-expr must have an integer value of 0 or larger.
  • int-expr defines a shadow area of the same width at both the lower and upper bounds of the dimension in question.
  • int-expr : int-expr defines shadow areas of different widths at the lower and upper bounds, in that order.
  • * declares that the entire array region is the shadow area.
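
As a rough illustration of what the shadow width means for storage, the sketch below computes how many elements one node would conceptually hold in a single block-distributed dimension. It is plain C, not XMP, and rests on assumptions: the helper names are hypothetical, the block size is taken to be the element count divided by the node count and rounded up, and an actual compiler may lay out memory differently.

```c
/* Number of elements one node owns in a block distribution of n elements
   over nnodes nodes (block size rounded up). */
int block_size(int n, int nnodes)
{
    return (n + nnodes - 1) / nnodes;
}

/* Conceptual local extent of one distributed dimension once a shadow of
   width lower:upper is added around the block a node owns. */
int local_extent_with_shadow(int n, int nnodes, int lower, int upper)
{
    return lower + block_size(n, nnodes) + upper;
}
```

For a 9-element array on 3 nodes with shadow width 1, each node owns 3 elements and local_extent_with_shadow(9, 3, 1, 1) gives 5.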

reflect directive

The reflect construct assigns the value of a reflection source to the corresponding shadow object.

[F] !$xmp reflect ( array-name [, array-name]... ) [width ( reflect-width [, reflect-width]... )] [async ( async-id )]
[C] #pragma xmp reflect ( array-name [, array-name]... ) [width ( reflect-width [, reflect-width]... )] [async ( async-id )]

  • Example :

    #pragma xmp template t(0:8)
    #pragma xmp nodes p(3)
    #pragma xmp distribute t(block) onto p   // block distribution is assumed here
    int a[9], b[9];
    #pragma xmp align a[i] with t(i)
    #pragma xmp align b[i] with t(i)
    #pragma xmp shadow a[1]
    
    #pragma xmp loop on t(i)
    for(i=0;i<9;i++)
       a[i] = init(i);     // initialize a[]
    
    #pragma xmp reflect (a)  // synchronize the shadow area
    
    #pragma xmp loop on t(i)
    for(i=1;i<8;i++)
       b[i] = a[i-1] + a[i] + a[i+1];

    The shadow directive adds a shadow area of width 1 at the lower and upper bounds of each node's part of array a[]. The gray area in the figure is the shadow element that is created.

    The reflect directive updates the shadow area with the current values held by the adjacent nodes, which generates communication between neighboring nodes.

  • When only m dimensions of an n-dimensional array are distributed (n > m), specify 0 as the shadow width of each non-distributed dimension to declare that it has no shadow area.

    #pragma xmp template t(0:9)
    #pragma xmp nodes p(*)
    #pragma xmp distribute t(block) onto p   // block distribution is assumed here
    int a[10][20][30];
    #pragma xmp align a[i][*][*] with t(i)
    #pragma xmp shadow a[1][0][0]
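
The data movement behind reflect in the one-dimensional example above (9 elements, 3 nodes, shadow width 1) can be emulated sequentially in plain C. This is a sketch, not XMP code, and it assumes a block distribution; NNODES, BLOCK, LOCAL and both function names are hypothetical. Each row of a is one node's local buffer: one lower shadow cell, three owned cells, one upper shadow cell.

```c
#define NNODES 3            /* matches "nodes p(3)" */
#define BLOCK  3            /* 9 elements in blocks of 3 (assumed block distribution) */
#define LOCAL  (BLOCK + 2)  /* owned cells plus one shadow cell on each side */

/* Emulates "reflect (a)": each node's shadow cells receive the boundary
   elements owned by its lower and upper neighbors. Owned cells sit at
   indices 1..BLOCK of each row. */
void reflect_emulated(int a[NNODES][LOCAL])
{
    for (int k = 0; k < NNODES; k++) {
        if (k > 0)
            a[k][0] = a[k - 1][BLOCK];      /* last owned cell of lower neighbor */
        if (k < NNODES - 1)
            a[k][LOCAL - 1] = a[k + 1][1];  /* first owned cell of upper neighbor */
    }
}

/* After the exchange, the three-point sum for owned cell j (1..BLOCK) of
   node k reads only node k's own buffer. */
int three_point_sum(int a[NNODES][LOCAL], int k, int j)
{
    return a[k][j - 1] + a[k][j] + a[k][j + 1];
}
```

The outermost shadow cells (lower cell of the first node, upper cell of the last node) have no neighbor and are left unchanged, which is why the stencil loop in the example runs from 1 to 7 only.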

gmove directive

The gmove construct allows an assignment statement, which may cause communication, to be executed possibly in parallel by the executing nodes.

[F] !$xmp gmove [in | out] [async ( async-id )]
[C] #pragma xmp gmove [in | out] [async ( async-id )]

If neither in nor out is specified, the data referenced on both the left and right sides must be located in the current node set. The right-side data are sent by the nodes that hold them and are received and assigned by the nodes that hold the left-side data.

  • A distributed array, a local array, or a local scalar variable can be assigned.
  • The assignment statement is limited to a simple assignment without any arithmetic operations.
  • In XMP/C, A[n:m] denotes m elements starting at A[n].
  • In XMP/Fortran, A(n:m) denotes elements n through m of array A.
  • If the right side is a local variable, it must have the same value on all nodes.
  • If the left side is a local variable, the same value is assigned on every node (this is equivalent to a broadcast operation).
  • If both the right and left sides are distributed arrays, all-to-all communication is performed. If the left side is a replicated array, the operation becomes a multicast.
  • If the right side is an element of a distributed array, the operation becomes a broadcast from the node that owns that element.
  • Example 1: Assignment using scalar variables (1)

    #pragma xmp gmove
      s1 = s2;             // s1 and s2 are scalar variables

  • Example 2: Assignment using scalar variables (2)

    #pragma xmp gmove
      a[3] = b[i][j];     // a and b are local arrays

  • Example 3: Assignment using distributed arrays (1)

    #pragma xmp gmove
      a[:] = b[:];     // a and b are distributed arrays. All array elements are assigned.

  • Example 4: Assignment using distributed arrays (2)

    #pragma xmp gmove
    a[1:9] = b[n:9];  // The number of elements on both sides must match

  • Example 5: Assignment using distributed arrays (3)

    #pragma xmp gmove
    a[1:10] = c;   // c is a scalar variable. All elements of a[1:10] are assigned the value of c.

  • Example 6: Using the in clause

    #pragma xmp nodes p(4)
    int a[4], b[4];
    #pragma xmp distribute t(block) onto p
    // (omitted)
    #pragma xmp task on p(1:2)
    #pragma xmp gmove in
       a[1:2] = b[2:2];

    If the in clause is specified, the assignment is performed after the node holding the left-side data acquires (gets) the corresponding right-side data by a remote copy operation. Accordingly, the data referenced on the left side must be allocated to the current node set.

  • If the out clause is specified, the assignment is performed by having the node holding the right-side data update (put) the corresponding left-side data by a remote copy operation. Accordingly, the data referenced on the right side must be allocated to the current node set.
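
The two section notations mentioned above are easy to confuse, so the following plain-C sketch spells out the XMP/C form (start, length) and its relation to the Fortran form (lower, upper). It emulates only the element-wise copy that an assignment such as a[1:9] = b[n:9] performs, ignoring distribution and communication. The function names and the explicit size arguments are assumptions introduced for illustration, and the two arrays are assumed not to overlap.

```c
#include <stddef.h>

/* Emulates "dst[dstart:len] = src[sstart:len]" for XMP/C sections, where
   x[start:len] denotes len elements beginning at x[start].
   Returns 0 on success, -1 if either section runs past its array. */
int section_assign(int *dst, size_t dsize, size_t dstart,
                   const int *src, size_t ssize, size_t sstart,
                   size_t len)
{
    if (dstart > dsize || len > dsize - dstart)
        return -1;
    if (sstart > ssize || len > ssize - sstart)
        return -1;
    for (size_t i = 0; i < len; i++)
        dst[dstart + i] = src[sstart + i];
    return 0;
}

/* Converts a Fortran-style inclusive range (lower:upper) to the element
   count used by the XMP/C (start:length) form. */
size_t fortran_range_length(size_t lower, size_t upper)
{
    return upper >= lower ? upper - lower + 1 : 0;
}
```

Because both sections share one length argument, the rule that the element counts on the two sides must match holds by construction; the Fortran section A(1:10) corresponds to a length of fortran_range_length(1, 10), which is 10 elements.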

Local-view model

A one-sided communication program can be coded as follows.

  • [C] XMP specification ver. 1.0 and earlier

    int tmp[N];
    #pragma xmp coarray tmp:[*]
    
    if(me == 1)   // me is a node number
      a[k:m] = tmp[j:m]:[2];
    
    #pragma xmp sync_memory

  • [C] XMP specification ver. 1.1 and later

    int tmp[N]:[*];
    int status;
    
    if(me == 1)   // me is a node number
      a[k:m] = tmp[j:m]:[2];
    
    xmp_sync_memory(&status);

    The array tmp is declared as a coarray. On node 1, the m elements starting at tmp[j] on node 2 are copied into the m elements starting at a[k]. sync_memory ensures that the preceding accesses to the coarray have completed.
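
The one-sided get in the examples above can be pictured with a sequential plain-C emulation in which every node's copy of the coarray is one row of a two-dimensional array. This is a sketch under stated assumptions, not XMP code: NNODES, N, coarray_t and coarray_get are hypothetical names, and real coarray accesses involve remote memory and the synchronization described above.

```c
#include <stddef.h>

#define NNODES 2   /* hypothetical node count */
#define N      8   /* hypothetical coarray length */

/* tmp[k] stands for the copy of coarray tmp held by node k+1. */
typedef int coarray_t[NNODES][N];

/* Emulates "a[k:m] = tmp[j:m]:[node]": m elements starting at tmp[j] on the
   1-based remote node are copied into a[k], a[k+1], ...
   Returns 0 on success, -1 on an out-of-range request. */
int coarray_get(int *a, size_t asize, size_t k,
                coarray_t tmp, size_t j, size_t m, int node)
{
    if (node < 1 || node > NNODES)
        return -1;
    if (k > asize || m > asize - k)
        return -1;
    if (j > N || m > N - j)
        return -1;
    for (size_t i = 0; i < m; i++)
        a[k + i] = tmp[node - 1][j + i];   /* read from the remote node's copy */
    return 0;
}
```

Only the node executing the statement takes an active part; the node that owns the data issues no matching call, which is what makes the communication one-sided.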