Language Specification

JAHPF (Japan Association for High Performance Fortran)

January 31, 1999

Version 1.0

English Version 1.0

November 11, 1999

High Performance Fortran) meeting from July 1996 to January 1997 and JAHPF meeting from January 1997 to January 1999. This document specifies extensions to the HPF 2.0 language specification, whose copyright belongs to Rice University. HPF 2.0 specifications contained in this document are reproduced under permission of Rice University.

Copyright of this document belongs to Fujitsu Limited, Hitachi, Ltd., and NEC Corporation. Permission to copy without fee all or part of this material is granted, provided that the copyright notices below appear, and notice is given that copying is by permission of Rice University and the three companies above.

© 1994, 1995, 1996, 1997 Rice University, Houston, Texas. Permission to copy without fee all or part of this material is granted, provided that the Rice University copyright notice and the title of this document appear, and notice is given that copying is by permission of Rice University.

© 1996, 1997, 1998, 1999, Fujitsu Limited, Hitachi, Ltd., and NEC Corporation, Tokyo, Japan.

1 Overview
2 Notation and Syntax
    2.1 Notation
    2.2 Syntax of Directives
3 HPF/JA Extension Related to Parallel Processing Specification
    3.1 Specification of REDUCTION Kind
        3.1.1 Syntax
        3.1.2 Semantics
        3.1.3 Constraints
        3.1.4 Examples
4 HPF/JA Extension for Communication Optimization
    4.1 Asynchronous Transfer Function
        4.1.1 ASYNCID declaration directive
        4.1.2 ASYNCHRONOUS directives
        4.1.3 NOBUFFER clause in ASYNCHRONOUS directive
        4.1.4 ASYNC prefix
        4.1.5 Notes on scoping unit
    4.2 Extension of SHADOW Directive
        4.2.1 Syntax
        4.2.2 Constraints
        4.2.3 Equivalence relation for extended SHADOW attributes
    4.3 Explicit Shadow
        4.3.1 Terminology
        4.3.2 Definition and reference of shadow object
        4.3.3 Defined and undefined states for shadow object
    4.4 REFLECT Directive
        4.4.1 Syntax
        4.4.2 Semantics
        4.4.3 Example
    4.5 Extension of HOME Clause in ON Directive
        4.5.1 Syntax
        4.5.2 Semantics
        4.5.3 Example
    4.6 LOCAL Clause and Directive
        4.6.2 Semantics
        4.6.3 Constraints
        4.6.4 Example
        4.6.5 Comparison of RESIDENT with LOCAL [Reference]
    4.7 Reusing Communication Schedule
        4.7.1 Syntax
        4.7.2 Semantics
        4.7.3 Constraints
        4.7.4 Example
5 Restriction and Modification for HPF2.0
    5.1 Restriction for HPF2.0
    5.2 Modification for HPF2.0
A Syntax Rules
    A.2 Notation and Syntax
        A.2.2 Syntax of Directives
    A.3 HPF/JA Extension Related to Parallel Processing Specification
        A.3.1 Specification of REDUCTION Kind
    A.4 HPF/JA Extension for Communication Optimization
        A.4.1 Asynchronous Transfer Function
        A.4.2 Extension of SHADOW Directive
        A.4.4 REFLECT Directive
        A.4.5 Extension of HOME Clause in ON Directive
        A.4.6 LOCAL Clause and Directive
        A.4.7 Reusing Communication Schedule
B Syntax Cross-reference
    B.1 Nonterminal Symbols That Are Defined
    B.2 Nonterminal Symbols That Are Not Defined
    B.3 Terminal Symbols

Chapter 1

Overview

This document specifies HPF/JA 1.0, the extended HPF language specification defined by the Japan Association for High Performance Fortran (JAHPF) so that High Performance Fortran (HPF) can be used more practically. HPF/JA 1.0 is designed as a set of extensions and modifications to the HPF language specification defined by the High Performance Fortran Forum (HPFF). The version of the HPF language specification used as a base is the HPF 2.0 language and its approved extensions (High Performance Fortran Language Specification version 2.0, Jan. 31, 1997), as of the present time (Jan. 1999).

The HPF/JA extension specifications serve the following two major purposes:

• Enhance the descriptive power for parallel processing of programs and enlarge the range of applications.

• Compensate for the insufficiency of compiler capability at the current stage by enabling the user to describe parallel processing and optimization in detail.

The extensions fall into the following two categories.

1. Enlargement of description capability for parallel processing

   • Specification of REDUCTION kind
     ... Enlarges the application range of the REDUCTION clause.

2. Optimization of communication

   • Asynchronous transfer
     ... Overlaps communication between processors with computation.

   • Extension of SHADOW directive
     ... Enables the user to select full-shadow allocation with fast access speed.

   • REFLECT directive
     ... Explicitly sets the value of the shadow area.

   • Extension of HOME clause in ON directive
     ... Enables the user to specify an active processor, taking into account a shadow area.

   • Extension of LOCAL clause or directive
     ... Specifies that communication is unnecessary for data access.

   • Reuse of communication schedule
     ... Efficiently processes communication repeated in the same pattern.


Chapter 2

Notation and Syntax

This chapter describes the notational conventions employed in this document and the syntax of HPF/JA directives.

2.1 Notation

This document uses the same notation as the HPF2.0 specification and the Fortran 95 standard, including particularly the syntax rules. The BNF descriptions of the language features are defined in the same style as the HPF specification and the Fortran standard. Each HPF/JA rule has an identification number of the form Jsnn to distinguish the HPF/JA syntax rules from the HPF and Fortran syntax rules. In Jsnn, s corresponds to a chapter number, and nn indicates a two-digit sequence number. Nonterminals not defined in this document are defined in the HPF2.0 specification or the Fortran standard. Rules with an identification number of the form Hsnn are defined in the HPF2.0 specification, and rules with an identification number of the form Rsnn are defined in the Fortran standard. Some technical terms, for example "mapping" and "storage unit", are defined in the HPF2.0 specification or the Fortran standard.

Some HPF/JA syntax rules extend, or are similar to, an HPF2.0 syntax rule. In this case, the name of the nonterminal symbol is suffixed by -ja. When a nonterminal symbol such as name or name-extended in the HPF approved extension specification is redefined, it is therefore referred to as name-ja, with the proviso that any reference to name or name-extended is to be replaced by name-ja in the rest of the syntax rules.

Rationale. Throughout this document, material explaining the rationale for including features, for choosing particular feature definitions, and for making other decisions, is set off in this format. Readers interested only in the language definition may wish to skip these sections, while readers interested in language design may want to read them more carefully. (End of rationale.)

Advice to users. Throughout this document, material that is primarily of interest to users (including most examples of syntax and interpretation) is set off in this format. Readers interested only in technical material may wish to skip these sections, while readers wanting a more tutorial approach may want to read them more carefully. (End of advice to users.)

Advice to implementors. Throughout this document, material that is primarily of interest to implementors is set off in this format. Readers interested only in the language definition may wish to skip these sections, while readers interested in compiler implementation may want to read them more carefully. (End of advice to implementors.)

2.2 Syntax of Directives

The HPF/JA directives are consistent with the HPF2.0 directives and Fortran syntax in the following sense: if any HPF/JA directive were to be adopted as part of a future HPF specification, the only change necessary to convert an HPF/JA program to an HPF program would be to replace the hpfja-directive-origin with !HPF$; and, if any HPF/JA directive were to be adopted as part of a future Fortran standard, the only change necessary to convert an HPF/JA program to a Fortran program would be to replace the hpfja-directive-origin with blanks.

HPF/JA directives have the following general formats:

J201 hpfja-directive-line   is hpfja-directive-origin hpf-directive

J202 hpfja-directive-origin is !HPFJ
                            or CHPFJ
                            or *HPFJ

To use the HPF/JA specification defined in the next and subsequent chapters, each directive must begin with hpfja-directive-origin. HPF2.0 directives not based on the HPF/JA specification may also begin with hpfja-directive-origin. HPF2.0 directives may begin with the directive-origin of HPF2.0.
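The origin rule J202 and the conversion described at the start of this section can be illustrated with a small text-processing sketch. It is written in Python only as an illustration and is not part of the specification; the function names are hypothetical, and treating the `C` and `*` forms as valid only in column 1 is an assumption borrowed from fixed-source-form comment rules.

```python
import re

# hpfja-directive-origin per rule J202: !HPFJ, CHPFJ or *HPFJ.
# Assumption: "!HPFJ" may be preceded by blanks (free form), while the
# "C" and "*" forms must start in column 1 (fixed form).
_ORIGIN = re.compile(r"^(?:(\s*)!HPFJ|()[C*]HPFJ)(?=\s|$)", re.IGNORECASE)


def is_hpfja_directive(line: str) -> bool:
    """Return True if the line begins with an hpfja-directive-origin."""
    return _ORIGIN.match(line) is not None


def to_hpf_directive(line: str) -> str:
    """Replace the hpfja-directive-origin with !HPF$, leaving the rest intact."""
    m = _ORIGIN.match(line)
    if m is None:
        return line  # not an HPF/JA directive line: unchanged
    indent = m.group(1) or ""
    return indent + "!HPF$" + line[m.end():]
```

For instance, `to_hpf_directive("!HPFJ REFLECT A")` yields `"!HPF$ REFLECT A"`, while ordinary Fortran statements and existing `!HPF$` lines pass through unchanged.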

Advice to users. When using a system that includes an HPF2.0 compiler but not an HPF/JA compiler, begin each HPF2.0 directive with !HPF$. At least the HPF2.0 directives then become valid when the HPF2.0 compiler is used, and the portability of the program is enhanced. When using only an HPF/JA compiler, use only !HPFJ. Users then need not check whether directives are included in the HPF2.0 specification.

HPF/JA directives are designed so that a program remains a correct HPF2.0 program even if they are ignored. (End of advice to users.)

The rules related to character case (uppercase and lowercase letters), line formats, blanks, and continuation lines conform to the HPF2.0 directive syntax. However, an HPF/JA directive line must not be continued into an HPF2.0 directive line.

In the next and subsequent chapters, some syntax rules are added to and deleted from specification-directive-extended (H206), executable-directive-extended (H207), and executable-construct-extended (H208) defined in the HPF2.0 specification. The updated rules are as follows:


J203 specification-directive-ja is processors-directive
                                or subset-directive
                                or align-directive
                                or distribute-directive
                                or inherit-directive
                                or template-directive
                                or combined-directive
                                or sequence-directive
                                or dynamic-directive
                                or shadow-directive
                                or asyncid-directive

J204 executable-directive-ja is independent-directive-ja
                             or realign-directive-ja
                             or redistribute-directive-ja
                             or on-directive
                             or resident-directive
                             or asynchronous-directive
                             or asyncwait-directive
                             or reflect-directive
                             or local-directive
                             or index-reuse-directive

J205 executable-construct-ja is action-stmt
                             or case-construct
                             or do-construct
                             or if-construct
                             or where-construct
                             or on-construct
                             or resident-construct
                             or task-region-construct
                             or asynchronous-construct
                             or local-construct


Chapter 3

HPF/JA Extension Related to Parallel Processing Specification

3.1 Specification of REDUCTION Kind

The purpose of this extension specification is to increase the flexibility of reduction description.

The REDUCTION clause of HPF2.0 does not explicitly indicate a reduction kind. Instead, the reference format of a reduction variable is restricted to the format of a reduction statement (reduction-stmt), so that the compiler can identify each reduction kind (Section 5.1.3 in the HPF2.0 specification). The flexibility of reduction description is therefore limited. In addition, the MAXLOC and MINLOC computations frequently used by application programs are not included in the HPF2.0 reduction description, so the user cannot specify INDEPENDENT for a loop including those computations.

This extension specification defines a reduction kind in a REDUCTION clause so that a reduction variable can be referenced in any format and an INDEPENDENT loop can be described including the MAXLOC and MINLOC computations.

3.1.1 Syntax

The syntax rules of the INDEPENDENT directive (H501 and H503 in the HPF specification, Section 5.1) are modified as follows:

J301 independent-directive-ja is INDEPENDENT [, new-clause ]
                                 [, reduction-clause-ja-list ]

J302 reduction-clause-ja is REDUCTION
                            ( [reduction-kind :] reduction-spec-list )

J303 reduction-kind is reduction-operator
                    or reduction-function
                    or maxmin-kind


J304 reduction-operator is +
                        or *
                        or .AND.
                        or .OR.
                        or .EQV.
                        or .NEQV.

J305 maxmin-kind is FIRSTMAX
                 or FIRSTMIN
                 or LASTMAX
                 or LASTMIN

J306 reduction-spec is reduction-variable [ / location-variable-list / ]

J307 location-variable is scalar-variable-name

reduction-function is defined in the HPF specification, Section 5.1.3.

The following constraints are added to Section 5.1 in the HPF specification:

Constraint: When reduction-kind is maxmin-kind, reduction-spec must have location-variable-list. When reduction-kind is not maxmin-kind or reduction-kind is omitted, reduction-spec must not have location-variable-list.

Constraint: When reduction-kind is maxmin-kind, reduction-variable in reduction-spec must be scalar-variable-name.

Constraint: The type of the variable specified in reduction-variable must be as follows for each reduction-kind value:

    Logical type for .AND., .OR., .EQV., and .NEQV.
    Integer type for IAND, IOR, and IEOR
    Numeric type for + and *
    Integer or real type for MAX, MIN, FIRSTMAX, FIRSTMIN, LASTMAX, and LASTMIN

Constraint: A reduction-variable specified in a reduction-clause without reduction-kind must be referenced, within the loop, in the reduction statement format defined in Section 5.1.3 of the HPF2.0 specification. (A reduction-variable specified in a reduction-clause with reduction-kind may be referenced in any format in a loop.)

In Section 5.1 of the HPF2.0 specification, the fifth constraint is modified as follows:

Constraint: A variable specified as reduction-variable or location-variable must not be specified two or more times in the same independent-directive. It also must not be specified in a new-clause or reduction-clause within the range of the succeeding do-stmt, forall-stmt, or forall-construct (that is, the loop body in the source program) to which the independent-directive applies.
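Rules J302 to J307 and the first constraint above can be exercised with a small checker. The following Python sketch is illustrative only: the function names are hypothetical, reduction variables are assumed to be simple names (array or structure references are not handled), and only the location-variable-list constraint is checked.

```python
import re

MAXMIN_KINDS = {"FIRSTMAX", "FIRSTMIN", "LASTMAX", "LASTMIN"}

# REDUCTION ( [reduction-kind :] reduction-spec-list )      (rule J302)
_CLAUSE = re.compile(
    r"^REDUCTION\s*\(\s*(?:(?P<kind>[^:()]+?)\s*:)?\s*(?P<specs>.+?)\s*\)$",
    re.IGNORECASE,
)
# reduction-variable [ / location-variable-list / ]         (rule J306)
_SPEC = re.compile(r"\s*(?P<var>[A-Za-z]\w*)\s*(?:/(?P<locs>[^/]*)/)?\s*(?:,|$)")


def parse_reduction_clause(text):
    """Return (kind or None, [(reduction_variable, [location_variables])])."""
    m = _CLAUSE.match(text.strip())
    if m is None:
        raise ValueError("not a REDUCTION clause: %r" % text)
    kind = m.group("kind")
    kind = kind.strip().upper() if kind else None
    body, pos, specs = m.group("specs"), 0, []
    while pos < len(body):
        s = _SPEC.match(body, pos)
        if s is None:
            raise ValueError("bad reduction-spec in %r" % text)
        locs = s.group("locs")
        names = [n.strip() for n in locs.split(",") if n.strip()] if locs else []
        specs.append((s.group("var"), names))
        pos = s.end()
    return kind, specs


def location_constraint_ok(kind, specs):
    """Location lists are required for maxmin-kind and forbidden otherwise."""
    needs_locations = kind in MAXMIN_KINDS
    return all(bool(locs) == needs_locations for _, locs in specs)
```

With this sketch, `REDUCTION(FIRSTMAX:AMAX/ILOC,JLOC/)` and `REDUCTION(+:S1,S2)` pass the check, whereas `REDUCTION(MIN:AMIN/ILOC/)` and `REDUCTION(FIRSTMIN:AMIN)` are rejected.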


3.1.2 Semantics

The INDEPENDENT directive asserts that the iterations of a DO loop do not mutually interfere (Section 5.1 in the HPF specification). The conditions for this interference are relaxed for a REDUCTION clause with reduction kind as follows:

• The second exception in the first interference condition is modified as follows. The modified content is underlined.

  - Exception: If a variable appears in a REDUCTION clause without reduction kind, then assignments to it by reduction statements in the range of the DO loop do not interfere with assignments to it by other reduction statements in the same loop. The reason for this is explained in Section 5.1.3.

  The following exception is added:

  - Exception: If a variable appears as a reduction variable or a location variable in a REDUCTION clause with reduction kind, then assignments to it do not interfere with assignments to it in a different iteration of the DO loop. The DO loop must, however, compose a reduction computation with the reduction kind and the reduction variable corresponding to the variable.

• The second exception in the second interference condition is modified as follows. The modified content is underlined.

  - Exception: If a variable appears in a REDUCTION clause without reduction kind, then assignments to it by reduction statements in the range of the DO loop do not interfere with the allowed uses of it by reduction statements in the same loop. The reason for this is explained in Section 5.1.3.

  The following exception is added:

  - Exception: If a variable appears as a reduction variable or a location variable in a REDUCTION clause with reduction kind, then assignments to it do not interfere with uses of it in a different iteration of the DO loop. The DO loop must, however, compose a reduction computation with the reduction kind and the reduction variable corresponding to the variable.

Composing a reduction computation is defined as shown below. Consider a certain iteration of a DO loop as one block. Let the value of variable X at the entry of the iteration be $X_{in}$ and let the value at the exit be $X_{out}$. $X_{in}$ is a virtual value in the sense that it does not matter whether X can actually take the value or not. Depending on the value of $X_{in}$, $X_{out}$ may not be defined.

• If there exist an associative operation f and a value c not depending on $R_{in}$ such that the following formula holds for any value of $R_{in}$, the iteration in a DO loop composes a reduction computation for a variable R:

    $R_{out} = f(R_{in}, c)$

  The value c is called a reduction element; f must be one of the operations defined in Table 3.1.


Reduction kind    f(x, y)
+                 x + y
*                 x * y
.AND.             x .AND. y
.OR.              x .OR. y
.EQV.             x .EQV. y
.NEQV.            x .NEQV. y
MAX               MAX(x, y)
MIN               MIN(x, y)
IAND              IAND(x, y)
IOR               IOR(x, y)
IEOR              IEOR(x, y)
FIRSTMAX          MAX(x, y)
FIRSTMIN          MIN(x, y)
LASTMAX           MAX(x, y)
LASTMIN           MIN(x, y)

Table 3.1: Computation for each reduction kind

• If all iterations in a DO loop compose a reduction computation with the same operation for a variable R, the DO loop composes a reduction computation for the variable R.

If the reduction kind is "+" or "*", a computation error may occur depending on the computation sequence on an actual computer, and the value of c may become unstable depending on the value of $R_{in}$. Taking this into account, when the reduction kind is "+" or "*" and there is a sequence of values $c_1, c_2, \cdots, c_n$ $(n \ge 0)$ not depending on $R_{in}$ that satisfies the formula below instead of the formula above, a reduction computation is assumed to be composed.

    $R_{out} = f(\cdots f(f(R_{in}, c_1), c_2) \cdots, c_n)$
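The definition above can be made concrete with a small executable model. The Python sketch below is only an illustration under stated assumptions: Python operators stand in for the Fortran operations of Table 3.1, the function names are hypothetical, and `composes_reduction` is a spot check over sampled values of $R_{in}$, not a proof.

```python
import operator
from functools import reduce

# f(x, y) for each reduction kind, transcribed from Table 3.1.
# Python bool/int operators stand in for the Fortran logical and bit intrinsics.
REDUCTION_OPS = {
    "+": operator.add,
    "*": operator.mul,
    ".AND.": lambda x, y: x and y,
    ".OR.": lambda x, y: x or y,
    ".EQV.": lambda x, y: x == y,
    ".NEQV.": lambda x, y: x != y,
    "MAX": max,
    "MIN": min,
    "IAND": operator.and_,
    "IOR": operator.or_,
    "IEOR": operator.xor,
    "FIRSTMAX": max,
    "FIRSTMIN": min,
    "LASTMAX": max,
    "LASTMIN": min,
}


def apply_elements(kind, r_in, elements):
    """Compute f(...f(f(r_in, c1), c2)..., cn); n may be zero."""
    return reduce(REDUCTION_OPS[kind], elements, r_in)


def composes_reduction(kind, iteration, element, samples):
    """Spot-check that R_out == f(R_in, c) holds for every sampled R_in."""
    f = REDUCTION_OPS[kind]
    return all(iteration(r) == f(r, element) for r in samples)
```

For example, an iteration body that raises its argument to at least 5 composes a MAX reduction with reduction element 5, whereas a body that doubles its argument does not compose a "+" reduction, because the amount added would depend on $R_{in}$.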

3.1.3 Constraints

In this section, a reduction variable for reduction computation is written as R and the location variables as $L_1, \cdots, L_m$ $(m \ge 0)$. If the value and the status of a variable at the end of a certain iteration of a DO loop do not depend on the values of $R_{in}, L_{1\,in}, \cdots, L_{m\,in}$, the value and the status are defined to be invariant for the reduction computation in the iteration. If a value and a status are invariant for the reduction computation in all iterations of a DO loop, they are defined to be invariant for the reduction computation in the DO loop.

1. A DO loop specified by INDEPENDENT having a REDUCTION clause with reduction kind must compose a reduction computation for the reduction variable. The correspondence between the reduction kind and the reduction computation must conform to the combinations in Table 3.1.


2. When the reduction kind is maxmin-kind, all iterations in the DO loop must satisfy the following conditions for all location variables $L_k$ corresponding to reduction variable R:

   • When $R_{in}$ is within the R update range, $L_{k\,out}$ must be defined, and its value must be invariant for the reduction computation.

   • When $R_{in}$ is within the R non-update range, $L_{k\,out}$ must be undefined if $L_{k\,in}$ is undefined, and must have the same value as $L_{k\,in}$ if $L_{k\,in}$ is defined.

   The following table lists the R update range and R non-update range for each reduction kind. In this table, $c_i$ indicates the reduction element for iteration i.

   Reduction kind    R update range      R non-update range
   FIRSTMAX          $R_{in} < c_i$      $R_{in} \ge c_i$
   FIRSTMIN          $R_{in} > c_i$      $R_{in} \le c_i$
   LASTMAX           $R_{in} \le c_i$    $R_{in} > c_i$
   LASTMIN           $R_{in} \ge c_i$    $R_{in} < c_i$

3. The values and attributes of all data objects (excluding reduction, location, and NEW variables) and the file and unit status (presence or absence, contents of records, file position, and other characteristics inquired by the INQUIRE statement) must be invariant for all reduction computations composed by the DO loop.

4. A reduction variable specified in a REDUCTION clause with reduction kind must be invariant for all reduction computations excluding those composed by the variable itself.

5. A location variable specified in a REDUCTION clause with reduction kind must be invariant for all reduction computations excluding those composed by the reduction variable specified by the same reduction-spec.

Rationale. Basis of constraint 1

A reduction variable without reduction kind can be accessed only through a reduction statement (Section 5.1.3 in the HPF2.0 specification). A reduction variable with reduction kind can be accessed freely; however, the loop as a whole must compose a reduction computation. REDUCTION without reduction kind is restricted in its syntax; REDUCTION with reduction kind is restricted in the semantics of the computation. For example, a sum requires that the loop perform repeated addition, regardless of how it is written. (End of rationale.)

Advice to users. Program errors related to the combination of reduction kind and reduction computation cannot be completely detected by a compiler. The user must program appropriately, understanding the semantics of reduction computation. (End of advice to users.)

Example. Example of constraint 1

      DO I=1,100
        X = X+A(I)
        IF (I.EQ.3) X = X+B
      END DO


When the reduction kind is "+", the reduction element c for each iteration is obtained by the following formulas regardless of the value of X.

• At $I \ne 3$, $X_{out} = X_{in} + A(I)$.
  That is, $c = A(I)$.

• At $I = 3$, $X_{out} = X_{in} + A(3) + B$.
  That is, $c = A(3) + B$.
  (More strictly, $c_1 = A(3)$ and $c_2 = B$, taking into account an error depending on the computation sequence.)

Therefore, this DO loop satisfies constraint 1 for specifying the REDUCTION clause

      REDUCTION(+:X)

Example. Example of constraint 2.

!HPFJ INDEPENDENT, REDUCTION(FIRSTMAX:AMAX/ILOC/)
      DO I=1,N
        IF (AMAX.LT.A(I)) THEN
          AMAX = A(I)
          ILOC = I
        END IF
      END DO

When the value of $AMAX_{in}$ is changed, $AMAX_{out}$ and $ILOC_{out}$ change as shown below, considering the program content.

$AMAX_{in}$    $AMAX_{out}$           $ILOC_{out}$
-HUGE          A(I)                   I
...            A(I)                   I
A(I)           A(I) = $AMAX_{in}$     $ILOC_{in}$
...            $AMAX_{in}$            $ILOC_{in}$
HUGE           $AMAX_{in}$            $ILOC_{in}$

From this table, reduction element c is assumed to be A(I). When $AMAX_{in} < A(I)$, $ILOC_{out} = I$ holds. When $AMAX_{in} \ge A(I)$, $ILOC_{out} = ILOC_{in}$ holds. Taking these results into account, constraint 2 is satisfied.

If the condition of the IF statement changes to (AMAX.LE.A(I)), the table changes as follows:

$AMAX_{in}$    $AMAX_{out}$           $ILOC_{out}$
-HUGE          A(I)                   I
...            A(I)                   I
A(I)           A(I) = $AMAX_{in}$     I
...            $AMAX_{in}$            $ILOC_{in}$
HUGE           $AMAX_{in}$            $ILOC_{in}$


In this case, when $AMAX_{in}$ is equal to A(I), $ILOC_{out}$ is not necessarily equal to $ILOC_{in}$. As a result, constraint 2 is not satisfied.
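The update and non-update ranges of constraint 2 can be simulated directly. The following Python sketch is a model written for illustration, not part of the specification: the names are hypothetical, a single location variable is assumed, and `None` stands for an undefined location variable.

```python
import operator

# Update-range test "R_in <op> c_i" for each maxmin-kind, as tabulated in
# constraint 2 of Section 3.1.3.
UPDATE_RANGE = {
    "FIRSTMAX": operator.lt,
    "FIRSTMIN": operator.gt,
    "LASTMAX": operator.le,
    "LASTMIN": operator.ge,
}


def maxmin_step(kind, r_in, loc_in, c, loc):
    """One iteration: return (R_out, L_out) for reduction element c found at loc."""
    if UPDATE_RANGE[kind](r_in, c):
        return c, loc        # update range: the location becomes defined
    return r_in, loc_in      # non-update range: the location is carried over


def maxmin_reduce(kind, values, r_init):
    """Run the loop sequentially over 1-based positions of values."""
    r, loc = r_init, None    # None models an undefined location variable
    for i, c in enumerate(values, start=1):
        r, loc = maxmin_step(kind, r, loc, c, i)
    return r, loc
```

With the values 3, 7, 7, 2, the FIRSTMAX rule reports position 2 and the LASTMAX rule reports position 3. A loop whose test is written with .LE. therefore follows the LASTMAX update range, which is why it fails constraint 2 when declared as FIRSTMAX.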

3.1.4 Examples

!HPFJ INDEPENDENT, REDUCTION(MIN:AMIN), REDUCTION(+:S1,S2), NEW(TMP)
      DO I = 1,N
        IF(A(I).LT.AMIN) AMIN=A(I)
        TMP = S1+B(I)
        S1 = TMP+C(I)
        S2 = ADD(S2,D(I))
      END DO

ADD(x,y) is assumed to be a user-defined function that only computes x + y.

An example using FIRSTMAX follows:

!HPFJ INDEPENDENT, NEW(I), REDUCTION(FIRSTMAX:AMAX/ILOC,JLOC/)
      DO J=1,N
        DO I=1,M
          IF(AMAX.LT.A(I,J)) THEN
            AMAX=A(I,J)
            ILOC=I
            JLOC=J
          END IF
        END DO
      END DO
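To show why the FIRSTMAX example can run as an INDEPENDENT loop, the Python sketch below models one possible execution strategy: the J range is split into blocks, each block computes a partial result, and the partial results are merged in block order. This strategy, the function names, and the choice of minus infinity as the initial AMAX are all assumptions made for illustration; the specification does not prescribe how a compiler implements the reduction.

```python
from functools import reduce


def firstmax_block(a, j_range, m):
    """Sequential FIRSTMAX over the columns in j_range; a[i-1][j-1] is A(I,J)."""
    amax, iloc, jloc = float("-inf"), None, None
    for j in j_range:
        for i in range(1, m + 1):
            if amax < a[i - 1][j - 1]:          # same test as AMAX.LT.A(I,J)
                amax, iloc, jloc = a[i - 1][j - 1], i, j
    return amax, iloc, jloc


def combine_firstmax(left, right):
    """Merge two partial results; on ties the earlier block (left) wins."""
    return right if left[0] < right[0] else left


def firstmax_partitioned(a, m, n, nblocks):
    """Split J = 1..n into nblocks contiguous blocks and merge in block order."""
    bounds = [1 + (n * k) // nblocks for k in range(nblocks + 1)]
    partials = [
        firstmax_block(a, range(bounds[k], bounds[k + 1]), m)
        for k in range(nblocks)
    ]
    return reduce(combine_firstmax, partials)
```

Because ties are resolved in favour of the earlier block, the partitioned result matches the sequential loop for any number of blocks, including the location variables ILOC and JLOC.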


Chapter 4

HPF/JA Extension for Communication Optimization

4.1 Asynchronous Transfer Function

The asynchronous transfer function performs data transfer between processors in parallel with the execution of other executable statements.

This function is defined by a combination of an executable directive that starts the data transfer and one that waits for its end. The two directives are associated with each other by using the same ID.

4.1.1 ASYNCID declaration directive

The ASYNCID declaration directive declares IDs, each of which associates the statement that starts an asynchronous transfer with the statement that waits for its end.

4.1.1.1 Syntax

Add asyncid-directive to specification-directive-extended (H206).

J401 asyncid-directive is ASYNCID async-id-list

J402 async-id is async-id-name

Add ASYNCID and SAVE to combined-attribute-extended (H801).

Constraint: When SAVE is specified in combined-directive, ASYNCID must also be specified.

Example.

      ASYNCID ID1,ID2
      ASYNCID :: X
      ASYNCID,SAVE :: S,T,U


4.1.1.2 Semantics

ASYNCID directive  Declares that async-id is an asynchronous identifier. An asynchronous identifier must be declared with this directive before it is used.

The asynchronous identifier has the following features:

• It is a local entity belonging to class (1) (see Section 14.1.2 in the Fortran standard). Therefore, its name is valid only in a scoping unit (procedure and so on), and it must not be the same as another local entity [1] belonging to class (1) in the same scoping unit.

• It can be associated across scoping units by using use association (declared in a module and referenced in multiple scoping units) or host association (declared in a host procedure and shared with the procedures it hosts).

• The asynchronous identifier has either the enabled state or the disabled state. The initial state is the disabled state. When an asynchronous identifier is referred to in the ASYNCHRONOUS directive (Section 4.1.2), it is placed in the enabled state. When an asynchronous identifier is referred to in the ASYNCWAIT directive (Section 4.1.2), it is placed in the disabled state again.

• The asynchronous identifier can have a SAVE attribute. An asynchronous identifier having the SAVE attribute holds its association, allocation, and enabled/disabled states after the RETURN or END statement is executed.

SAVE  Declares that an asynchronous identifier has a SAVE attribute.
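The state rules above amount to a two-state machine per identifier. The Python sketch below models it for illustration only; the class and function names are hypothetical, and the handling of identifiers without SAVE at procedure exit (they are simply dropped) is an assumption, since this section only states what SAVE preserves.

```python
class AsyncId:
    """Model of an asynchronous identifier declared by ASYNCID."""

    def __init__(self, name, save=False):
        self.name = name
        self.save = save         # True when declared with ASYNCID,SAVE
        self.enabled = False     # the initial state is the disabled state

    def asynchronous(self):
        """Referenced in an ASYNCHRONOUS directive: enter the enabled state."""
        self.enabled = True

    def asyncwait(self):
        """Referenced in an ASYNCWAIT directive: return to the disabled state."""
        self.enabled = False


def leave_scoping_unit(ids):
    """Model RETURN/END: only identifiers with SAVE keep their state.

    Dropping the others is an assumption of this sketch.
    """
    return {name: aid for name, aid in ids.items() if aid.save}
```

An identifier declared with SAVE that is still enabled at RETURN stays enabled, which allows a transfer started in one invocation to be awaited in a later one.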

Rationale. Reason why the asynchronous identifier is regarded as a new local entity

There is a proposal in which the asynchronous identifier is assumed to be an integer-type variable (the name has a meaning) or an integer expression (the value has a meaning). However, for the following reasons the asynchronous identifier is assumed to be a new local entity.

• Clear syntax

  - The user and the language processor can recognize clearly that the name is used as an asynchronous identifier.
    Program readability is improved. The language processor has more opportunities to detect an error. Optimization in the language processor is promoted.

  - The asynchronous identifier is defined only in directives.
    The program can be modified by adding directives (without changing Fortran statements). If a Fortran variable or expression is used as an identifier, variables used only for that declaration may end up being defined when the program is compiled with sequential interpretation.

• The language processor can be implemented naturally and easily.
  The runtime data structure can be prepared statically, using the declaration of an identifier as a trigger. If the value of a variable or expression is used as an identifier, a wasteful structure may be generated, and implementation may be difficult depending on the architecture. (For example, when the basic integer type is 32 bits and an address space is 64 bits in length, the value of the basic integer type is too small to save address data.)

(End of rationale.)

[1] Includes a named variable, statement function, built-in (intrinsic) procedure, and so on. In addition, a processor

4.1.1.3 Example

See the exampleshown inSection 4.1.2.3. An example requiring the SAVE declaration is

showninSection4.1.5.

4.1.2 ASYNCHRONOUS directives

Consistingof a simpledirectiveand sp eci cation syntax, theASYNCHRONOUSdirective

sp eci esthestartofasynchronoustransfer. The ASYNCWAITdirectiveisusedtowaitfor

thecompletion.

4.1.2.1 Syntax

Addasynchronous-directiveandasyncwait-directivetoexecutable-directive-extended(H207).

Also addasynchronous-constructtoexecutable-construct-extended(H208).

Simple ASYNCHRONOUS directive

J403 asynchronous-directive is ASYNCHRONOUS asynchronous-stuff

J404 asynchronous-stuff is ( [ ID = ] async-id ) [, nobuffer-clause ]

Example.

ASYNCHRONOUS (ID=ID1)

ASYNCHRONOUS(ZZ)

ASYNCHRONOUS directive construct

J405 asynchronous-construct is

hpfja-directive-origin block-asynchronous-directive

block

hpfja-directive-origin end-asynchronous-directive

J406 block-asynchronous-directive is ASYNCHRONOUS asynchronous-stuff BEGIN

J407 end-asynchronous-directive is END ASYNCHRONOUS

Example.

!HPFJ ASYNCHRONOUS(ID1) BEGIN

A(:)=B(1:100)

FORALL(I=1:M,J=1:N) S(I,J)=T(J,I)

!HPFJ END ASYNCHRONOUS

ASYNCWAIT directive

J408 asyncwait-directive is ASYNCWAIT ( [ID =] async-id )

Example.

ASYNCWAIT(ID=ID1)

4.1.2.2 Semantics

The following executable statements and directives are called asynchronously executable statements.

(1) Built-in assignment statement
(2) Simple FORALL statement whose body is a built-in assignment statement
(3) REDISTRIBUTE directive
(4) REALIGN directive
(5) REFLECT directive (HPF/JA extension)

For details of nobuffer-clause, see Section 4.1.3.

Simple ASYNCHRONOUS directive  Instructs the system that, for the immediately succeeding asynchronously executable statement, it is possible to start the subsequent processing without waiting for the completion of data transfer resulting from execution of the statement.

The transfer identifier async-id is placed into the enabled state by executing the simple ASYNCHRONOUS directive.

ASYNCHRONOUS directive construct  Instructs the system that, for all asynchronously executable statements included in block, it is possible to start the subsequent processing without waiting for the completion of data transfer resulting from the execution of the statements.

The transfer identifier async-id is placed into the enabled state by executing the ASYNCHRONOUS construct.

The ASYNCHRONOUS directive construct is equivalent to a representation using multiple simple ASYNCHRONOUS directives.


!HPFJ ASYNCHRONOUS(ID=id) BEGIN
      statement-1
      statement-2
      ...
      statement-n
!HPFJ END ASYNCHRONOUS
      ...
!HPFJ ASYNCWAIT(ID=id)

⇔

!HPFJ ASYNCHRONOUS(ID=id-1)
      statement-1
!HPFJ ASYNCHRONOUS(ID=id-2)
      statement-2
      ...
!HPFJ ASYNCHRONOUS(ID=id-n)
      statement-n
      ...
!HPFJ ASYNCWAIT(ID=id-1)
!HPFJ ASYNCWAIT(ID=id-2)
      ...
!HPFJ ASYNCWAIT(ID=id-n)

ASYNCWAIT directive  Instructs the system to wait for the completion of asynchronous transfer started by the simple ASYNCHRONOUS directive or ASYNCHRONOUS directive construct having the same async-id.

The transfer identifier async-id is placed into the disabled state by executing the ASYNCWAIT directive.

4.1.2.3 Example

REAL A(N),S(M,N),T(N,M)

!HPFJ ASYNCID ID1,ID2 ! async-id

!HPF$ DISTRIBUTE A(BLOCK)

!HPF$ DISTRIBUTE (*,BLOCK) :: S,T

...

!HPFJ ASYNCHRONOUS (ID=ID1)

FORALL(I=1:M) T(:,I)=A(:)*10.0 ! Start of transfer to T

... ! T non-access processing

!HPFJ ASYNCWAIT (ID=ID1) ! End of transfer to T

!HPFJ ASYNCHRONOUS (ID2) BEGIN

!HPF$ REDISTRIBUTE A(BLOCK)

FORALL(I=1:M,J=1:N) S(I,J)=T(J,I)

!HPFJ END ASYNCHRONOUS ! Start of transfer to A and S

... ! A and S non-access processing

!HPFJ ASYNCWAIT(ID2) ! End of transfer to A and S

...

4.1.2.4 Constraints

Basic constraints

1. The executable statements and directives to be processed by the simple ASYNCHRONOUS directive and ASYNCHRONOUS directive construct must be asynchronously executable statements (see Section 4.1.2.2).

2. At execution of the simple ASYNCHRONOUS directive or ASYNCHRONOUS directive [...] execution of the ASYNCWAIT directive, the transfer identifier must be in the enabled state.
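The enabled and disabled states of a transfer identifier can be pictured as a small state machine. The following is a minimal sketch in plain Fortran, not a description of any actual HPF/JA runtime: the module and procedure names are hypothetical, a single identifier is modelled, and the initial disabled state is an assumption of the sketch.

```fortran
module async_id_model
  implicit none
  logical, save :: enabled = .false.     ! state of one async-id (assumed to start disabled)
contains
  subroutine start_transfer()            ! stands in for an ASYNCHRONOUS directive
    enabled = .true.                     ! the identifier enters the enabled state
  end subroutine start_transfer

  logical function wait_is_legal()       ! may ASYNCWAIT be executed now?
    wait_is_legal = enabled
  end function wait_is_legal

  subroutine wait_transfer()             ! stands in for an ASYNCWAIT directive
    if (.not. enabled) error stop 'ASYNCWAIT on a disabled async-id'
    enabled = .false.                    ! the identifier returns to the disabled state
  end subroutine wait_transfer
end module async_id_model

program demo
  use async_id_model
  implicit none
  print *, wait_is_legal()    ! F: nothing has been started yet
  call start_transfer()
  print *, wait_is_legal()    ! T: a transfer is outstanding
  call wait_transfer()
  print *, wait_is_legal()    ! F: the transfer has been waited for
end program demo
```

In this model a second ASYNCWAIT without an intervening ASYNCHRONOUS would stop the program, which mirrors the requirement that the identifier be in the enabled state when ASYNCWAIT is executed.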

Constraints related to object variable  With respect to the executable statements and directives to be processed by the ASYNCHRONOUS directive, the following variables are called asynchronous objects.

Object statement                 Asynchronous object
-------------------------------  ---------------------------------------------------------
Built-in assignment statement    Left-hand side
Simple FORALL statement          Left-hand side of assignment statements in the body
REDISTRIBUTE directive           Distributee and all data objects ultimately aligned to it
REALIGN directive                Alignee
REFLECT directive                reflect-object

1. The statements executed in the period from the execution of the ASYNCHRONOUS directive to the execution of the corresponding ASYNCWAIT directive must not reference an asynchronous object. However, the following references are allowed:

   • Reference for inquiring the attributes (type, shape, allocation state, and so on) of an asynchronous object

   • Reference for referencing mapping (reference in the HOME clause of the ON directive, and so on). This is not allowed, however, for the asynchronous objects of the REDISTRIBUTE and REALIGN directives.

Example.

REAL A(M,N),B(M,N)

!HPFJ ASYNCID :: ID1

!HPF$ DISTRIBUTE B(BLOCK,*)

!HPF$ ALIGN A(:,:) WITH B(:,:)

!HPF$ DYNAMIC A,B

...

!HPFJ ASYNCHRONOUS(ID=ID1)

!HPF$ REDISTRIBUTE B(*,BLOCK) ! The asynchronous variables are A

... ! and B.

C(I)=A(I) ! Prohibited: A cannot be referenced.

B=D+E ! Prohibited: B cannot be defined.

CALL SUB(B) ! Prohibited: B cannot be referenced

! as an actual parameter.

DEALLOCATE(A) ! Prohibited: A cannot be placed in

! the undefined state.

!HPF$ REALIGN A(:,:) WITH T(:,:) ! Prohibited: A cannot be realigned.

...

!HPFJ ASYNCWAIT(ID=ID1)


Rationale. Reason why reference as an actual argument is prohibited:

This is because the value of an actual argument may be overwritten by the value of a dummy argument after it is transferred by the asynchronous transfer. The compiler may perform the argument passing by value association (the method of copying an actual argument to a local variable at the entry of a subprogram and returning the local variable to the original actual argument). The compiler may also perform automatic redistribution at the entry and exit of a subprogram. (End of rationale.)

2. In the ASYNCHRONOUS directive construct, a variable defined as an asynchronous object must not be referenced again after the asynchronously executable statement.

Example. The underlined variables are asynchronous objects.

!HPFJ ASYNCHRONOUS(ID=ND) BEGIN

A(1:N)=B(1:N)

C(:)=A(:)+D(:) !(a) Not allowed.

P(:)=D(:) !(b) Allowed.

!HPF$ REALIGN B(:) WITH T(:) !(c) Allowed.

A(N+1:NN)=E(N+1:NN) !(d) Allowed.

FORALL(I=1:9) G(I+1)=G(I) !(e) Allowed.

!HPFJ END ASYNCHRONOUS

(a) Since the array section of A is an asynchronous object, it cannot be referenced.

(b) D is referenced multiple times; this is allowed since D is not an asynchronous object.

(c) Since B is first defined as an asynchronous object here, it is allowed.

(d) The array section of A is an asynchronous object, but it does not overlap the earlier one. It is therefore allowed.

(e) An asynchronous object can be referenced within the same statement.
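Cases (a) and (d) above come down to whether two array sections of A share any element. A minimal sketch of that test for unit-stride sections follows; it is plain Fortran written for illustration, the function name overlaps and the values chosen for N and NN are hypothetical, and A is assumed to be declared as A(NN) so that A(:) means A(1:NN).

```fortran
program section_overlap
  implicit none
  integer, parameter :: n = 10, nn = 20   ! hypothetical values of N and NN

  ! Case (a): A(1:N) is an asynchronous object and A(:) = A(1:NN) is referenced.
  print *, 'case (a) overlaps:', overlaps(1, n, 1, nn)       ! T, so not allowed

  ! Case (d): A(1:N) and A(N+1:NN) are both asynchronous objects.
  print *, 'case (d) overlaps:', overlaps(1, n, n + 1, nn)   ! F, so allowed
contains
  ! Two unit-stride sections lo1:hi1 and lo2:hi2 share an element
  ! exactly when the larger lower bound does not exceed the smaller upper bound.
  pure logical function overlaps(lo1, hi1, lo2, hi2)
    integer, intent(in) :: lo1, hi1, lo2, hi2
    overlaps = max(lo1, lo2) <= min(hi1, hi2)
  end function overlaps
end program section_overlap
```

Sections with strides or vector subscripts would need a finer test; the sketch covers only the contiguous sections used in the example.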

Constraints of asynchronous realignment  In the case of a REALIGN directive, the constraints are as follows:

1. The following processing must not be performed directly or indirectly for the ultimate align target of the REALIGN directive (variable C2 in the example below) within the period from the start of asynchronous transfer by the ASYNCHRONOUS directive to the execution of the corresponding ASYNCWAIT directive.

   • Deallocation and making allocation undefined

   • Reference as an actual argument for a procedure call

   • Remapping (including asynchronous remapping)

Example.

!HPFJ ASYNCID ID1 ! async-id

REAL A(100,200)

REAL B1(100,200),C1(100)
!HPF$ DISTRIBUTE C1(BLOCK)

!HPF$ ALIGN B1(:,*) WITH C1(:)

REAL B2(100,200),C2(200)

!HPF$ DISTRIBUTE C2(BLOCK)

!HPF$ ALIGN B2(*,:) WITH C2(:)

!HPF$ ALIGN A(:,:) WITH B1(:,:) ! A is first aligned to B1.

!HPF$ DYNAMIC A,B1,B2,C1,C2

...

!HPFJ ASYNCHRONOUS(ID=ID1)

!HPF$ REALIGN A(:,:) WITH B2(:,:) ! Start of asynchronous

... ! realignment of A

!HPF$ REDISTRIBUTE C2(BLOCK,:) ! Prohibited.

CALL FOO(C2) ! Prohibited.

DEALLOCATE(C2) ! Prohibited.

...

!HPFJ ASYNCWAIT(ID=ID1) ! End of asynchronous

... ! realignment of A

Constraint of active processor  The ASYNCHRONOUS directive and the corresponding ASYNCWAIT directive must be executed on the same set of active processors.

Example.

!HPF$ ON (P(1:4)) BEGIN ! The active processors are P(1:4).

!HPFJ ASYNCHRONOUS(ID=ID1)

...

!HPFJ ASYNCHRONOUS(ID=ID2)

...

!HPFJ ASYNCHRONOUS(ID=ID3)

...

!HPF$ END ON

!

!HPFJ ASYNCWAIT(ID=ID1) ! Not allowed. All processors are active.

!HPF$ ON (P(5:8)) BEGIN

!HPFJ ASYNCWAIT(ID=ID2) ! Not allowed. The active processors are P(5:8).

!HPF$ END ON

!HPF$ ON (P(1:4)) BEGIN

!HPFJ ASYNCWAIT(ID=ID3) ! Allowed. The active processors are P(1:4).

!HPF$ END ON

4.1.3 NOBUFFER clause in ASYNCHRONOUS directive

A NOBUFFER clause is supplied to efficiently perform the asynchronous transfer with an [...]

4.1.3.1 Syntax

The NOBUFFER clause can be specified optionally in asynchronous-directive and block-asynchronous-directive (Section 4.1.2.1).

J409 nobuffer-clause is NOBUFFER

Example.

ASYNCHRONOUS (ID=ID1), NOBUFFER

ASYNCHRONOUS(ZZ),NOBUFFER

Example.

!HPFJ ASYNCHRONOUS(ID=Z), NOBUFFER BEGIN

A(:)=B(:)

FORALL(I=1:N) S(:,I)=T(I,:)

!HPFJ END ASYNCHRONOUS

4.1.3.2 Semantics

The following statements are called asynchronously executable statements without buffer.

(1) Assignment statement whose right-hand side consists of only one variable (whole array, array section, array element, or scalar variable)

(2) FORALL statement whose assignment statement conforms to item (1)

The NOBUFFER clause declares that, for the right-hand side of an asynchronously executable statement without buffer, the following processing is not performed directly or indirectly within the period from the start of asynchronous transfer by the ASYNCHRONOUS directive to the execution of the ASYNCWAIT directive.

• Value definition and making a value undefined (value reference is allowed)

• Deallocation and making allocation undefined

• Reference as an actual argument for a procedure call

• Remapping (including asynchronous remapping)

• Reference for referencing mapping (HOME clause in the ON directive, and so on)

Rationale. The NOBUFFER clause does not force the compiler to perform the asynchronous transfer without buffer; it is used to report to the compiler that the conditions for enabling the asynchronous transfer without buffer are satisfied. (End of rationale.)

Advice to implementors. The ASYNCHRONOUS directive having the NOBUFFER clause should be executed by a transfer method without buffer if that is efficient. However, this method is not mandatory. Select an efficient method depending on the type of the described assignment statement and on the architecture. (End of advice to implementors.)

4.1.3.3 Example

REAL A(1000),B(1000)

REAL C(100,100),D(100,100)

INTEGER E(200),F(100,200,300)

REAL S(500,20),T(800,20)

INTEGER IX1(N),IX2(N)

!HPFJ ASYNCID :: DD

...

!HPFJ ASYNCHRONOUS(ID=DD), NOBUFFER BEGIN

A=B ! Transfer from whole array

! to whole array

E=F(J,:,K+1) ! Transfer from array section

! to whole array

FORALL(I=1:N) C(:,I)=D(I,:) ! transpose transfer between

! array sections

S(IX1,:)=T(IX2,:) ! Transfer with vector subscript

!HPFJ END ASYNCHRONOUS

... ! Here, A, E, C, and S are not accessed;

... ! B, F, D, and T are not accessed,

! excluding the reference of their values.

!HPFJ ASYNCWAIT(DD)

4.1.3.4 Constraints

1. The executable statements to be processed by the simple ASYNCHRONOUS directive and ASYNCHRONOUS directive construct having the NOBUFFER clause must be asynchronously executable statements without buffer (see Section 4.1.3.2).

2. In an ASYNCHRONOUS construct having the NOBUFFER clause, a variable that appears in the right-hand side of an asynchronously executable statement without buffer must not appear in the construct as an asynchronous object.

Example. The underlined variable appears in the right-hand side of an asynchronously executable statement without buffer.

!HPFJ ASYNCHRONOUS(ID=ND), NOBUFFER BEGIN

A(1:N)=B(1:N)

B(:)=C(:) !(a) Not allowed.

D(:)=C(:) !(b) Allowed.

FORALL(I=1:9) G(I+1)=G(I) !(c) Not allowed.

S(1:100)=T(1:100)

T(101:200)=U(1:100) !(d) Allowed.

!HPFJ END ASYNCHRONOUS

(a) The range of B(1:N) is referenced in the right-hand side.

(b) C(:) is referenced many times; however, this is allowed because it is not defined as an asynchronous object.

(c) G is overlapped. Overlapping is not allowed even in the same statement.

(d) No array element of T is overlapped between the asynchronous object and the right-hand side.

4.1.4 ASYNC prefix

The asynchronous execution of the REDISTRIBUTE, REALIGN, and REFLECT directives can be described by combining them with an ASYNCHRONOUS directive. To represent asynchronous execution more easily, an ASYNC prefix is supplied.

4.1.4.1 Syntax

Modify redistribute-directive (H802) and realign-directive (H803) as follows:

J410 redistribute-directive-ja is [ async-prefix ] redistribute-directive

J411 realign-directive-ja is [ async-prefix ] realign-directive

J412 async-prefix is ASYNC ( [ ID = ] async-id )

Rationale. The syntax of reflect-directive is defined in Section 4.4. (End of rationale.)

Example.

ASYNC(ID=Z) REDISTRIBUTE D(BLOCK,*) ONTO PROC

ASYNC (ID) REDISTRIBUTE (CYCLIC) ONTO P :: T1,T2

ASYNC(ID2) REALIGN A(:,:) WITH B(:,:)

ASYNC(ID=Y) REALIGN (*,I) WITH T(I+1) :: A,B,C

ASYNC(ID=MM) REFLECT A

4.1.4.2 Semantics

An executable directive (REDISTRIBUTE, REALIGN, or REFLECT directive only) having async-prefix is equivalent to the following combination with the ASYNCHRONOUS directive.

!HPFJ ASYNC(ID=id) executable-directive

⇔

!HPFJ ASYNCHRONOUS(ID=id)
!HPFJ executable-directive

4.1.4.3 Example

!HPFJ ASYNCID ID1 ! async-id

REAL A(100,100),D(100,100)

!HPF$ ALIGN A(I,J) WITH D(I,J)

!HPF$ DISTRIBUTE D(*,BLOCK)

!HPF$ DYNAMIC A,D

!HPFJ ASYNC(ID1) REDISTRIBUTE D(BLOCK,*) ! Start of redistribution

! for A and D

... ! A and D non-access processing

!HPFJ ASYNCWAIT(ID1) ! Completion of redistribution

! for A and D

... ! A and D can be accessed with new mapping.

4.1.5 Notes on scoping unit

The constraints of the ASYNCHRONOUS and ASYNCWAIT directives are described in Section 4.1.2.4. This section provides notes concerning programming to meet those constraints.

4.1.5.1 Asynchronous transfer crossing scoping units

When ASYNCHRONOUS and ASYNCWAIT directives are in different scoping units, write the program carefully so that the allocation of the asynchronous identifier and object is not made undefined until the ASYNCWAIT directive is executed. To prevent this problem, globally declare the asynchronous identifier and object by one of the following methods:

• Declare the asynchronous identifier and object in a module referenced commonly in those scoping units.

• When those scoping units are related as a host and its internal procedures, or as internal procedures having a common host procedure, declare them in the host procedure.

Example. Use a module to define the asynchronous transfer crossing procedures.

• Module

MODULE MOO

!HPFJ ASYNCID Z

REAL A(100),D(100)

!HPF$ ALIGN A(:) WITH D(:)

!HPF$ DISTRIBUTE D(BLOCK)

!HPF$ DYNAMIC A,D

END

• Caller program

PROGRAM MAIN

USE MOO

...

CALL ASYNC_SUB

...

CALL ASYNCWAIT_SUB

...

END

• Subroutine starting the transfer

SUBROUTINE ASYNC_SUB

USE MOO

!HPFJ ASYNC(Z) REDISTRIBUTE D(CYCLIC)

END SUBROUTINE

• Subroutine waiting for the transfer

SUBROUTINE ASYNCWAIT_SUB

USE MOO

!HPFJ ASYNCWAIT(Z)

END SUBROUTINE

Example. Use host association to define the same content.

PROGRAM MAIN

!HPFJ ASYNCID Z

REAL A(100),D(100)

!HPF$ ALIGN A(:) WITH D(:)

!HPF$ DISTRIBUTE D(BLOCK)

!HPF$ DYNAMIC A,D

...

CALL ASYNC_SUB

...

CALL ASYNCWAIT_SUB

...

CONTAINS

SUBROUTINE ASYNC_SUB

!HPFJ ASYNC(Z) REDISTRIBUTE D(CYCLIC)

END SUBROUTINE

SUBROUTINE ASYNCWAIT_SUB

!HPFJ ASYNCWAIT(Z)

END SUBROUTINE

END

An asynchronous identifier and object cannot be passed between procedures via argument association.

Rationale. Since the asynchronous identifier is not a data object, it cannot be passed between procedures by an argument. The asynchronous object variable cannot be referenced as an actual argument. (See Section 4.1.2.4.)

Since a dummy argument is made undefined by ending the execution of the procedure, the asynchronous transfer using a dummy argument as an asynchronous object must be made to wait in the same procedure. The user cannot write a program that waits for transfer to an actual argument after returning from a procedure because the dummy and actual arguments are not guaranteed to be located in the same storage. (End of rationale.)

Example. Example of incorrect program

• Module

MODULE MOO

!HPFJ ASYNCID ID1

REAL B(50,100)

END

• Caller program

USE MOO

REAL A(100,100)

...

CALL FOO(A(1:50,:))

!HPFJ ASYNCWAIT (ID1) ! Waits for asynchronous transfer to A.

...

• Subroutine

SUBROUTINE FOO(X)

REAL X(50,100)

...

!HPFJ ASYNCHRONOUS (ID1)

X=B

RETURN ! Prohibited: The allocation of dummy

! argument X is made undefined here.

END

4.1.5.2 Asynchronous transfer between different calls for the same subprogram

When the ASYNCHRONOUS and ASYNCWAIT directives are in the same subprogram and are not executed in the same instance (for example, the program starts the asynchronous transfer by the first call and waits for the end of asynchronous transfer by the next call), the allocation of an asynchronous identifier and object must not be made undefined during asynchronous transfer. In this case, as described in Section 4.1.5.1, globally declare the asynchronous identifier and object, or declare them with the SAVE attribute.

Example. In the following case, since asynchronous object A and asynchronous identifier Z must not be made undefined after the subroutine ends, they are declared with SAVE.

Caller program

Call a subroutine N times.

DO I=1,N

CALL PIPELINETRANS(I,N)


END DO
END

Subroutine

Obtain variable ATMP from variable A and return the value from ATMP to A by the asynchronous transfer until the next call.

SUBROUTINE PIPELINETRANS(NTIMES,NEND)

REAL A(1000),ATMP(1000)

SAVE A

!HPFJ ASYNCID,SAVE :: Z

! --- Waits at the second and subsequent calls.

IF(NTIMES>1) THEN

!HPFJ ASYNCWAIT(Z)

END IF

! --- Obtains ATMP from A.

DO I=2,999

ATMP(I)=0.25*(A(I-1)+2*A(I)+A(I+1))

END DO

! --- Starts the transfer at call excluding the last.

IF(NTIMES<NEND) THEN

!HPFJ ASYNCHRONOUS(Z)

A(2:999)=ATMP(2:999)

END IF

! --- Returns to call while executing the

! asynchronous transfer.

RETURN

END

4.1.5.3 Asynchronous transfer in recursive procedure

In a recursive procedure, the asynchronous transfer is performed by two methods: in each instance (see (a) in Figure 4.1) and crossing instances (see (b) in Figure 4.1). In the former case, an asynchronous identifier and object must be declared in the procedure without the SAVE attribute. In the latter case, an asynchronous identifier and object must be declared in the procedure with the SAVE attribute or globally in the module.
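The former case, a transfer closed within each instance, can be sketched as follows. This is an illustrative program rather than an example from the specification: the names RSUB_MOD, RSUB, DEPTH, A, B, and Z are hypothetical, and the directive syntax follows the forms shown in Sections 4.1.2 and 4.1.5.2. A compiler without HPF/JA support treats the directive lines as comments and runs the program sequentially.

```fortran
module rsub_mod
  implicit none
contains
  recursive subroutine rsub(depth)
    integer, intent(in) :: depth
    real :: a(100), b(100)      ! local, no SAVE: every instance has its own copy
!HPFJ ASYNCID :: Z
    ! Z is declared without SAVE, so every instance has its own identifier.
    b = real(depth)
!HPFJ ASYNCHRONOUS(ID=Z)
    a = b                       ! transfer started by this instance
    ! The nested instance works on its own A and Z, so this instance's
    ! asynchronous object A is not referenced while the transfer is pending.
    if (depth > 1) call rsub(depth - 1)
!HPFJ ASYNCWAIT(ID=Z)
    ! After the wait, A may be referenced again by this instance.
    if (any(a /= real(depth))) error stop 'unexpected value in A'
  end subroutine rsub
end module rsub_mod

program main
  use rsub_mod
  implicit none
  call rsub(3)                  ! three nested instances, each closing its own transfer
  print *, 'all instances completed'
end program main
```

For the latter case, crossing instances, Z and A would instead need the SAVE attribute or a module-level declaration, since a different instance executes the ASYNCWAIT directive.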

Figure 4.1: Asynchronous transfer in recursive procedure
(a) Asynchronous transfer closed for each instance: each instance of RSUB executes ASYNC(id) and the matching WAIT(id) itself.
(b) Asynchronous transfer crossing instances: ASYNC(id) executed in one instance of RSUB is matched by WAIT(id) executed in another instance.

4.1.5.4 Notes on asynchronous remapping

For asynchronous redistribution, not only the distributee itself but also variables aligned with the distributee are regarded as asynchronous objects. Note that no asynchronous object may be made undefined during asynchronous transfer.

Also note that, for asynchronous realignment, the ultimate align target must not be made undefined during asynchronous transfer.

Example. Example of incorrect program

Since variable A aligned to D in a subroutine is made undefined during asynchronous transfer, the language processor cannot assure the operation. In this case, move the [...]

• Module

MODULE MODD

REAL D(1000)

!HPF$ DISTRIBUTE(BLOCK),DYNAMIC :: D

!HPFJ ASYNCID :: ZZ

END MODULE

• Caller program

USE MODD

...

CALL MISDIST

!HPFJ ASYNCWAIT(ID=ZZ) ! Waits for the redistribution

... ! of D.

• Subroutine

SUBROUTINE MISDIST

USE MODD

REAL A(1000)

!HPF$ ALIGN(:) WITH D(:), DYNAMIC :: A ! Aligns A to D.


!HPFJ ASYNCHRONOUS(ID=ZZ)

!HPF$ REDISTRIBUTE(CYCLIC) :: D ! A and D are object variables.

RETURN ! Prohibited: Local variable A

END ! is made undefined.

4.2 Extension of SHADOW Directive

This section explains the extension of the SHADOW directive.

In the case of the following example:

Example.

REAL A(4,4)
!HPF$ PROCESSORS P(2,2)
!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO P

many HPF language processors allocate only a local part of the whole declared array area onto each processor. (See Figure 4.2.)

P(1,1)             P(1,2)
A(1,1) A(1,2)      A(1,3) A(1,4)
A(2,1) A(2,2)      A(2,3) A(2,4)

P(2,1)             P(2,2)
A(3,1) A(3,2)      A(3,3) A(3,4)
A(4,1) A(4,2)      A(4,3) A(4,4)

Figure 4.2: Normal allocation

On the other hand, an HPF language processor can allocate the whole declared array area onto each processor by extending the SHADOW directive so that an asterisk * can be specified in each dimension of shadow-target. (See Figure 4.3.)

REAL A(4,4)

!HPF$ PROCESSORS P(2,2)

!HPF$ DISTRIBUTE A(BLOCK,BLOCK) ONTO P

!HPFJ SHADOW A(*,*) ! Extended SHADOW directive.

The SHADOW attribute of an object for which the SHADOW directive is specified in this format is specially called the full SHADOW attribute.

P(1,1), P(1,2), P(2,1), P(2,2) (each processor)
A(1,1) A(1,2) A(1,3) A(1,4)
A(2,1) A(2,2) A(2,3) A(2,4)
A(3,1) A(3,2) A(3,3) A(3,4)
A(4,1) A(4,2) A(4,3) A(4,4)

Figure 4.3: Allocation with full SHADOW attribute
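The difference between the two allocations shows up in how a global subscript is turned into a position in local storage. The sketch below works this out for one dimension of the 4 x 4 example, assuming a BLOCK distribution of 4 elements over 2 processors; the program and variable names are illustrative, and the block-size formula is the usual ceiling division, stated here as an assumption rather than quoted from the specification.

```fortran
program block_index
  implicit none
  integer, parameter :: n = 4, np = 2             ! extent and processors per dimension
  integer, parameter :: blk = (n + np - 1) / np   ! block size (ceiling of n/np)
  integer :: g, owner, local

  do g = 1, n
     owner = (g - 1) / blk + 1          ! processor coordinate that owns global subscript g
     local = g - (owner - 1) * blk      ! subscript inside that processor's local part
     print '(a,i0,a,i0,a,i0)', 'global ', g, ' -> P(', owner, '), local ', local
  end do
end program block_index
```

With the normal allocation, every array reference implies a conversion of this kind in each distributed dimension. With the full SHADOW attribute, each processor holds all of A(1:4,1:4), so the global subscript can address local storage directly, at the cost of storing 16 elements per processor instead of 4.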

Using this allocation method:

• Memory usage efficiency is poor.

Therefore, the size of usable data is limited. On the other hand, an object program can be executed at high speed by the following two features:

• The language processor need not perform subscript conversion from global subscript to local subscript at the reference of an array when the full SHADOW attribute is statically
