HP OpenVMS systems documentation |
When preserving atomicity, the compiler must assume that the data being modified is aligned. An update of a field that spans a quadword boundary cannot occur atomically, because it would require two read-modify-write sequences.
On OpenVMS Alpha systems, since software cannot handle an unaligned LDx_L or STx_C instruction as it can a normal load or store instruction, a LDx_L or STx_C instruction to an unaligned address will generate a fatal reserved operand fault.
On OpenVMS I64 systems, since software cannot handle an unaligned address in the compare-exchange (cmpxchg) instruction, it will generate an exception at run time.
On OpenVMS Alpha systems, when /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, an INCL (R1) instruction generates LDL_L and STL_C instructions, so the address in R1 must be longword aligned.
Assume the following instruction:
        INCW   (R1)
For this instruction, the compiler generates a code sequence such as the following on OpenVMS Alpha systems:
        BIC    R1,#^B0110,R28   ; Compute Aligned Address
Retry:  LDQ_L  R24,(R28)        ; Load the QW with the data
        EXTWL  R24,R1,R23       ; Extract out the Word
        ADDL   R23,#1,R23       ; Increment the Word
        INSWL  R23,R1,R23       ; Correctly position the Word
        MSKWL  R24,R1,R24       ; Zero the spot for the Word
        BIS    R23,R24,R23      ; Combine Original and New word
        STQ_C  R23,(R28)        ; Conditionally store result
        BEQ    fail             ; Branch ahead on failure
        .
        .
        .
fail:   BR     Retry
An INCB instruction uses #^B0111 to generate the aligned address since all bytes are aligned.
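The extract, increment, insert, and conditional-store steps of the Alpha sequence can be sketched in C. This is only an illustration under stated assumptions: the function name `incw_in_quadword` is hypothetical, and a C11 compare-exchange loop stands in for the LDQ_L/STQ_C pair, which is not what the compiler emits on Alpha (although it resembles the cmpxchg approach used on I64).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: increments the 16-bit word at byte offset `offset`
 * (0, 2, 4, or 6) inside an aligned 64-bit quadword and leaves the other
 * six bytes untouched. Little-endian byte numbering is assumed. */
static void incw_in_quadword(_Atomic uint64_t *quad, unsigned offset)
{
    /* The word must be aligned so that it cannot span two quadwords. */
    assert(offset % 2 == 0 && offset <= 6);

    unsigned shift = offset * 8;
    uint64_t mask = (uint64_t)0xFFFF << shift;
    uint64_t old = atomic_load(quad);                 /* plays the role of LDQ_L */
    uint64_t new_val;

    do {
        uint64_t word = (old & mask) >> shift;        /* EXTWL: extract the word */
        word = (word + 1) & 0xFFFF;                   /* ADDL: increment, wrap at 16 bits */
        new_val = (old & ~mask) | (word << shift);    /* MSKWL, INSWL, BIS */
        /* On failure, `old` is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(quad, &old, new_val));
}
```

Calling `incw_in_quadword(&q, 4)` updates only bits 32 through 47 of `q`; a concurrent change to any other part of the quadword causes a retry rather than a lost update.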
For the INCW (R1) instruction, the compiler generates a code sequence such as the following on OpenVMS I64 systems:
$L5:    ld2           r19 = [r9]
        mov.m         apccv = r19
        mov           r18 = r19
        sxt2          r19 = r19
        adds          r19 = 1, r19
        cmpxchg2.acq  r19, [r9] = r19
        cmp.eq        pr0, pr8 = r18, r19
  (pr8) br.cond.dpnt.few $L5
The compiler's methods of preserving atomicity have an interesting side effect in compiled VAX MACRO code.
On OpenVMS VAX systems, only the interlocked instructions work correctly to synchronize access to shared data in multiprocessor systems. On OpenVMS Alpha multiprocessing systems, the code resulting from a compilation of modify instructions (with atomicity preserved) and the code resulting from interlocked instructions both work correctly, because the LDx_L and STx_C instructions that the compiler generates for both sets of instructions operate correctly across multiple processors. Likewise, on OpenVMS I64 systems, the compare-exchange (cmpxchg) instruction provides interlocking across processors.
Because this compiler side effect is specific to OpenVMS Alpha and OpenVMS I64 systems and does not port back to OpenVMS VAX systems, you should avoid relying on it when porting VAX MACRO code to OpenVMS Alpha or OpenVMS I64 if you intend to run the code on both systems.
However, interlocked instructions must still be used if the memory modification is being used as an interlock for other instructions for which atomicity is not preserved. This is because the Alpha and Itanium architectures do not guarantee strict write ordering.
For example, consider the following VAX MACRO code sequence:
        .PRESERVE ATOMICITY
        INCL   (R1)
        .NOPRESERVE ATOMICITY
        MOVL   (R2),R3
This code sequence will generate the following Alpha code sequence:
Retry:  LDL_L  R28,(R1)
        ADDL   R28,#1,R28
        STL_C  R28,(R1)
        BEQ    R28, fail
        LDL    R3, (R2)
        .
        .
        .
fail:   BR     Retry
Because of the data prefetching of the Alpha and Itanium architectures, the data from (R2) may be read before the store to (R1) is processed. If the INCL (R1) instruction is being used as a lock to prevent the data at (R2) from being accessed before the lock is set, the read of (R2) may occur before the increment of (R1) and thus is not protected.
The VAX interlocked instructions generate Alpha MB (memory barrier) or Itanium mf (memory fence) instructions before and after the interlocked sequence. This prevents memory loads from being moved across the interlocked instruction.
On OpenVMS I64, the code sequence would be similar to the following:
$L7:    ld4           r16 = [r9]
        mov.m         apccv = r16
        mov           r15 = r16
        sxt4          r16 = r16
        adds          r16 = 1, r16
        cmpxchg4.acq  r16, [r9] = r16
        cmp.eq        pr0, pr10 = r15, r16
 (pr10) br.cond.dpnt.few $L7
        ld4           r3 = [r28]
        sxt4          r3 = r3
Consider the following code sequence:
        ADAWI  #1,(R1)
        MOVL   (R2),R3
This code sequence will generate the following Alpha code sequence:
        MB
Retry:  LDL_L  R28,(R1)
        ADDL   R28,#1,R28
        STL_C  R28,(R1)
        BEQ    R28, Fail
        MB
        LDL    R3, (R2)
        .
        .
        .
Fail:   BR     Retry
On OpenVMS I64, a code sequence similar to the following would be generated:
        mf
$L8:    ld2           r23 = [r9]
        mov.m         apccv = r23
        adds          r24 = 1, r23
        cmpxchg2.acq  r14, [r9] = r24
        cmp.eq        pr0, pr11 = r23, r14
 (pr11) br.cond.dpnt.few $L8
        mf
        ld4           r3 = [r28]
        sxt4          r3 = r3
The MB or mf instructions cause all memory operations before the MB or
mf instruction to complete before any memory operations after the MB or
mf instruction are allowed to begin.
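For readers more familiar with C than with Alpha or Itanium assembly, the barrier placement around an interlocked add can be approximated with C11 fences. This is a rough analogy, not the compiler's output: the names `lock_count`, `shared_data`, and `bump_lock_then_read` are hypothetical, and a sequentially consistent fence is assumed to be the closest portable counterpart of MB or mf.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state: `lock_count` plays the role of (R1) and
 * `shared_data` plays the role of (R2) in the ADAWI/MOVL example. */
static _Atomic int32_t lock_count;
static _Atomic int32_t shared_data;

/* Approximates ADAWI #1,(R1) followed by MOVL (R2),R3. */
static int32_t bump_lock_then_read(void)
{
    /* Barrier before the interlocked sequence (MB on Alpha, mf on I64). */
    atomic_thread_fence(memory_order_seq_cst);

    /* The interlocked add itself; ordering comes from the fences. */
    atomic_fetch_add_explicit(&lock_count, 1, memory_order_relaxed);

    /* Barrier after the interlocked sequence: the load below may not be
     * reordered ahead of the add. */
    atomic_thread_fence(memory_order_seq_cst);

    return atomic_load_explicit(&shared_data, memory_order_relaxed);
}
```

Without the two fences, the relaxed load of `shared_data` would be free to complete before the add to `lock_count`, which is the hazard described for the .PRESERVE ATOMICITY example.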
2.12 Compiling and Linking
The compiler requires the following files:
For information about compiler qualifiers, see Appendix A.
2.12.1 Line Numbering in Listing File
The macro expansion line numbering scheme in the listing file is Xnn/mmm, where Xnn shows the nesting depth and mmm is the line number relative to the outermost macro.
Example 2-1 shows an OpenVMS I64 listing file. The source portion of an OpenVMS Alpha listing file is essentially the same.
Example 2-1 Example of Line Numbering in an OpenVMS I64 Listing File

                   00000000     1 ;
                   00000000     2 ; This is the Itanium (previously called "IA-64") version of
                   00000000     3 ; ARCH_DEFS.MAR, which contains architectural definitions for
                   00000000     4 ; compiling VMS sources for VAX, Alpha, and I64 systems.
                   00000000     5 ;
                   00000000     6 ; Note: VAX, VAXPAGE, and IA64 should be left undefined,
                   00000000     7 ;       a lot of code checks for whether a symbol is
                   00000000     8 ;       defined (e.g. .IF DF VAX) vs. whether the value
                   00000000     9 ;       is of a expected value (e.g. .IF NE VAX).
                   00000000    10 ;
                   00000000    11 ;VAX = 0
                   00000000    12 ;EVAX = 0
                   00000000    13 ;ALPHA = 0
          00000001 00000000    14 IA64 = 1
                   00000000    15 ;
                   00000000    16 ;VAXPAGE = 0
          00000001 00000000    17 BIGPAGE = 1
                   00000000    18 ;
          00000020 00000000    19 ADDRESSBITS = 32
                   00000000    20         .TITLE ug_ex_listing /line numbering in the listing file/
                   00000000    21 ;
                   00000000    22         .MACRO test1
                   00000000    23         clrl r1
                   00000000    24         clrl r2
                   00000000    25         tstl 48(sp)     ; generate uplevel stack error
                   00000000    26         clrl r3
                   00000000    27         .ENDM test1
                   00000000    28         .MACRO test2
                   00000000    29         clrl r4
                   00000000    30         clrl r5
                   00000000    31         test1
                   00000000    32         clrl r6
                   00000000    33         .ENDM test2
                   00000000    34
                   00000000    35 foo:    .jsb_entry
                   00000000    56         .show expansions
                   00000000    57         clrl r0
                   00000011    58         test2
1.......
%IMAC-E-UPLEVSTK, (1) up-level stack reference in routine FOO
     X01/001       00000002               clrl r4
     X01/002       00000004               clrl r5
     X01/003       00000006               test1
     X02/004       00000006               clrl r1
     X02/005       00000008               clrl r2
     X02/006       0000000A               tstl 48(sp)     ; generate uplevel stack error
     X02/007       0000000D               clrl r3
     X02/008       0000000F
     X01/009       0000000F               clrl r6
     X01/010       00000011
                   00000011    59         rsb
                   00000012    60         .noshow expansions
                   00000012    61
                   00000012    62         .END
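If you post-process listing files with your own tools, the Xnn/mmm prefix shown in the listing above can be split into its two fields. The following is a minimal sketch; `parse_expansion_prefix` is a hypothetical helper, and it assumes the two-digit depth and three-digit line number widths seen in this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: splits a listing prefix such as "X02/004" into the
 * macro nesting depth (2) and the line number relative to the outermost
 * macro (4). Returns 1 on success and 0 if the text is not such a prefix. */
static int parse_expansion_prefix(const char *prefix, int *depth, int *line)
{
    assert(prefix != NULL && depth != NULL && line != NULL);
    /* Both conversions must succeed for the prefix to be accepted. */
    return sscanf(prefix, "X%2d/%3d", depth, line) == 2;
}
```

In the example listing, X02/004 therefore identifies the fourth line of the outermost expansion (test2), which was produced at nesting depth 2 by the inner macro test1.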
The compiler provides full debugger support. The debug session for
compiled VAX MACRO code is similar to that for assembled VAX MACRO
code. However, there are some important differences that are described
in this section. For a complete description of debugging, see the
HP OpenVMS Debugger Manual.
2.13.1 Code Relocation
One major difference is that the code is compiled rather than assembled. On an OpenVMS VAX system, each VAX MACRO instruction is a single machine instruction. On an OpenVMS Alpha or OpenVMS I64 system, each VAX MACRO instruction may be compiled into many Alpha or Itanium machine instructions. A major side effect of this difference is the relocation and rescheduling of code if you do not specify /NOOPTIMIZE in your compile command.
By default, several optimizations are performed that cause the movement
of generated code across source boundaries (see Section 1.2,
Section 4.3, and Appendix A). For most code modules, debugging is
simplified if you compile with /NOOPTIMIZE, which prevents this
relocation from happening. After you have debugged your code, you can
recompile without /NOOPTIMIZE to improve performance.
2.13.2 Symbolic Variables for Routine Arguments
Another major difference between debugging compiled code and debugging assembled code is a concept that is new to VAX MACRO: the definition of symbolic variables for examining routine arguments. On OpenVMS VAX systems, when you are debugging a routine and want to examine the arguments, you typically do something like the following:
    DBG> EXAMINE @AP        ; to see the argument count
    DBG> EXAMINE @AP+4      ; to examine the first arg
or
    DBG> EXAMINE @AP        ; to see arg count
    DBG> EXAMINE .+4:.+20   ; to see first 5 args
On OpenVMS Alpha and OpenVMS I64 systems, the arguments do not reside in a vector in memory as they do on OpenVMS VAX systems. Furthermore, there is no AP register on OpenVMS Alpha and OpenVMS I64 systems. If you type EXAMINE @AP when debugging VAX MACRO compiled code, the debugger reports that AP is an undefined symbol.
In the compiled code, the arguments can reside in some combination of:
The compiler does not require that you figure out where the arguments
are by reading the generated code. Instead, it provides $ARGn
symbols that point to the correct argument locations. The $ARG0 symbol
is the same as @AP+0 is on VAX systems, that is, the argument count.
The $ARG1 symbol is the first argument, $ARG2 is the second argument,
and so forth. These symbols are defined in CALL_ENTRY and JSB_ENTRY
directives, but not in EXCEPTION_ENTRY directives.
2.13.3 Locating Arguments Without $ARGn Symbols
There may be additional arguments in your code for which the compiler did not generate a $ARGn symbol. The number of $ARGn symbols defined for a .CALL_ENTRY routine is the maximum number detected by the compiler (either by automatic detection or as specified by MAX_ARGS). For a .JSB_ENTRY routine, since the arguments are homed in the caller's stack frame and the compiler cannot detect the actual number, it always creates eight $ARGn symbols.
In most cases, you can easily find any additional arguments, but in
some cases you cannot.
2.13.3.1 Additional Arguments That Are Easy to Locate
You can easily find additional arguments if:
For example, you can examine arguments beyond the eighth argument in a JSB routine (where the argument list must be homed in the caller), as follows:
    DBG> EX $ARG8   ; highest defined $ARGn
    .
    .
    .
    DBG> EX .+4     ; next arg is in next longword
    .
    .
    .
    DBG> EX .+4     ; and so on
This example assumes that the caller detected at least 10 arguments when homing the argument list.
To find arguments beyond the last $ARGn symbol in a routine
that did not home the arguments, proceed exactly as in the previous
example except substitute EX .+8 for EX .+4.
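The stepping rule amounts to simple address arithmetic, sketched below. The function `extra_arg_address` is a hypothetical illustration, not a debugger facility, and it applies only to the easy-to-locate cases: a stride of 4 bytes when the argument list is homed and 8 bytes when it is not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given the address reported for the highest defined
 * $ARGn symbol (whose index is `last_defined`), estimate the address of
 * argument number `wanted`. The stride mirrors EX .+4 for a homed argument
 * list and EX .+8 for one that is not homed. */
static uint64_t extra_arg_address(uint64_t last_addr, unsigned last_defined,
                                  unsigned wanted, int homed)
{
    assert(wanted >= last_defined);   /* only steps forward from $ARGn */
    uint64_t stride = homed ? 4 : 8;
    return last_addr + (uint64_t)(wanted - last_defined) * stride;
}
```

For instance, if $ARG8 is shown at address 1000 (hexadecimal) in a homed list, the tenth argument would be expected at 1008; in a list that is not homed, it would be expected at 1010.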
2.13.3.2 Additional Arguments That Are Not Easy to Locate
You cannot easily find additional arguments if:
The only way to find the additional arguments in these cases is to
examine the compiled machine code to determine where the arguments
reside. Both of these problems are eliminated if MAX_ARGS is specified
correctly for the maximum argument that you want to examine.
2.13.4 Using VAX and Alpha Register Names on OpenVMS I64
For convenience, the MACRO compiler on OpenVMS I64 defines symbols
named R0, R1, ... R31 to refer to the Itanium registers where
those Alpha register values reside. You can still use the debugger's
names %R0, %R1, ... %R31 to refer to registers by the native machine's
numbering.
2.13.5 Debugging Code with Packed Decimal Data
Keep this information in mind when debugging compiled VAX MACRO code with packed decimal data on an OpenVMS Alpha or OpenVMS I64 system:
Keep this information in mind when debugging compiled VAX MACRO code with floating-point data on an OpenVMS Alpha or OpenVMS I64 system:
    EXAMINE/G_FLOAT R4
    MOVG   DATA, R6
    DBG> EX R6
    .MAIN.\%LINE 100\%R6:  0FFFFFFFF D8E640D1
    DBG> EX R7
    .MAIN.\%LINE 100\%R7:  00000000 2F1B24DD
    DBG> DEP R0 = 2F1B24DDD8E640D1
    DBG> EX/G_FLOAT R0
    .MAIN.\%LINE 100\%R0:  4568.89900000000
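The value deposited into R0 in the session above is the low longword of R7 placed above the low longword of R6. A small C sketch of that concatenation follows; the function name `join_low_longwords` is hypothetical and is used here only to show the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: joins the low longwords of two 64-bit registers
 * into one quadword. `high_reg` supplies bits 63..32 and `low_reg`
 * supplies bits 31..0; the upper halves of both inputs are discarded. */
static uint64_t join_low_longwords(uint64_t low_reg, uint64_t high_reg)
{
    return ((high_reg & 0xFFFFFFFFu) << 32) | (low_reg & 0xFFFFFFFFu);
}
```

With the register contents shown in the session (R6 as `low_reg`, R7 as `high_reg`), the result is 2F1B24DDD8E640D1, the value used in the DEPOSIT command.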