2.2 I/O Subsystem Changes

This section describes OpenVMS Alpha Version 7.0 changes to the I/O subsystem that might require source changes to device drivers.
2.2.1 Impact of IRPE Data Structure Changes
As described in Section A.9, the I/O Request Packet Extension (IRPE) structure now manages a single additional locked-down buffer instead of two. The general approach to deal with this change is to use a chain of additional IRPE structures.
Current users of the IRPE may be depending on the fact that a buffer locked for direct I/O could be fully described by the irp$l_svapte, irp$l_boff, and irp$l_bcnt values. For example, it is not uncommon for an IRPE to be used in this fashion:
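The following sketch reconstructs that usage for illustration; it is not the original example, and the cell and routine names are those used in the examples later in this section:

status = exe_std$readlock( irp, pcb, ucb, ccb,            /* lock buffer 1     */
                           buf1, buf1_len, lock_err_rtn );/* through the IRP   */
if( !$VMS_STATUS_SUCCESS(status) ) return status;

irpe->irpe$l_svapte = irp->irp$l_svapte;   /* copy the three cells that were   */
irpe->irpe$l_boff   = irp->irp$l_boff;     /* assumed to fully describe the    */
irpe->irpe$l_bcnt   = irp->irp$l_bcnt;     /* locked buffer into the IRPE...   */

status = exe_std$readlock( irp, pcb, ucb, ccb,            /* ...then reuse the */
                           buf2, buf2_len, lock_err_rtn );/* IRP for buffer 2  */
if( !$VMS_STATUS_SUCCESS(status) ) return status;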
This approach no longer works correctly. As described in Appendix A, the DIOBM structure that is embedded in the IRP will be needed as well. Moreover, it may not be sufficient to simply copy the DIOBM from the IRP to the IRPE. In particular, the irp$l_svapte may need to be modified if the DIOBM is moved.
The general approach to this change is to lock the buffer using the IRPE directly. This approach is shown in some detail in the following example:
irpe->irpe$b_type = DYN$C_IRPE;         /* mark the structure as an IRPE so the  */
                                        /* error callback can distinguish it     */
irpe->irpe$l_driver_p0 = (int) irp;     /* save the IRP address in a driver-     */
                                        /* owned IRPE cell for the callback      */
status = exe_std$readlock( irp, pcb, ucb, ccb,          /* lock buffer 1 through */
                           buf1, buf1_len,              /* the IRP itself        */
                           lock_err_rtn );              /* error callback        */
if( !$VMS_STATUS_SUCCESS(status) ) return status;
irpe->irpe$b_rmod = irp->irp$b_rmod;    /* copy the requestor access mode into   */
                                        /* the IRPE before using it to lock      */
status = exe_std$readlock( (IRP *)irpe, pcb, ucb, ccb,  /* lock buffer 2 through */
                           buf2, buf2_len,              /* the IRPE directly     */
                           lock_err_rtn );
if( !$VMS_STATUS_SUCCESS(status) ) return status;
This approach is easily generalized to more buffers and IRPEs. The only code omitted from this example is that which allocates and links together the IRPEs; a sketch of that step appears after the error callback routine below. The following example shows the associated error callback routine in its entirety; it can handle an arbitrary number of IRPEs.
void lock_err_rtn (IRP *const lock_irp,   /* may be either an IRP or an IRPE     */
                   PCB *const pcb,
                   UCB *const ucb,
                   CCB *const ccb,
                   const int errsts,
                   IRP **real_irp_p       /* out: address of the actual IRP      */
                   )
{
    IRP *irp;

    if( lock_irp->irp$b_type == DYN$C_IRPE )
        /* called with an IRPE: recover the IRP address that was saved in
           the driver-owned irpe$l_driver_p0 cell */
        irp = (IRP *) ((IRPE *)lock_irp)->irpe$l_driver_p0;
    else
        irp = lock_irp;

    exe_std$lock_err_cleanup (irp);       /* standard cleanup of the buffers     */
                                          /* locked so far for this request      */
    *real_irp_p = irp;                    /* return the real IRP to the caller   */
    return;
}
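The allocation and linking code omitted above might look like the following minimal sketch; alloc_nonpaged is a hypothetical stand-in for the driver's nonpaged pool allocation, and the use of irp$l_extend as the linkage cell is an assumption made for illustration:

IRPE *irpe;

irpe = (IRPE *) alloc_nonpaged( sizeof(IRPE) );  /* hypothetical pool allocator   */
if( irpe == NULL ) return SS$_INSFMEM;

irpe->irpe$b_type = DYN$C_IRPE;                  /* mark the structure as an IRPE */
irpe->irpe$l_driver_p0 = (int) irp;              /* back pointer for the callback */
irp->irp$l_extend = (int) irpe;                  /* assumed linkage cell: chain   */
                                                 /* the IRPE from the IRP         */

Additional IRPEs would be chained from the previous IRPE in the same way.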
2.2.2 Impact of MMG_STD$IOLOCK and MMG_STD$UNLOCK Changes

The interface changes to the MMG_STD$IOLOCK and MMG_STD$UNLOCK routines are described in Appendix B. The general approach to these changes is to use the corresponding replacement routines and the new DIOBM structure.
2.2.2.1 Direct I/O Functions
OpenVMS device drivers that perform data transfers using direct I/O functions do so by locking the buffer into memory while still in process context, that is, in a driver FDT routine. The PTE address of the first page that maps the buffer is obtained, and the byte offset within the page to the start of the buffer is computed. These values are saved in the IRP (irp$l_svapte and irp$l_boff). The rest of the driver then uses the values in the irp$l_svapte and irp$l_boff cells, along with the byte count in irp$l_bcnt, to perform the transfer. Eventually, when the transfer has completed and the request returns to process context for I/O postprocessing, the buffer is unlocked using the irp$l_svapte value, not the original process buffer address.
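The following fragment illustrates this bookkeeping; buf_va, buf_len, and first_pte_va are hypothetical stand-ins, PAGE_SIZE is an assumed 8 KB page size, and in a real driver these cells are filled in by the FDT lock routines rather than by hand:

#define PAGE_SIZE 8192                          /* assumed 8 KB system pages    */

irp->irp$l_svapte = first_pte_va;               /* address of the PTE that maps */
                                                /* the first page of the buffer */
irp->irp$l_boff   = buf_va & (PAGE_SIZE - 1);   /* byte offset of the buffer    */
                                                /* within that first page       */
irp->irp$l_bcnt   = buf_len;                    /* byte count of the transfer   */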
To support 64-bit addresses on a direct I/O function, one only needs to ensure the proper handling of the buffer address within the FDT routine.
Almost all device drivers that perform data transfers via a direct I/O function use OpenVMS-supplied FDT support routines to lock the buffer into memory. Because these routines obtain the buffer address either indirectly from the IRP or directly from a parameter that is passed by value, the interfaces for these routines can easily be enhanced to support 64-bit wide addresses.
However, various OpenVMS Alpha memory management infrastructure changes made to support 64-bit addressing have a potentially major impact on the use of the 32-bit irp$l_svapte cell by device drivers prior to OpenVMS Alpha Version 7.0. In general, there are two problems:

* The PTEs that map the buffer now reside in page table space, which is accessible only by a 64-bit virtual address, so the address of a PTE can no longer be stored in the 32-bit irp$l_svapte cell.
* Page table space is process-private, so a PTE address obtained in the context of the requesting process is not valid later, when the driver accesses the buffer outside of that process context.
In most cases, both of these PTE access problems are solved by copying the PTEs that map the buffer into nonpaged pool and setting irp$l_svapte to point to the copies. This copy is done immediately after the buffer has been successfully locked. A copy of the PTE values is acceptable because device drivers only read the PTE values and are not allowed to modify them. These PTE copies are held in a new nonpaged pool data structure, the Direct I/O Buffer Map (DIOBM) structure. A standard DIOBM structure (also known as a fixed-size primary DIOBM) contains enough room for a vector of 9 (DIOBM$K_PTECNT_FIX) PTE values. This is sufficient for a buffer size up to 64 KB on a system with 8 KB pages. [1] It is expected that most I/O requests are handled by this mechanism and that the overhead to copy a small number of PTEs is acceptable, especially given that these PTEs have been recently accessed to lock the pages.
The standard IRP contains an embedded fixed-size DIOBM structure. When the PTEs that map a buffer fit into the embedded DIOBM, the irp$l_svapte cell is set to point to the start of the PTE copy vector within the embedded DIOBM structure in that IRP.
If the buffer requires more than 9 PTEs, then a separate, variably sized "secondary" DIOBM structure is allocated to hold the PTE copies, and the irp$l_svapte cell is set to point into the PTE vector in that secondary structure. If such a secondary DIOBM structure is needed, it is pointed to by the original, or "primary," DIOBM structure, and it is deallocated during I/O postprocessing when the buffer pages are unlocked. The secondary DIOBM requires only 8 bytes of nonpaged pool for each page in the buffer. The allocation of the secondary DIOBM structure is not charged against the process BYTLM quota, but it is controlled by the process direct I/O limit (DIOLM). This is the same approach used for other internal data structures that are required to support the I/O, including the kernel process block, kernel process stack, and the IRP itself.
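The choice between the embedded DIOBM and a secondary DIOBM follows directly from the number of pages that the buffer spans. The following sketch is illustrative only; DIOBM$K_PTECNT_FIX is the constant named above, PAGE_SIZE is an assumed 8 KB page size, and the page-span arithmetic is the standard formula:

int pte_count = ( boff + bcnt + PAGE_SIZE - 1 ) / PAGE_SIZE;  /* pages spanned  */

if( pte_count <= DIOBM$K_PTECNT_FIX )
{
    /* The PTE copies fit in the fixed-size DIOBM embedded in the IRP;   */
    /* irp$l_svapte is set to point into that embedded vector.           */
}
else
{
    /* Allocate a variably sized secondary DIOBM from nonpaged pool      */
    /* (8 bytes per PTE) and point irp$l_svapte into its PTE vector.     */
}

With 8 KB pages, a page-aligned 64 KB buffer spans (0 + 65536 + 8191)/8192 = 8 pages, while any nonzero page offset pushes the same buffer onto a ninth page; hence the nine-entry fixed vector.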
[1] Eight PTEs are sufficient only if the buffer begins exactly on a page boundary; otherwise, a ninth is required.