Document revision date: 15 July 2002
Table A-2 lists system parameters that should not require adjustment at any time. These parameters are provided for use in system debugging. Compaq recommends that you do not change these parameters unless you are advised to do so by your Compaq support representative. Incorrect adjustment of these parameters can result in cluster failures.
Parameter | Description |
---|---|
++MC_SERVICES_P1 (dynamic) | The value of this parameter must be the same on all nodes connected by MEMORY CHANNEL. |
++MC_SERVICES_P5 (dynamic) | This parameter must remain at the default value of 8000000. This parameter value must be the same on all nodes connected by MEMORY CHANNEL. |
++MC_SERVICES_P8 (static) | This parameter must remain at the default value of 0. This parameter value must be the same on all nodes connected by MEMORY CHANNEL. |
++MPDEV_D1 | A multipath system parameter. |
PAMAXPORT |
PAMAXPORT specifies the maximum port number to be polled on each CI and
DSSI. The CI and DSSI port drivers poll to discover newly initialized
ports or the absence or failure of previously responding remote ports.
A system will not detect the existence of ports whose port numbers are higher than this parameter's value. Thus, this parameter should be set to a value that is greater than or equal to the highest port number being used on any CI or DSSI connected to the system. You can decrease this parameter to reduce polling activity if the hardware configuration has fewer than 16 ports. For example, if the CI or DSSI with the largest configuration has a total of five ports assigned to port numbers 0 through 4, you could set PAMAXPORT to 4. If no CI or DSSI devices are configured on your system, this parameter is ignored. The default for this parameter is 15 (poll for all possible ports 0 through 15). Compaq recommends that you set this parameter to the same value on each cluster computer. |
PANOPOLL |
Disables CI and DSSI polling for ports if set to 1. (The default is 0.)
When PANOPOLL is set, a computer will not promptly discover that
another computer has shut down or powered down and will not discover a
new computer that has booted. This parameter is useful when you want to
bring up a computer detached from the rest of the cluster for debugging
purposes.
PANOPOLL is functionally equivalent to uncabling the system from the DSSI or star coupler. This parameter does not affect OpenVMS Cluster communications over the LAN. The default value of 0 is the normal setting and is required if you are booting from an HSC controller or if your system is joining an OpenVMS Cluster. This parameter is ignored if there are no CI or DSSI devices configured on your system. |
PANUMPOLL |
Establishes the number of CI and DSSI ports to be polled during each
polling interval. The normal setting for PANUMPOLL is 16.
On older systems with less powerful CPUs, the parameter may be useful in applications sensitive to the amount of contiguous time that the system spends at IPL 8. Reducing PANUMPOLL reduces the amount of time spent at IPL 8 during each polling interval while increasing the number of polling intervals needed to discover new or failed ports. If no CI or DSSI devices are configured on your system, this parameter is ignored. |
PAPOLLINTERVAL |
Specifies, in seconds, the polling interval the CI port driver uses to
poll for a newly booted computer, a broken port-to-port virtual
circuit, or a failed remote computer.
This parameter trades polling overhead against quick response to virtual circuit failures. This parameter should be set to the same value on each cluster computer. |
PAPOOLINTERVAL |
Specifies, in seconds, the interval at which the port driver checks
available nonpaged pool after a pool allocation failure.
This parameter trades faster response to pool allocation failures for increased system overhead. If CI or DSSI devices are not configured on your system, this parameter is ignored. |
PASANITY |
PASANITY controls whether the CI and DSSI port sanity timers are
enabled to permit remote systems to detect a system that has been hung
at IPL 8 or higher for 100 seconds. It also controls whether virtual
circuit checking is enabled on the local system. The TIMVCFAIL
parameter controls the time (1 to 99 seconds).
PASANITY is normally set to 1 and should be set to 0 only if you are debugging with XDELTA or planning to halt the CPU for periods of 100 seconds or more. PASANITY is only semidynamic. A new value of PASANITY takes effect on the next CI or DSSI port reinitialization. If CI or DSSI devices are not configured on your system, this parameter is ignored. |
PASTDGBUF |
The number of datagram receive buffers to queue initially for each CI
or DSSI port driver's configuration poller; the initial value is
expanded during system operation, if needed.
If no CI or DSSI devices are configured on your system, this parameter is ignored. |
PASTIMOUT |
The basic interval at which the CI port driver wakes up to perform
time-based bookkeeping operations. It is also the period after which a
timeout will be declared if no response to a start handshake datagram
has been received.
If no CI or DSSI device is configured on your system, this parameter is ignored. |
PRCPOLINTERVAL |
Specifies, in seconds, the polling interval used to look for SCS
applications, such as the connection manager and MSCP disks, on other
computers. Each computer is polled, at most, once each interval.
This parameter trades polling overhead against quick recognition of new computers or servers as they appear. |
SCSMAXMSG |
The maximum number of bytes of system application data in one sequenced
message. The amount of physical memory consumed by one message is
SCSMAXMSG plus the overhead for buffer management.
If an SCS port is not configured on your system, this parameter is ignored. |
SCSMAXDG |
Specifies the maximum number of bytes of application data in one
datagram.
If an SCS port is not configured on your system, this parameter is ignored. |
SCSFLOWCUSH |
Specifies the lower limit for receive buffers at which point SCS starts
to notify the remote SCS of new receive buffers. For each connection,
SCS tracks the number of receive buffers available. SCS communicates
this number to the SCS at the remote end of the connection. However,
SCS does not need to do this for each new receive buffer added.
Instead, SCS notifies the remote SCS of new receive buffers if the
number of receive buffers falls as low as the SCSFLOWCUSH value.
If an SCS port is not configured on your system, this parameter is ignored. |
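The relationship among PAMAXPORT, PANUMPOLL, and PAPOLLINTERVAL in Table A-2 comes down to simple arithmetic. The following Python sketch is an illustration only and is not part of OpenVMS. The function names are hypothetical, and the model assumes that the port driver sweeps ports 0 through PAMAXPORT in order, visiting PANUMPOLL ports in each polling interval. Under those assumptions it derives the smallest PAMAXPORT that covers the ports in use and estimates how long one full sweep takes.

```python
import math


def min_pamaxport(port_numbers):
    """Smallest PAMAXPORT that still covers every port number in use."""
    if not port_numbers:
        raise ValueError("no CI or DSSI ports in use")
    highest = max(port_numbers)
    if min(port_numbers) < 0 or highest > 15:
        raise ValueError("port numbers must be in the range 0 through 15")
    return highest


def sweep_intervals(pamaxport, panumpoll):
    """Polling intervals needed to visit ports 0..pamaxport once (assumed model)."""
    return math.ceil((pamaxport + 1) / panumpoll)


def worst_case_sweep_seconds(pamaxport, panumpoll, papollinterval):
    """Seconds for one full sweep, given the interval length in seconds."""
    return sweep_intervals(pamaxport, panumpoll) * papollinterval


# Five ports numbered 0 through 4, as in the PAMAXPORT example.
print(min_pamaxport([0, 1, 2, 3, 4]))
# Lowering PANUMPOLL from 16 to 4 spreads one sweep over more intervals.
print(sweep_intervals(15, 16), sweep_intervals(15, 4))
```

With a hypothetical polling interval of 5 seconds, `worst_case_sweep_seconds(15, 4, 5)` gives 20 seconds for a full sweep. That is the trade-off described for PANUMPOLL: less contiguous time at IPL 8 in each interval, but slower discovery of new or failed ports.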
This appendix provides guidelines for building a common user authorization file (UAF) from computer-specific files. It also describes merging RIGHTSLIST.DAT files.
For more detailed information about how to set up a computer-specific
authorization file, see the descriptions in the OpenVMS Guide to System Security.
B.1 Building a Common SYSUAF.DAT File
To build a common SYSUAF.DAT file, follow the steps in Table B-1.
Step | Action |
---|---|
1 |
Print a listing of SYSUAF.DAT on each computer. To print this listing,
invoke AUTHORIZE and specify the AUTHORIZE command LIST as follows:
$ SET DEF SYS$SYSTEM |
2 |
Use the listings to compare the accounts from each computer. On the
listings, mark any necessary changes. For example:
|
3 |
Choose the SYSUAF.DAT file from one of the computers to be a master
SYSUAF.DAT.
Note: The default values for a number of SYSUAF process limits and quotas are higher on an Alpha computer than they are on a VAX computer. See A Comparison of System Management on OpenVMS AXP and OpenVMS VAX for information about setting values on both computers. |
4 |
Merge the SYSUAF.DAT files from the other computers to the master
SYSUAF.DAT by running the Convert utility (CONVERT) on the computer
that owns the master SYSUAF.DAT. (See the OpenVMS Record Management Utilities Reference Manual for a
description of CONVERT.) To use CONVERT to merge the files, each
SYSUAF.DAT file must be accessible to the computer that is running
CONVERT.
Syntax: To merge the UAFs into the master SYSUAF.DAT
file, specify the CONVERT command in the following format:
Note that if a given user name appears in more than one source file, only the first occurrence of that name appears in the merged file.
Example: The following command sequence example
creates a new SYSUAF.DAT file from the combined contents of the two
input files:
The CONVERT command in this example adds the records from the files [SYS1.SYSEXE]SYSUAF.DAT and [SYS2.SYSEXE]SYSUAF.DAT to the file SYSUAF.DAT on the local computer. After you run CONVERT, you have a master SYSUAF.DAT that contains records from the other SYSUAF.DAT files. |
5 | Use AUTHORIZE to modify the accounts in the master SYSUAF.DAT according to the changes you marked on the initial listings of the SYSUAF.DAT files from each computer. |
6 | Place the master SYSUAF.DAT file in SYS$COMMON:[SYSEXE]. |
7 | Remove all node-specific SYSUAF.DAT files. |
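The first-occurrence rule noted in step 4 is easy to misjudge when the same user name exists on several computers. The following Python sketch models the rule with plain dictionaries. It does not read real SYSUAF.DAT files, and the function name and account data are hypothetical. It merges the sources in the order given, keeps the first record seen for each user name, and reports the names that were skipped so that you know which accounts to review in step 5.

```python
def merge_first_wins(sources):
    """Merge {username: record} mappings; the first occurrence of a name wins.

    Returns (merged, duplicates), where duplicates lists (source_index, username)
    for every record skipped because the name already existed.
    """
    merged = {}
    duplicates = []
    for index, source in enumerate(sources):
        for username, record in source.items():
            if username in merged:
                duplicates.append((index, username))
            else:
                merged[username] = record
    return merged, duplicates


# Hypothetical account data: the master file first, then the other computers.
master = {"SYSTEM": {"uic": "[1,4]"}, "SMITH": {"uic": "[200,1]"}}
sys1 = {"SMITH": {"uic": "[300,7]"}, "JONES": {"uic": "[200,2]"}}
sys2 = {"JONES": {"uic": "[310,2]"}, "LEE": {"uic": "[200,3]"}}

merged, duplicates = merge_first_wins([master, sys1, sys2])
print(sorted(merged))
print(duplicates)
```

In this sketch the SMITH record from the master file and the JONES record from the first input survive. The skipped records are the ones to compare against the listings you marked up in step 2.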
If you need to merge RIGHTSLIST.DAT files, you can use a command sequence like the following:
$ ACTIVE_RIGHTSLIST = F$PARSE("RIGHTSLIST","SYS$SYSTEM:.DAT")
$ CONVERT/SHARE/STAT 'ACTIVE_RIGHTSLIST' RIGHTSLIST.NEW
$ CONVERT/MERGE/STAT/EXCEPTION=RIGHTSLIST_DUPLICATES.DAT -
_$ [SYS1.SYSEXE]RIGHTSLIST.DAT, [SYS2.SYSEXE]RIGHTSLIST.DAT RIGHTSLIST.NEW
$ DUMP/RECORD RIGHTSLIST_DUPLICATES.DAT
$ CONVERT/NOSORT/FAST/STAT RIGHTSLIST.NEW 'ACTIVE_RIGHTSLIST'
The commands in this example add the RIGHTSLIST.DAT files from two OpenVMS Cluster computers to the master RIGHTSLIST.DAT file in the current default directory. For detailed information about creating and maintaining RIGHTSLIST.DAT files, see the security guide for your system.
This appendix contains information to help you perform troubleshooting operations for the following:
Before you initiate diagnostic procedures, be sure to verify that these conditions are met:
If, after performing preliminary checks and taking appropriate
corrective action, you find that a computer still fails to boot or to
join the cluster, you can follow the procedures in Sections
C.2 through C.4 to attempt recovery.
C.1.2 Sequence of Booting Events
To perform diagnostic and recovery procedures effectively, you must understand the events that occur when a computer boots and attempts to join the cluster. This section outlines those events and shows typical messages displayed at the console.
Note that events vary, depending on whether a computer is the first to boot in a new cluster or whether it is booting in an active cluster. Note also that some events (such as loading the cluster database containing the password and group number) occur only in OpenVMS Cluster systems on a LAN.
The normal sequence of events is shown in Table C-1.
Step | Action |
---|---|
1 |
The computer boots. If the computer is a satellite, a message like the
following shows the name and LAN address of the MOP server that has
downline loaded the satellite. At this point, the satellite has
completed communication with the MOP server and further communication
continues with the system disk server, using OpenVMS Cluster
communications.
%VAXcluster-I-SYSLOAD, system loaded from Node X...
For any booting computer, the OpenVMS "banner message" is displayed in the following format:
operating-system Version n.n dd-mmm-yyyy hh:mm.ss |
2 |
The computer attempts to form or join the cluster, and the following
message appears:
waiting to form or join an OpenVMS Cluster system
If the computer is a member of an OpenVMS Cluster based on the LAN,
the cluster security database (containing the cluster password and
group number) is loaded. Optionally, the MSCP server and TMSCP server
can be loaded:
|
3 |
If the computer discovers a cluster, the computer attempts to join it.
If a cluster is found, the connection manager displays one or more
messages in the following format:
%CNXMAN, Sending VAXcluster membership request to system X...
Otherwise, the connection manager forms the cluster when it has enough votes to establish quorum (that is, when enough voting computers have booted). |
4 |
As the booting computer joins the cluster, the connection manager
displays a message in the following format:
%CNXMAN, now a VAXcluster member -- system X...
Note that if quorum is lost while the computer is booting, or if a
computer is unable to join the cluster within 2 minutes of booting, the
connection manager displays messages like the following:
The last two messages show any connections that have already been formed. |
5 |
If the cluster includes a quorum disk, you may also see messages like
the following:
%CNXMAN, Using remote access method for quorum disk
The first message indicates that the connection manager is unable to access the quorum disk directly, either because the disk is unavailable or because it is accessed through the MSCP server. Another computer in the cluster that can access the disk directly must verify that a reliable connection to the disk exists. The second message indicates that the connection manager can access the quorum disk directly and can supply information about the status of the disk to computers that cannot access the disk directly.
Note: The connection manager may not see the quorum disk initially because the disk may not yet be configured. In that case, the connection manager first uses remote access, then switches to local access. |
6 |
Once the computer has joined the cluster, normal startup procedures
execute. One of the first functions is to start the OPCOM process:
%%%%%%%%%%% OPCOM 15-JAN-1994 16:33:55.33 %%%%%%%%%%% |
7 |
As other computers join the cluster, OPCOM displays messages like the
following:
%%%%% OPCOM 15-JAN-1994 16:34:25.23 %%%%% (from node X...) |
As startup procedures continue, various messages report startup events.
Hint: For troubleshooting purposes, you can include messages in your
site-specific startup procedures that announce each phase of the
startup process (for example, mounting disks or starting queues).
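Step 3 of Table C-1 states that the connection manager forms the cluster once enough votes are present to establish quorum. As a rough illustration, the following Python sketch applies the quorum formula given elsewhere in the OpenVMS Cluster documentation: (EXPECTED_VOTES + 2) / 2, with the fraction discarded. The function names and vote counts are hypothetical, and the sketch treats any quorum disk votes as just another entry in the list of votes present.

```python
def quorum(expected_votes):
    """Quorum derived from EXPECTED_VOTES: (EXPECTED_VOTES + 2) / 2, truncated."""
    return (expected_votes + 2) // 2


def can_form_cluster(votes_present, expected_votes):
    """True once the votes contributed by booted members reach quorum."""
    return sum(votes_present) >= quorum(expected_votes)


# Hypothetical three-node cluster, one vote per node, EXPECTED_VOTES = 3.
print(quorum(3))
print(can_form_cluster([1], 3))
print(can_form_cluster([1, 1], 3))
```

Under these assumptions a single booted voter waits, and the cluster forms when the second voter arrives. This matches the behavior in which a computer displays the waiting message until enough voting computers have booted.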
C.2 Computer on the CI Fails to Boot
If a CI computer fails to boot, perform the following checks:
Step | Action |
---|---|
1 | Verify that the computer's SCSNODE and SCSSYSTEMID parameters are unique in the cluster. If they are not, you must either alter both values or reboot all other computers. |
2 | Verify that you are using the correct bootstrap command file. This file must specify the internal bus computer number (if applicable), the HSC or HSJ node number, and the disk from which the computer is to boot. Refer to your processor-specific installation and operations guide for information about setting values in default bootstrap command procedures. |
3 | Verify that the PAMAXPORT system parameter is set to a value greater than or equal to the largest CI port number. |
4 | Verify that the CI port has a unique hardware station address. |
5 | Verify that the HSC subsystem is on line. The ONLINE switch on the HSC operator control panel should be pressed in. |
6 | Verify that the disk is available. The correct port switches on the disk's operator control panel should be pressed in. |
7 |
Verify that the computer has access to the HSC subsystem. The SHOW
HOSTS command of the HSC SETSHO utility displays status for all
computers (hosts) in the cluster. If the computer in question appears
in the display as DISABLED, use the SETSHO utility to set the computer
to the ENABLED state.
Reference: For complete information about the SETSHO utility, consult the HSC hardware documentation. |
8 |
Verify that the HSC subsystem allows access to the boot disk. Invoke
the SETSHO utility to ensure that the boot disk is available to the HSC
subsystem. The utility's SHOW DISKS command displays the current state
of all disks visible to the HSC subsystem and displays all disks in the
no-host-access table.
|
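Step 1 of the checklist above requires that SCSNODE and SCSSYSTEMID be unique in the cluster. If you keep an inventory of these values, a check like the following Python sketch can flag conflicts before a computer is booted. The function name and the inventory data are hypothetical, and the sketch does not query a running cluster.

```python
from collections import Counter


def find_conflicts(members):
    """Return SCSNODE names and SCSSYSTEMID values used by more than one member.

    members is an iterable of (scsnode, scssystemid) pairs.
    """
    members = list(members)
    node_counts = Counter(node for node, _ in members)
    id_counts = Counter(system_id for _, system_id in members)
    dup_nodes = sorted(n for n, count in node_counts.items() if count > 1)
    dup_ids = sorted(i for i, count in id_counts.items() if count > 1)
    return dup_nodes, dup_ids


# Hypothetical inventory with one repeated node name and one repeated system ID.
inventory = [("ALPHA1", 1025), ("ALPHA1", 1026), ("BETA", 1025), ("GAMMA", 1030)]
dup_nodes, dup_ids = find_conflicts(inventory)
print(dup_nodes)
print(dup_ids)
```

Any name or ID reported here corresponds to the situation in step 1, where you must either alter both values on the booting computer or reboot all other computers.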