Guidelines for OpenVMS Cluster Configurations




D.3.4 Guidelines for Configuring ATM and DS3 in an OpenVMS Cluster System

When configuring a multiple-site OpenVMS Cluster, you must ensure that the intersite link's delay, bandwidth, availability, and bit error ratio characteristics meet application needs. This section describes the requirements and provides recommendations for meeting those requirements.

D.3.4.1 Requirements

For a multiple-site OpenVMS Cluster configuration to be approved by HP, it must comply with the following rules:

Maximum intersite link route distance: The total intersite link cable route distance between members of a multiple-site OpenVMS Cluster cannot exceed 150 miles (242 km). You can obtain exact distance measurements from your ATM or DS3 supplier.

This distance restriction can be exceeded when using Disaster Tolerant Cluster Services for OpenVMS, a system management and software package for configuring and managing OpenVMS disaster tolerant clusters.

Maximum intersite link utilization: Average intersite link utilization in either direction must be less than 80% of the link's bandwidth in that direction for any 10-second interval. Exceeding this utilization is likely to result in intolerable queuing delays or packet loss.

Intersite link specifications: The intersite link must meet the OpenVMS Cluster requirements specified in Table D-3.

OpenVMS Cluster LAN configuration rules: Apply the configuration rules for OpenVMS Cluster systems on a LAN to the multiple-site configuration. Documents describing these configuration rules are referenced in Section D.1.3.

D.3.4.2 Recommendations

When configuring the DS3 interconnect, apply the configuration guidelines for OpenVMS Cluster systems interconnected by LAN that are stated in the OpenVMS Cluster Software SPD (SPD 29.78.nn) and in this manual. OpenVMS Cluster members at each site can include any mix of satellites, systems, and other interconnects, such as CI and DSSI.

This section provides additional recommendations for configuring a multiple-site OpenVMS Cluster system.

DS3 link capacity/protocols

The GIGAswitch with the WAN T3/SONET option card provides a full-duplex, 155 Mb/s ATM/SONET link. However, the GIGAswitch/FDDI's internal design is based on full-duplex extensions to FDDI, which limits the ATM/SONET link's capacity to 100 Mb/s in each direction.

The GIGAswitch with the WAN T3/SONET option card provides several protocol options that can be used over a DS3 link. Use the DS3 link in clear channel mode, which dedicates its entire bandwidth to the WAN option card. The DS3 link capacity varies with the protocol option selected. Protocol options are described in Table D-1.

Table D-1 DS3 Protocol Options

Protocol Option                            | Link Capacity
ATM(1) AAL-5(2) mode with PLCP(3) disabled | 39 Mb/s
ATM AAL-5 mode with PLCP enabled           | 33 Mb/s
HDLC(4) mode (not currently available)     | 43 Mb/s

(1) Asynchronous transfer mode
(2) ATM Adaptation Layer
(3) Physical Layer Convergence Protocol
(4) High-level Data Link Control

For maximum link capacity, HP recommends configuring the WAN T3/SONET option card to use ATM AAL-5 mode with PLCP disabled.

Intersite bandwidth

The intersite bandwidth can limit application locking and I/O performance (including volume shadowing or RAID set copy times) and the performance of the lock manager.

To promote reasonable response time, HP recommends that average traffic in either direction over an intersite link not exceed 60% of the link's bandwidth in that direction for any 10-second interval. Otherwise, queuing delays within the FDDI-to-WAN bridges can adversely affect application performance.

Remember to account for both OpenVMS Cluster communications (such as locking and I/O) and network communications (such as TCP/IP, LAT, and DECnet) when calculating link utilization.
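To make the utilization rules concrete, the following Python sketch (illustrative only; the 39 Mb/s ATM AAL-5 capacity and the sample byte count are assumptions, and interval measurements would come from your own monitoring tools) checks a 10-second traffic sample against the 80% requirement and the 60% recommendation:

# Check one direction of an intersite link against the utilization
# thresholds in this section: <80% (requirement), <60% (recommendation).
LINK_CAPACITY_BPS = 39e6          # assumed: ATM AAL-5 mode, PLCP disabled

def utilization_pct(bytes_sent, interval_s=10.0):
    """Percent of link bandwidth used in one direction over the interval."""
    return 100.0 * (bytes_sent * 8) / (LINK_CAPACITY_BPS * interval_s)

u = utilization_pct(20_000_000)   # example: 20 MB sent in 10 seconds
print(f"utilization = {u:.1f}%")              # 41.0%
print("meets 80% requirement:", u < 80.0)     # True
print("meets 60% recommendation:", u < 60.0)  # True

The byte count must include OpenVMS Cluster traffic and general network traffic combined, as noted above.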

Intersite delay

An intersite link introduces a one-way delay of up to 1 ms per 100 miles of intersite cable route distance plus the delays through the FDDI-to-WAN bridges at each end. HP recommends that you consider the effects of intersite delays on application response time and throughput.

For example, intersite link one-way path delays have the following components:

• Cable propagation delay: approximately 0.01 ms per mile (1 ms per 100 miles) of intersite cable route distance
• FDDI-to-WAN bridge delay: approximately 0.5 ms through each bridge

Calculate the delays for a round trip as follows:

WAN round-trip delay = 2 x (N miles x 0.01 ms per mile + 2 x 0.5 ms per FDDI-WAN bridge)

An I/O write operation that is MSCP served requires a minimum of two round-trip packet exchanges:

WAN I/O write delay = 2 x WAN round-trip delay

Thus, an I/O write over a 100-mile WAN link takes at least 8 ms longer than the same I/O write over a short, local FDDI.

Similarly, a lock operation typically requires a round-trip exchange of packets:

WAN lock operation delay = WAN round-trip delay

An I/O operation with N locks to synchronize it incurs the following delay due to WAN:

WAN locked I/O operation delay = (N x WAN lock operation delay) + WAN I/O delay
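These formulas are easy to evaluate for a proposed configuration. The following Python sketch (a back-of-envelope aid, using the 0.01 ms/mile and 0.5 ms/bridge figures from this section) reproduces the 100-mile example:

def wan_round_trip_ms(miles, bridges_per_path=2,
                      ms_per_mile=0.01, ms_per_bridge=0.5):
    # Round trip = 2 x (cable delay + delay through each FDDI-to-WAN bridge).
    return 2 * (miles * ms_per_mile + bridges_per_path * ms_per_bridge)

def wan_io_write_delay_ms(miles):
    # An MSCP-served I/O write requires at least two round-trip exchanges.
    return 2 * wan_round_trip_ms(miles)

def wan_locked_io_delay_ms(miles, n_locks):
    # Each synchronizing lock operation adds one round trip.
    return n_locks * wan_round_trip_ms(miles) + wan_io_write_delay_ms(miles)

print(wan_round_trip_ms(100))          # 4.0 ms
print(wan_io_write_delay_ms(100))      # 8.0 ms, matching the example above
print(wan_locked_io_delay_ms(100, 2))  # 16.0 ms for an I/O guarded by 2 locks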

Bit error ratio

The bit error ratio (BER) parameter is an important measure of how frequently bit errors are likely to occur on the intersite link. Consider the effects of bit errors on application throughput and responsiveness when configuring a multiple-site OpenVMS Cluster. Intersite link bit errors can result in packets being lost and retransmitted, with consequent delays in application I/O response time (see Section D.3.6). You can expect application delays ranging from a few hundred milliseconds to a few seconds each time a bit error causes a packet to be lost.
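As a rough way to translate a BER figure into an expected error frequency, the following sketch (an estimate only; it assumes independent, randomly distributed bit errors and a steady offered load, whereas real links often see burst errors) computes the mean interval between errored bits:

def mean_seconds_between_errored_bits(ber, offered_load_bps):
    # Expected errored bits per second = BER x bits actually transmitted.
    return 1.0 / (ber * offered_load_bps)

load = 0.60 * 39e6   # assumed: ATM AAL-5 DS3 link at the 60% load ceiling
print(mean_seconds_between_errored_bits(1e-9, load))   # ~43 s (BER requirement)
print(mean_seconds_between_errored_bits(6e-12, load))  # ~7100 s (BER goal)

Not every errored bit lands inside an OpenVMS Cluster packet, so observed packet-loss pauses are typically less frequent than these intervals suggest.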

Intersite link availability

Interruptions of intersite link service can result in the resources at one or more sites becoming unavailable until connectivity is restored (see Section D.3.5).

System disks

Sites with nodes contributing quorum votes should have a local system disk or disks for those nodes.

System management

A large, multiple-site OpenVMS Cluster requires a system management staff trained to support an environment that consists of a large number of diverse systems that are used by many people performing varied tasks.

Microwave DS3 links

You can provide portions of a DS3 link with microwave radio equipment. The specifications in Section D.3.6 apply to any DS3 link. The BER and availability of microwave radio portions of a DS3 link are affected by local weather and the length of the microwave portion of the link. Consider working with a microwave consultant who is familiar with your local environment if you plan to use microwaves as portions of a DS3 link.

D.3.5 Availability Considerations

If the FDDI-to-WAN bridges and the link that connects multiple sites become temporarily unavailable, the following events could occur:

Many communication service carriers offer availability-enhancing options, such as path diversity, protective switching, and other options that can significantly increase the intersite link's availability.

D.3.6 Specifications

This section describes the requirements for successful communications and performance with the WAN communications services.

To assist you in communicating your requirements to a WAN service supplier, this section uses WAN specification terminology and definitions commonly used by telecommunications service providers. These requirements and goals are derived from a combination of Bellcore Communications Research specifications and a Digital analysis of error effects on OpenVMS Clusters.

Table D-2 describes terminology that will help you understand the Bellcore and OpenVMS Cluster requirements and goals used in Table D-3.

Use the Bellcore and OpenVMS Cluster requirements for ATM/SONET - OC3 and DS3 service error performance (quality) specified in Table D-3 to help you assess the impact of the service supplier's service quality, availability, down time, and service-interruption frequency goals on the system.

Note

To ensure that the OpenVMS Cluster system meets your application response-time requirements, you might need to establish WAN requirements that exceed the Bellcore and OpenVMS Cluster requirements and goals stated in Table D-3.

Table D-2 Bellcore and OpenVMS Cluster Requirements and Goals Terminology

Bellcore Communications Research

Requirements: Bellcore specifications are the recommended "generic error performance requirements and objectives" documented in the Bellcore Technical Reference TR-TSY-000499, TSGR: Common Requirements. These specifications are adopted by WAN suppliers as their service guarantees. The FCC has also adopted them for tariffed services between common carriers. However, some suppliers will contract to provide higher service-quality guarantees at customer request. Other countries have equivalents to the Bellcore specifications and parameters.

Goals: These are the recommended minimum values. Bellcore calls these goals their "objectives" in the TSGR: Common Requirements document.

OpenVMS Cluster

Requirements: In order for HP to approve a configuration, parameters must meet or exceed the values shown in the OpenVMS Cluster Requirements column in Table D-3. If these values are not met, OpenVMS Cluster performance will probably be unsatisfactory because of interconnect errors, error-recovery delays, and virtual circuit (VC) closures that can produce OpenVMS Cluster state transitions, site failover, or both. If these values are met or exceeded, interconnect bit error-related recovery delays will not significantly degrade average OpenVMS Cluster throughput, and OpenVMS Cluster response time should be generally satisfactory. Note that if the requirements are only just met, there may be several application pauses per hour.(1)

Goals: For optimal OpenVMS Cluster operation, all parameters should meet or exceed the OpenVMS Cluster Goal values. If these values are met or exceeded, interconnect bit errors and bit error recovery delays should not significantly degrade average OpenVMS Cluster throughput. OpenVMS Cluster response time should be generally satisfactory, although there may be brief application pauses a few times per day.(2)

(1) Application pauses may occur every hour or so because of packet loss caused by bit errors.
(2) Pauses are due to a virtual circuit retransmit timeout resulting from a lost packet on one or more NISCA transport virtual circuits. Each pause might last from a few hundred milliseconds to a few seconds.

Table D-3 OpenVMS Cluster DS3 and SONET OC3 Error Performance Requirements

Parameter                      | Bellcore Requirement | Bellcore Goal | OpenVMS Cluster Requirement(1) | OpenVMS Cluster Goal(1) | Units
Errored seconds (% ES)         | <1.0%  | <0.4%      | <1.0%     | <0.028%       | % ES/24 hr
  (expressed as a count of ES) | <864   | <345       | <864      | <24           | ES per 24-hr period
Burst errored seconds (BES)(2) | <= 4   | --         | <= 4      | Bellcore Goal | BES/day
Bit error ratio (BER)(3)       | 1 x 10^-9 | 2 x 10^-10 | 1 x 10^-9 | 6 x 10^-12  | Errored bits/bit
DS3 channel unavailability     | None | <= 97 @ 250 miles, linearly decreasing to 24 @ <= 25 miles | None | Bellcore Goal | Min/yr
SONET channel unavailability   | None | <= 105 @ 250 miles, linearly decreasing to 21 @ <= 50 miles | None | Bellcore Goal | Min/yr
Channel-unavailable event(4)   | None | None       | None      | 1 to 2        | Events/year

(1) Application requirements might need to be more rigorous than those shown in the OpenVMS Cluster Requirements column.
(2) Averaged over many days.
(3) Does not include any burst errored seconds occurring in the measurement period.
(4) The average number of channel down-time periods occurring during a year. This parameter is useful for specifying how often a channel might become unavailable.
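The count form of the ES parameter follows directly from the percentage form, since a 24-hour period contains 86,400 seconds. A quick Python check (the results match the table above):

def es_count_per_day(pct_es):
    # Convert % errored seconds per 24 hours into a count of errored seconds.
    return int(pct_es / 100.0 * 86_400)

print(es_count_per_day(1.0))    # 864 (requirement)
print(es_count_per_day(0.4))    # 345 (Bellcore goal)
print(es_count_per_day(0.028))  # 24  (OpenVMS Cluster goal)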


D.4 Managing OpenVMS Cluster Systems Across Multiple Sites

In general, you manage a multiple-site OpenVMS Cluster using the same tools and techniques that you would use for any OpenVMS Cluster interconnected by a LAN. The following sections describe some additional considerations and recommend some system management tools and techniques.

The following table lists system management considerations specific to multiple-site OpenVMS Cluster systems:

Problem: Multiple-site configurations present an increased probability of the following failure modes:
  • OpenVMS Cluster quorum loss resulting from site-to-site communication link failure.
  • Site loss resulting from a power failure or other breakdown that affects all systems at that site.
Possible solution: Assign votes so that one preferred site has sufficient votes to maintain quorum and to continue operation if the site-to-site communication link fails or if the other site is unavailable (see the sketch following this table). Select the site with the most critical applications as the primary site. Sites with a few noncritical systems or satellites probably should not have sufficient votes to continue.

Problem: Users expect that local resources will either continue to be available or will rapidly become available again after such a failure. This might not always be the case.
Possible solution: Consider the following options for setting user expectations:
  • Set management and user expectations regarding the likely effects of failures, and consider training remote users in the procedures to follow when the system becomes unresponsive because of quorum loss or other problems.
  • Develop management policies and procedures for identifying and handling these failure modes. These procedures may include manually adjusting quorum to allow a site to continue.
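One way to sanity-check a proposed vote assignment is to compute the cluster quorum directly. The following Python sketch uses the standard OpenVMS quorum formula, quorum = (EXPECTED_VOTES + 2) / 2 with integer division; the node names and vote counts are hypothetical:

def quorum(expected_votes):
    # OpenVMS computes quorum as (EXPECTED_VOTES + 2) / 2, truncated.
    return (expected_votes + 2) // 2

# Hypothetical two-site cluster: the primary site holds all the votes.
votes = {"PRIM01": 1, "PRIM02": 1, "PRIM03": 1,  # primary site
         "RMT01": 0, "RMT02": 0}                 # remote satellites

expected = sum(votes.values())   # EXPECTED_VOTES = 3
q = quorum(expected)             # quorum = 2
primary = votes["PRIM01"] + votes["PRIM02"] + votes["PRIM03"]
print("quorum =", q)
print("primary site survives link failure:", primary >= q)  # True
print("remote site survives link failure:", 0 >= q)         # False

With this assignment, the primary site continues operation after an intersite link failure, while the remote site waits for quorum, matching the behavior recommended above.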

D.4.1 Methods and Tools

You can use the following system management methods and tools to manage both remote and local nodes:

D.4.2 Shadowing Data

Volume Shadowing for OpenVMS allows you to shadow data volumes across multiple sites. System disks can be members of a volume shadowing or RAID set within a site; however, use caution when configuring system disk shadow set members across multiple sites, because after a failure it may be necessary to boot from a system disk shadow set member at the remote site. If your system does not support booting over FDDI, this will not be possible.

See the Software Product Descriptions (SPDs) for complete and up-to-date details about Volume Shadowing for OpenVMS (SPD 27.29.xx) and StorageWorks RAID for OpenVMS (SPD 46.49.xx).

D.4.3 Monitoring Performance

Monitor performance for multiple-site OpenVMS Cluster systems as follows:

