HP TCP/IP Services for OpenVMS
Management

5.2.3 Creating and Displaying Home Interfaces

failSAFE IP addresses can be created with a designated home interface. By default, all primary IP addresses are created with a home interface. A home interface provides a preferential failover and recovery target in an effort to always migrate IP addresses to their home interface, thereby limiting the disruption to users.

You can use the ifconfig utility to create and display addresses configured with home interfaces. For example, to create three addresses, enter the following commands:

$ ifconfig ie0 10.10.10.1 ! primary has home interface by default $ ifconfig ie0 alias 10.10.10.2 ! alias does not $ ifconfig ie0 home alias 10.10.10.3 ! create alias with home interface

Although the TCP/IP management command SET INTERFACE can be used to create primary and alias addresses, it does not allow you to create the home alias address. You must use the ifconfig utility to do this.

When addresses are displayed by the ifconfig utility, those addresses with a home interface are marked with an asterisk (*). For example:

$ ifconfig ie0 IE0: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX> *inet 10.10.10.1 netmask ff000000 broadcast 10.255.255.255 inet 10.10.10.2 netmask ff000000 broadcast 10.255.255.255 *inet 10.10.10.3 netmask ff000000 broadcast 10.255.255.255

The asterisk indicates that the addresses 10.10.10.1 and 10.10.10.3 have a home interface of IE0.

Note

The TCP/IP management command SHOW INTERFACE does not identify addresses with a home interface.

Creating IP addresses with home interfaces spreads the IP addresses across multiple interfaces. This is useful for load-balancing and gaining higher aggregate throughput. If a home interface recovers after a failure, the addresses may return to their recovered home interface, thus maintaining the spread of addresses across the available interfaces.

Note

The IP address will not migrate toward a home interface while that address has active connections.

5.3 Managing failSAFE IP

The failSAFE IP service monitors the state of interfaces and, upon detecting a failure or recovery, takes the appropriate action. To start and stop the failSAFE IP service, run the following command procedures:

SYS$STARTUP:TCPIP$FAILSAFE_STARTUP.COM
SYS$STARTUP:TCPIP$FAILSAFE_SHUTDOWN.COM

The failSAFE IP service performs the following actions:

Monitors the state of interfaces by periodically reading their Bytes Received counter.
When required, marks an interface as failed or recovered.
Maintains static routes to ensure they are preserved after interface failure or recovery.
Logs all messages to TCPIP$FAILSAFE_RUN.LOG. Important events are additionally sent to OPCOM.
Invokes a site-specific command procedure. For more information about the site-specific command procedures, see Section 22.1.1.
Generates traffic to help avoid phantom failures, as described in Section 5.3.5.3.

If the failSAFE IP service is not enabled, configuring a failSAFE IP address across nodes provides identical functionality to the IP cluster alias, as described in Section 1.4.

5.3.1 failSAFE IP Logical Names

You can use logical names to customize the operating environment of failSAFE IP. The logical names must be defined in the LNM$SYSTEM_TABLE for them to take effect.

Table 5-3 describes the failSAFE IP logical names.

Table 5-3 failSAFE IP Logical Names
Logical Name Description

TCPIP$FAILSAFE Specifies the configuration file that is read by TCPIP$FAILSAFE during startup. This logical must be defined prior to starting the failSAFE IP service. The default file specification is
SYS$SYSDEVICE:[TCPIP$FSAFE] -
TCPIP$FAILSAFE.CONF.

TCPIP$FAILSAFE_FAILED_ ifname Simulates a failure for the named interface ( ifname). This logical name is translated each time failSAFE IP reads the LAN counters.
To determine the interface name, use the TCP/IP management command SHOW INTERFACE.

TCPIP$SYFAILSAFE Specifies the name of a site-specific command procedure that is invoked when one of three conditions occurs: interface failure, retry failure, or interface recovery. The default file specification is SYS$MANAGER:TCPIP$SYFAILSAFE.COM.

TCPIP$FAILSAFE_LOG_LEVEL Controls the volume of log messages sent to OPCOM and the log file. This logical is translated each time failSAFE IP logs a message. The default value is 0.

TCPIP$FSACP_LOG_LEVEL Controls the volume of log messages sent to OPCOM by the ACP. This logical should be used only when directed by customer support. The default value is 0.

**Table 5-3 failSAFE IP Logical Names**
Logical Name	Description
TCPIP$FAILSAFE	Specifies the configuration file that is read by TCPIP$FAILSAFE during startup. This logical must be defined prior to starting the failSAFE IP service. The default file specification is SYS$SYSDEVICE:[TCPIP$FSAFE] - TCPIP$FAILSAFE.CONF.
TCPIP$FAILSAFE_FAILED_ ifname	Simulates a failure for the named interface ( ifname). This logical name is translated each time failSAFE IP reads the LAN counters. To determine the interface name, use the TCP/IP management command SHOW INTERFACE.
TCPIP$SYFAILSAFE	Specifies the name of a site-specific command procedure that is invoked when one of three conditions occurs: interface failure, retry failure, or interface recovery. The default file specification is SYS$MANAGER:TCPIP$SYFAILSAFE.COM.
TCPIP$FAILSAFE_LOG_LEVEL	Controls the volume of log messages sent to OPCOM and the log file. This logical is translated each time failSAFE IP logs a message. The default value is 0.
TCPIP$FSACP_LOG_LEVEL	Controls the volume of log messages sent to OPCOM by the ACP. This logical should be used only when directed by customer support. The default value is 0.

5.3.2 Customizing failSAFE IP

You can create a site-specific command procedure to be invoked under specified circumstances, such as when an interface fails. You can customize the command procedure to handle the following circumstances:

When the interface first appears to have stopped responding. This is the first warning that a problem may exist, but no action to failover IP addresses has been taken yet.
When an attempt to generate traffic on the interface fails. After the retry limit is reached, the interface is deemed as malfunctioning, and IP addresses are removed from the interface. Failover occurs.
When the interface recovers.

The default site-specific command procedure is:

SYS$MANAGER:TCPIP$SYFAILSAFE.COM

To modify the location or file name, define the logical name TCPIP$SYFAILSAFE.

Use the following text strings as parameters to the command procedure:

P1 is the interface name (for example, IE0)
P2 is the state. The states are:
- INFO_STATE
- WARN_STATE
- ERROR_STATE

The TCPIP$SYFAILSAFE procedure is invoked by the TCPIP$FSAFE account, which by default has minimum privileges and quotas. It is necessary to ensure the TCPIP$SYFAILSAFE procedure is both readable and executable by the TCPIP$FSAFE account. In addition, the TCPIP$FSAFE account may require additional quotas and privileges so that it can execute all the commands contained within the TCPIP$SYFAILSAFE procedure.

5.3.3 Reestablishing Static and Dynamic Routing

When an interface fails, failSAFE IP removes all addresses and static routes from the failed interface. The static routes are reestablished on every interface where the route's network is reachable. This action can result in the creation of a static route on multiple interfaces and is most often observed with the default route.

You may need to restart dynamic routing to ensure that the dynamic routing protocol remains current with changes in the interface availability. If this is necessary, restart the routing process using the following TCP/IP management commands:

TCPIP> STOP ROUTING /GATED TCPIP> START ROUTING /GATED

For GATED, failSAFE IP can be configured to scan the interfaces periodically for any changes. Use the GATED configuration option scaninterval . You can scan the interfaces manually using the following TCP/IP management command:

$ TCPIP SET GATED/CHECK_INTERFACES

For more information about routing protocols, see Chapter 4.

5.3.4 Displaying the Status of Interfaces

The failSAFE IP service periodically reads the network interface card (NIC) Bytes Received counter to determine the status of an interface. You can display the Bytes Received counter using the LANCP utility. For example, to view the Bytes Received counters for all interfaces, enter the following command:

$ pipe mcr lancp show device/count | search sys$pipe "Bytes received"/exact

The types of events that prevents the Bytes Received counter from changing include:

Failing interface hardware
Disconnected physical link
Shutting the interface down using TCP/IP management commands
Shutting down TCP/IP Services
Shutting down a node

5.3.5 Guidelines for Configuring failSAFE IP

This section describes guidelines that can help avoid common pitfalls in configuring failSAFE IP.

5.3.5.1 Validating failSAFE IP

Most contemporary networks are highly stable and rarely suffer from the problems that require failSAFE IP. Consequently, on the few occasions where failSAFE IP is required, it is critical that the service be validated in the environment where it is being deployed. Failure to do this can result in unexpected problems at the critical moment.

Since real failures are rare and sometimes difficult to simulate, the logical name TCPIP$FAILSAFE_FAILED_ifname is provided. After configuring failSAFE IP addresses and starting the failSAFE IP service, validate the configuration using the following procedure:

Establish connections and generate IP traffic.
Using TELNET or FTP, create incoming and outgoing TCP connections to the multihomed host from inside and outside the subnet. Verify that these connections are established by using the following commands:

$ @SYS$MANAGER:TCPIP$DEFINE_COMMANDS.COM $ ifconfig -a ! Check the interface addresses $ netstat -nr ! Check the routing table $ netstat -n ! identify which interfaces are being used

Simulate a failure and observe.
Simulate a failure and observe OPCOM and log file messages. Use the following command to simulate a failure:
$ DEFINE/SYSTEM TCPIP$FAILSAFE_FAILED_ifname 1 ! or disconnect the cable
Wait long enough for failover to occur, which is signaled by OPCOM messages. Then observe the effects of failover and verify that TCP connections are still established and can transfer data. For example, TELNET sessions should respond to keyboard input.
Use the following commands to see how IP addresses and routing have changed:
$ ifconfig -a ! Observe how the addresses have migrated $ netstat -nr ! Observe how the routing table has changed

Recover from the simulated failure and observe the OPCOM messages.

$ DEASSIGN/SYSTEM TCPIP$FAILSAFE_FAILED_ifname ! or reconnect the cable $ ifconfig -a ! Observe how the addresses have migrated $ netstat -nr ! Observe how the routing table has changed

Again, ensure that TCP connections are still established and can transfer data.

Note

Simulating a failure with the logical name TCPIP$FAILSAFE_FAILED_ifname does not disrupt physical connections and therefore is not an accurate indicator of whether the services will survive a real failover situation. Consequently, this procedure should be repeated by physically removing a network cable from one or more of the interfaces. Since this action might be disruptive to network services, it should be scheduled during a maintenance period, when disruption can be tolerated.

5.3.5.2 Configuring Failover Time

The key concern for configuring the failSAFE IP service is the time it takes to detect a failure and for the standby IP address to become active. One goal of a failSAFE IP configuration is to avoid disrupting existing connections, so the failover time must be within the connection timeout.

The minimum failover time is calculated as:

INFO_POLL + (WARN_POLL * RETRY)

The maximum failover time is calculated as:

(2 * INFO_POLL) + (WARN_POLL * RETRY)

For explanations of the variables, see Table 5-2. The default values (INFO_POLL=3, WARN_POLL=2, RETRY=1) result in a failover range of 5 to 8 seconds. Note that this does not take into account the system load.

The recovery time will be less than the ERROR_POLL period, which is, by default, 30 seconds.

5.3.5.3 Avoiding Phantom Failures

The health of a NIC is determined by monitoring the NIC's Bytes Received counter. This provides a protocol-independent view of the NIC counters. However, in a quiet network, there may be insufficient traffic to keep the Bytes Received counter changing within the failover detection time, thus causing a phantom failure. To counteract this, the failSAFE service attempts to generate MAC-layer broadcast messages, which are received on every interface on the LAN except for the sending interface.

Consequently, in a quiet network with only two interfaces being monitored by the failSAFE IP service, a single NIC failure can also result in a phantom failure of the other NIC, since the surviving NIC is not able to increase its own Bytes Received counter.

You can reduce phantom failures in a quiet network by configuring the failSAFE IP service for at least three interfaces on the LAN. If one interface fails, the surviving interfaces continue to maintain one another's Bytes Received counters.

5.3.5.4 Creating IP Addresses with Home Interfaces

By default, the interface on which a primary IP address is created is its home interface, whereas an IP alias address is created without a home interface. To create an alias address with a home interface, use the ifconfig command, which should be added to the SYS$STARTUP:TCPIP$SYSTARTUP.COM procedure. For example, use the following command to create an alias address of 10.10.10.3 on interface IE0 and to designate IE0 as its home interface:

$ ifconfig ie0 home alias 10.10.10.3/24

5.3.5.5 Private Addresses Should Not Have Clusterwide Standby Interfaces

Private addresses are those that are used for network administration and are not published as well-known addresses for well-known services. A standby interface for a private address should be configured on the same node as the home interface. This avoids a situation in which a node cannot assign any addresses to its interfaces if they have active connections on another node in the cluster.

If you want to associate the list of private addresses with a public DNS alias name, you should use the load broker to provide high availability of the DNS alias. The load broker is described in Chapter 7.

Part 2
BIND

Part 2 provides information on configuring and managing the TCP/IP Services name server and includes the following chapters:

Chapter 6, Configuring and Managing BIND Version 9, describes how to configure and manage the TCP/IP Services implementation of the Berkeley Internet Name Domain (BIND) software. Note that BIND Version 9 is not supported on VAX systems. For information about configuring BIND Version 8 (which runs on both VAX and Alpha systems), see Chapter 6.
Chapter 7, Using DNS to Balance Work Load, describes how to use BIND's round-robin scheduling or the load broker for cluster load balancing.

Chapter 6
Configuring and Managing BIND Version 9

The Domain Name System (DNS) maintains and distributes information about Internet hosts. DNS consists of a hierarchical database containing the names of entities on the Internet, the rules for delegating authority over names, and mail routing information; and the system implementation that maps the names to Internet addresses.

In OpenVMS environments, DNS is implemented by the Berkeley Internet Name Domain (BIND) software. HP TCP/IP Services for OpenVMS implements a BIND server based on the Internet Software Consortium's (ISC) BIND Version 9.

Note

The BIND Version 9 server is not supported on VAX systems. The BIND Version 8 server runs on both VAX and Alpha systems.

Note

In this version of TCP/IP Services, the BIND Server and related utilities have been updated to use the OpenSSL shareable image SSL$LIBCRYPTO_SHR32.EXE. There is now a requirement that this shareable image from OpenSSL V1.2 or higher be installed on the system prior to starting the BIND Server.

This chapter contains the following topics:

How to migrate your existing BIND 4 environment to BIND 9 ( Section 6.3)
How to configure BIND using the BIND configuration file ( Section 6.5), including:
- How to configure dynamic updates ( Section 6.5.7)
- How to configure a DNS cluster failover and redundancy environment ( Section 6.5.8)
How to populate the BIND server databases ( Section 6.6)
How to examine name server statistics ( Section 6.7)
How to configure BIND using SET CONFIGURATION BIND commands ( Section 6.8)
How to configure the BIND resolver ( Section 6.9)
How to use the BIND server administrative tools ( Section 6.10)
How to troubleshoot BIND server problems ( Section 6.12)

6.1 Key Concepts

This section serves as a review only and assumes you are acquainted with the InterNIC, that you applied for an IP address, and that you registered your domain name. You should also be familiar with BIND terminology, and you should have completed your preconfiguration planning before using this chapter to configure and manage the BIND software.

If you are not familiar with DNS and BIND, see the HP TCP/IP Services for OpenVMS Concepts and Planning guide. If you need more in-depth knowledge, see O'Reilly's DNS and BIND, Fourth Edition. You can find the BIND 9 Adminstrator Reference Manual at http://www.isc.org/ .

6.1.1 How the Resolver and Name Server Work Together

BIND is divided conceptually into two components: a resolver and a name server. The resolver is software that queries a name server; the name server is the software process that responds to a resolver query.

Under BIND, all computers use resolver code, but not all computers run the name server process.

The BIND name server runs as a distinct process called TCPIP$BIND. On UNIX systems, the name server is called named (pronounced name-dee). Name servers are typically classified as master (previously called primary), slave (previously called secondary), and caching-only servers, depending on their configurations.

6.1.2 Common BIND Configurations

You can configure BIND in several different ways. The most common configurations are resolver-only systems, master servers, slave servers, forwarder servers, and caching-only servers. A server can be any of these configurations or can combine elements of these configurations.

Servers use a group of database files containing BIND statements and resource records. These files include:

The forward translation file, domain_name.DB
This file maps host names to IP addresses.
The reverse translation file, address.DB
This file maps the address back to the host names. This address name lookup is called reverse mapping. Each domain has its own reverse mapping file.
Local loopback forward and reverse translation files, LOCALHOST.DB, 127_0_0.DB, and 0_0_0_0_0_0_IP6.ARPA (for IPv6).
These local host databases provide forward and reverse translation for the widely used LOCALHOST name. The LOCALHOST name is always associated with IPv4 address 127.0.0.1 and IPv6 address ::1, and is used for loopback traffic.
The hint file, ROOT.HINT
This file contains the list of root name servers.

A configuration file, TCPIP$BIND.CONF, contains statements that pull all the database files together and governs the behavior of the BIND server.

6.1.2.1 Master Servers

A master server is the server from which all data about a domain is derived. Master servers are authoritative, which means they have complete information about their domain and that their responses are always accurate.

To provide central control of host name information, the master server loads the domain's information directly from a disk file created by the domain administrator. When a new system is added to the network, only the database on the master server needs to be modified.

A master server requires a complete set of configuration files: forward translation, reverse translation, configuration, hint, and loopback files.

6.1.2.2 Slave Servers

Slave servers receive authority and their database from the master server.

A particular domain's database file is called a zone file; copying this file to a slave server is called a zone file transfer . A slave server assures that it has current information about a domain by periodically transferring the domain's zone file from the master. Slave servers are also authoritative for their domains.

Configuring a slave server is similar to configuring a master server. The only difference is that, for the slave server, you need to provide the name of the master server from which to transfer zone data.

Note

If you create a master, slave, or forwarder server for the same domain on which your local host resides, you should reconfigure your BIND resolver so that it uses this system (LOCALHOST) as its name server.

Slave servers require a configuration file, a hint file, and loopback files.

Contents

Index

HP TCP/IP Services for OpenVMSManagement

5.2.3 Creating and Displaying Home Interfaces

5.3 Managing failSAFE IP

5.3.5 Guidelines for Configuring failSAFE IP

5.3.5.2 Configuring Failover Time

Part 2BIND

Chapter 6Configuring and Managing BIND Version 9

6.1 Key Concepts

HP TCP/IP Services for OpenVMS
Management

Part 2
BIND

Chapter 6
Configuring and Managing BIND Version 9