EMU User Guide
Previous page...
| Contents
3.3.7 Network menu
Using the items in this menu, an overview of the various networks
can be obtained. This section is under construction (hard hats must
be worn at all times and is closed to the public).
3.3.8 System menu
This section is used to monitor and debug the system. Each menu
item in the system menu shows a specific aspect of the system.
Most of these tools will not be useful to a normal user but may be
helpful to the system manager should problems occur on your
system. Many of the tools presented here were used as
development tools and remain useful in quickly isolating problems
either within EMU or demonstrating that problems are external.
3.3.8.1 Monitor Listener
The main Listener and PSR interface is presented
as a continuously updated real-time monitor. The display appears
as follows:
CCCI Ltd 1993 Ethernet Monitor Listener Display
System:
Event flags = 80000000 01FE0000
Queued Lowest Error Cmplt IOSB Err Processed
28 0 72694 15391978 0 50480
Frame errors: Mcast Src Zero Src Zero Dest Zero PTY
0 0 0 9350
Name Status Limit Sent Disc In Proc High
RECORD 00000000 20 0 0 0 0
ETHNET 00000001 16 4716223 219803 1 16
IP 00000005 15 1224266 24677 5 15
MOP 00000005 11 60083 0 0 8
DECNET 00000005 12 726942 1462 1 12
LAT 00000005 10 156502 0 0 6
ARP 00000007 10 1224261 5 0 10
LAVC 00000005 13 409631 0 1 7
IPX 00000005 11 844851 4964 0 11
BRIDGE 00000005 4 156261 0 0 3
OSI 00000005 8 678413 187 0 8
The display is interpreted as follows:
-
Event flags are flags used throughout the system to signal events.
Normal operation has only the top flag permanently set and on
occasion you will notice some flags changing rapidly. If any flag
other than the highest on remains set (not 0) it usually indicates
the process associated with this flag is no longer responding.
Report this to software support.
- Queued is the number of buffers queued to Ethernet. As long as
this remains above 0 all is well. If it reaches 0 and remains
there there is a serious problem with the system. Report this to
software support.
- Lowest is the lowest number of buffers queued at any 1 time. This
will normally start at the number queued and progress towards 0 as
the system uses resources.
- Error. The number of Ethernet read errors. This should always be
a
very low number (less than .001% of completions. Note there is a
bugette here - the count regularly goes up for unknown reasons and
there is no other known impact.
- Cmplt - Completions. The number of Ethernet reads. The Listener
is
reading every frame on the Ethernet and the display updates 1 per
second so the rate this number rises at is the number of frames
per second currently on the segment EMU is attached to.
- IOSB error. The number of times the Listener successfully
received
a frame but the frame was in error. This should always be 0. If
this number rises there is either a fault on the local LAN (most
likely) or serious EMU system fault (Less likely) or an operating
system fault (Least likely).
- Processed. The listener discards most frames read as being not
useful to EMU purposes. This includes any frame containing user
data. The remaining frames are processed for network management
information. This count is the number of frames processed and this
number, expressed as a ratio of Completions gives a rough picture
of the efficiency of the attached LAN.
-
Frame errors. Each Ethernet frame is validated and any errors
found are recorded here. They are: Multicast source. A frame was
received with a multicast source address. This is not allowed by
Ethernet rules and can trigger a broadcast storm. It cannot be
determined who sent the frame. Zero Source. A frame was received
with the top 3 bytes of the address set to 0. This is not a valid
Ethernet address. Zero Destination. Same as in Zero Source. Zero
PTY. The protocol type in the frame was 0. This is meaningless as
this type is not defined. No station acting in accordance with
Ethernet rules will receive this frame.
- The remainder of the display shows the processors that are
active.
Each processor has it's own set of counters and are interpreted as
follows:
- Name. Is simply the name this processor is known by.
- Status. A bit pattern showing the current processing status.
Only the bottom 3 bits are used:
Table 3-2 PSR Status Bits
Bit |
Meaning When Set |
0 |
Processor is able to receive |
1 |
Type Checked ¹ |
2 |
All Traffic ² |
Notes:
- Type checked. This processor receives frames
only of the type specified in PSRTBL
- All traffic. Most processors receive only
multicast traffic. If this bit it is set it
receives all traffic on that protocol type.
See warnings in PSRTBL before changing any of these
bits.
3.3.8.2 Monitor PSRs
PSRs are the Protocol Specific Routines. There is 1 for each
protocol EMU processes. The display appears as follows:
CCCI Ltd 1993 Ethernet Monitor PSR Display
Name Rec Ret Fmterr Comm Err NoIPC Alt Rlt Name
OSI 681229 681229 0 0 0 0 0 0 0
LAVC 411312 411312 0 0 0 0 0 0 0
LAT 157144 157144 0 0 1 0 0 3 0
MOP 60324 60324 106 0 0 0 0 4470 0
IP 1229358 1229358 0 0 0 0 0 19 0
IPX 849122 849122 44444 0 0 0 0 0 0
ETHNET 4737078 4737078 0 0 0 0 0 7 0
RECORD 0 0 0 0 0 0 0 0 0
BRIDGE 156907 156907 0 0 0 0 0 0 0
The display has been compressed in this example to fit on the
page. It is interpreted as follows:
- Name. The Name (Usually the protocol name) the routine is known
by.
- Rec. The number of frames the PSR received from the Listener.
- Ret. The Number of frames processed and returned to the Listener.
This should always be equal to Rec.
- Fmterr. Format error. A fame with unexpected format. Usually this
is cause by a device sending a faulty frame. If the number rises
dramatically, it can indicate the processor is faulty. Report this
to software support. Not in the above display the high number
being reported by IPX. This is being investigated.
- Comm. The number of command buffers received. Internal processes
can send commands to PSRs to affect their processing.
- Err. The number of error messages this processor has written to
the error log. If this number rises quickly, a review the error
log should indicate the problem. If the problem cannot be
understood or rectified, please contact software support.
- NoIPC. No Inter Process Communication Buffer. EMU processes
communicate using a common buffer pool. If a process requests a
buffer and none are available this event is counted. A small (Less
than 5 per hr) number is expected. These counts are used in the
auto tuning process and often a restart of EMU will retune the
system and alleviate this problem.
- Alt. The number of Alerts this processor has raised.
- Rlt. The number of relater frames this processor has sent.
Relater
frames carry information that one processor determines is 'of
interest' to another. This information is relayed in Relater
frames.
- Name. No longer used.
3.3.8.3 Monitor Probes
Probes are the
processes that interrogate cooperating nodes on the network for
configuration and performance data. Currently EMU supports SNMP
for IP addresses, CMIP for OSI addresses, NICE for DECnet, MOP,
LAT and Ethernet protocols. This display charts the amount of data
EMU is adding to the network and is displayed as follows:
Probe Sent Receive Error NoRespose
SNMP 10424 9215 472 1250
CMIP 1364 2301 742 82
NICE 95 539 0 20
MOP 139 139 0 0
LAT 1713 1713 0 0
Note the LAT processor reads only from the local node - it does not
send
any data across the network. The same is true of the Ethernet probe.
The display is interpreted as follows:
- Name. The name the probe is known by.
- Sent. The number of queries sent by the probe
- Receive. The number of good responses received
- Error. The number of responses received indicating an error in
either the request or the response. Each response of this type is
logged in the error log.
- NoResponse. The number of frames sent that received no response.
3.3.8.4 View Sections
EMU builds a number of internal databases in memory sections.
Selecting this item displays a further menu of sections that can
be viewed directly.
For most sections there is a standard format displayed. The
display is split into 3 parts:
- Top level. Gives information about the section being viewed
Entries: The number of physical entries in the section
Recsize: The size of each entry
Max Entries: The maximum number of entries the section will
currently contain. If this number of entries is reached, the
controlling process gains exclusive access to the section, writes
it out to a file, deletes the section, creates a larger section
and restores the data. It then releases access to the system.
- PID: This is the internal identifier EMU uses to target a
particular
process, and therefore it's section. This PID relates to the Event
Flags in the Listener display.
- Middle Level: Not currently used.
- Main Level. Each PSR section displays the standard header and
then
the PSR specific data. The common part is documented here.
- Common Header:
- Flags: This is the PID of this process
- Boxid:This is a number generated internally and used to relate
the
various protocols found on the network to the various devices
detected. Throughout this system a BOX is synonymous with a
single physical device. Thus a router, PC or VAX are all BOXes
with a single BOXID associated
- Ptypbits. A bit pattern indicating the protocols
detected on this device. A bit set in this field corresponds to
the PID of a protocol this device is running. That is to say, if
bit 12 is sent in this field, EMU has determined that protocol 12
(Ethernet) is running on this device.
- Control. A bit pattern
indicating the status of this record. Currently 3 bits are used to
indicate:
- Record is deleted. This area may be overwritten at any
time.
- Update. This record has been changed since list time the
corresponding probe updated the record. It will be updated in the
next cycle.
- NoPoll. Do not poll this device on this protocol. May
be set by the user or the system. The system will set this
bit if the device responds with any indication that
it does not allow logins.
- Last. The last time EMU detected a frame from this address on
this
protocol. If the last time recorded exceeds the Ageout for
this protocol, the record will be deleted.
- FST. The first time EMU detected a frame from this address on
this protocol. Helpful when trying to determine events
around a particular incident. This event is always alerted.
- ALT. Time last alert was sent for this device on this
protocol.
- Status. Not currently implemented
- Accesses. The number of times this record was looked up. The
search is always done sequentially and during section build,
the records are sorted such that records accessed most often
are nearer the top of the section. This speeds up searching
dramatically. Note: Not really true. This was (and remains) the
intention but in fact, it is not implemented.
- Len. The length of the main lookup key in this section. The
location of the key is constant in all sections and combined
with this field, allows a single common search routine to
operate.
- HowSet. How this device was found and the process that
caused it
to be created. Changed as a result of other problems. This field
is now always set to the 'parent' protocol regardless of who
actually set it. Makes it a bit useless.
- Current Readers. The number of process currently reading
this
record. Combined with Current Writers, these fields are used
to organise access to the record so that any retrieval is
valid. That is a reader will not retrieve a half written
record and writers do not overwrite each other's data.
- Updt. Time this device was last updated on this protocol.
- Lpol. Last time this device was polled on this protocol.
- Polls. The number of times this device was polled on this
protocol. This field along with Responses allows EMU to
determine if it should continue polling a device that is
unlikely to respond.
- Interval. The number of seconds (Default = 3 days) between
updates
- Support. A bit pattern showing the various management
functions
this device supports on this protocol. The pattern is
specific to the protocol.
3.3.8.5 Dump_Database
The database can be formatted and dumped.
Selecting this item displays a prompt for file name. If a file
name is entered, the dump is sent to both screen and file. If the
prompt is not answered, the dump is to screen only. Be careful
here - the database is usually quite long - 15,000+ lines will
normally be printed.
3.3.8.6 Trace
This is a debugging tool used to see in real time the
frames passed between EMU processes. Currently 2 processes support
the trace facility:
- EBUFFS. These are the buffers passed from the listener to the
PSRs. The trace shows these buffers as they are send from the
listener to the PSRs and the PSRs returning them to the listener.
- Relater. These buffers are passed from the PSRs to the relater
and are the controlling inputs to that process. Should any node
display incorrect relationships (like an address that it does not
have or the reverse), this trace can be used to find out why.
3.3.8.7 Scanner
The scanner process scans the database for parameters
collected from the network via polling that should appear in the
PSR level databases. A small database detailing which parameters
to search for and where to send them is controlled by this
interface. If you value your life, do not change these fields.
Incorrect entries in this database cause immediate and
irreparable brain damage to the system.
3.3.8.8 Translate Tables
The network is parameter driven and the value of those parameters
is what EMU collects, analyses, stores and displays. Most
parameters are not in any form suitable for human consumption and
therefore must be translated. Most of the parameters in this area
are not documented in any central location nor is there any
definitive list. The tables provided are the best known to the
developers but cannot be guaranteed to be either complete or
correct. In this subsystem you can edit the source tables and
compile them into the system for use by the parameter translation
mechanism. There are currently 4 tables provided:
- Netware SAP. This is a code showing the service(es) any Netware
node broadcasts and the correct translation of this is very useful
as it describes in a few words the major use of this device.
- MOP Device. Any node supporting Digital's MOP protocol broadcasts
a
frame describing itself on a regular basis. One of the parameters
is it's device type - again very useful in describing in a word or
two what this node is.
- Ethernet type. Any protocol on the Ethernet must send it's frames
in Ethernet format. In the Ethernet format is a field describing
the protocol the application is using. One of the simple tricks
EMU pulls is simply reading that field for each frame and therefore
is able to list all the protocols any device runs. This table
translates the codes into meaningful words.
+
- Ethernet UOI (Universal Organisation Identifier).
In simpler times, this field was known as the
manufacturer's code. Even more simply: The top 3 bytes of any
hardware address is a code showing the manufacturer of the
Ethernet device in the box.
This table controls that translation.
Note
+ Ethernet types are actually in 2 formats: Type
II (a 2 byte type field) or IEEE (a 2 byte SSAP/DSAP).If DSAP/SSAP
is 'AAAA' then it is extended Ethernet(a 3rd type!) and a 5 byte
field is used. The 5 byte field is in fact the 2 byte type coded
with the UOI prepended. If you find this confusing do not panic -
it is not a test of networking skills so much as a test of
patience.
3.4 Alert
For a number of reasons (mostly performance and laziness) the user
interface to the ALERT subsystem is not implemented under the main
EMU menu system but is carried separately. This interface allows
you to monitor real time alerts and browse through previous alerts
conveniently. To start the alert interface:
$ ALERT
This brings up a screen with a 3 item menu.
3.4.1 Main Menu
The main menu in the alert system allows you to:
- Monitor. Selecting this item clears the screen and waits for
alerts from the system. As each alert is received, it is formatted
and displayed here. To return to the main menu, press Ctrl Z.
- Review. Selecting this item allows you to browse through the
alerts previously generated and logged. All alerts are logged to
the file EMU5_DAT:ALERT.DAT. It is advised to check the size of
this file on occasion as there is no mechanism to control it's
size and on an active (some say flaky) network many alerts can be
generated over a short time. Selecting this item brings up another
menu:
- Since. Allows you to set the time from which previously logged
alerts are re-displayed. The format is in VMS standard absolute
time (DD-MMM-YYYY HH:MM:SS.CC) and any part missing from the input
will default to current time (though you MUST include the
separating characters. For example if you input '-- 13' you will
get all alerts since 13:00 (1 PM) from today. No input will
display all alerts from the time the alert logging file was
created - usually system start time.
- NodeName. Displays alerts only for the node specified.
Wildcards
will be allowed once this section is finished - that is it is not
yet implemented.
- Class. Alerts are divided into classes: FAULT, CONFIGURATION,
ACCOUNTING, PERFORMANCE, SECURITY. This is the OSI standard
(FCAPS) arrangement and seemed such a good idea, we stole it.
In addition, EMU defines a SYSTEM category to allow the system
itself to report problems it believes need operator assistance.
In the current version, only CONFIGURATION alerts are generated.
Note this function is not yet implemented.
- Execute. Once the selection criteria has been set up (Currently
once time has been established) selecting this item causes the
selected alerts to be displayed 1 at a time in the screen. The
display screen allows 'next' and 'previous' alerts to be
displayed.
3.4.2 Alert Format
All alerts are displayed in a standard format:
- Line 1 Shows the class of the alert and the device the alert is
for. If the system knows the name of the device it is used,
otherwise the address of the device is displayed.
- Line 2 shows the date and time the alert was received and the
subsystem that sent it. This will usually be a PSR(Protocol
Specific Routine)
- Line 3 will usually be the subclass. In the case of configuration
alerts there are 3 subclasses: ADD, DELETE or MODIFY.
- If there is additional information this follows and will normally
be a list of parameters that were added, deleted or modified.
3.5 Utilities
Utilities are those routines that are little used, not to be used
by the uninitiated or otherwise inappropriate to be included in
the main user interface. It is firmly advised that these routines
be well understood before use. The effect of misuse can be
anything from nothing to death by brain removal. In the
later case, the effect may be quite subtle with death taking a
long time and inflicting pain and frustration on all affected.
3.5.1 Error Log
The error log file EMU5_LOG:EMU_LOG.ERR is a typeable file (that
is you can type/print/edit it) that the system writes errors and
other significant information to. An extract of the file and
interpretation follow:
%RELATR-W-VERADDR, 22-JUN-1998 11:50:00.08 RELATER Invalid Target
addr 002A54F4 For PID 0000000C From 00000006
-----------------------------------------------------------
%CFGMON-E-SNDALT, 22-JUN-1998 11:51:40.21 NETMON SEND_ALERT Error
00000601
-----------------------------------------------------------
%NODSCN-I-ADDIP, 23-JUN-1998 12:23:05.07 CFGMON Added node
171.144.141.128
-----------------------------------------------------------
Notes:
- The first section is the facility(RELATR), message level (I =
Information, W = warning, E = Error) and short form of the error
message (VERADDR).
- The second part is the time and date the message was logged.
- The remainder is variable and is intended to provide a complete
picture of the problem (or not). The messages above mean:
- The relater received a frame containing an invalid address.
The
MOP module (00000006) sent address 002a54f4 specifying it as an
Ethernet address. This is not a valid Ethernet address.
- The Network Monitor process sent an alert message that was too
large for the alert buffer. In EMU error messages, any time a
message contains the keyword 'Error', it is always followed by a
VMS standard error code. To translate this code type at DCL:
$ WRITE SYS$OUTPUT F$MESSAGE("%X601")
In this case, code 601 is buffer overflow.
- The NodScn process found another IP address and added it to the
database.
Next page...
|
Contents