NetNumen M31(RAN) Mobile Network Element Management System
Maintenance Guide Version 3.10.420
ZTE CORPORATION ZTE Plaza, Keji Road South, Hi-Tech Industrial Park, Nanshan District, Shenzhen, P. R. China 518057 Tel: (86) 755 26771900 Fax: (86) 755 26770801 URL: http://ensupport.zte.com.cn E-mail:
[email protected]
LEGAL INFORMATION Copyright © 2006 ZTE CORPORATION. The contents of this document are protected by copyright laws and international treaties. Any reproduction or distribution of this document or any portion of this document, in any form by any means, without the prior written consent of ZTE CORPORATION is prohibited. Additionally, the contents of this document are protected by contractual confidentiality obligations. All company, brand and product names are trade or service marks, or registered trade or service marks, of ZTE CORPORATION or of their respective owners. This document is provided “as is”, and all express, implied, or statutory warranties, representations or conditions are disclaimed, including without limitation any implied warranty of merchantability, fitness for a particular purpose, title or non-infringement. ZTE CORPORATION and its licensors shall not be liable for damages resulting from the use of or reliance on the information contained herein. ZTE CORPORATION or its licensors may have current or pending intellectual property rights or applications covering the subject matter of this document. Except as expressly provided in any written license between ZTE CORPORATION and its licensee, the user of this document shall not acquire any license to the subject matter herein. ZTE CORPORATION reserves the right to upgrade or make technical change to this product without further notice. Users may visit ZTE technical support website http://ensupport.zte.com.cn to inquire related information. The ultimate right to interpret this product resides in ZTE CORPORATION.
Revision History

Revision No. | Revision Date | Revision Reason
R1.1 | July 31, 2009 | Review and Test Comments Modification
R1.0 | June 27, 2009 | First Edition
Serial Number: sjzl20092882
Contents
Preface
Safety Instructions
  Safety Precautions
  Safety Signs
  Safety Guidelines
Maintenance Overview
  Introduction
  Maintenance Classifications
  Common Maintenance Checks and Precautions
Daily Routine Maintenance
  Daily Routine Maintenance Items
  Checking Equipment Room Temperature
  Checking Equipment Room Humidity
  Checking Running Status of Air-conditioning
  Checking Communication Between Server and Client
  Checking Communication Between Server and Low-Level
  Checking Hardware
  Checking Server
  Checking Dual Server System Operation Status
  Checking Alarms
  Checking Operation Logs
  Checking Disk Array
  Backing up Data
  Checking System Load
  Checking Virus Monitoring Result in Real Time
Weekly Routine Maintenance
  Weekly Routine Maintenance Items
  Checking History Alarm
  Checking Scheduled Query Task
  Calibrating System Time
  Checking Database Space
  Checking Shared Folders on Client
  Updating Antivirus Software
  Checking Server Progress Status
  Backing up Data
  Checking Server Error Logs
  Checking Server Disk Storage Capacity
  Checking Client Hard Disk Space
Monthly Routine Maintenance
  Monthly Routine Maintenance Items
  Checking Power Supply Voltage
  Cleaning Dust on Cabinet
  Checking Table Space Usage
  Checking Performance Statistic Result of Operation Network
  Checking Dual System Operation Status
  Monitoring Test
Quarterly Maintenance
  Quarterly Maintenance Items
  Checking Dual Server System Switching
  Checking Ground Resistance
  Clearing History Alarm
  Deleting Performance Database
  Checking Unauthorized Access of Server
  Checking Firewall
  Checking LAN Equipment
Maintenance Record Form
  Daily Maintenance Form
  Weekly Maintenance Form
  Monthly Maintenance Form
  Quarterly Maintenance Form
Common Alarms and Faults Handling
  Overview
  Equipment Alarms
    Communication Alarms with Lower-level NM
    DB Space Insufficient
    CPU Occupancy of Application Server above Threshold
    Memory Occupancy of Application Server over Threshold
    Space for Application Server Logs Insufficient
    Catalog Capacity over Threshold
  Common Faults Handling
    AMO Startup Error
    Alarm Forwarding Failure
    Hard Disk Space Insufficiency
    Client Fails to connect with Server
    Server Unable to Start
    Database Unable to Start
    Data Table Space Full
    Alarm Box Unable to Display Audio-Visual Prompt
    Performance Report Problem Caused by Incorrect Time Zone Settings
    Chaotic Performance Alarm Data Caused By Synchronization Error between Upper and Lower Level
    Error When Reporting Lower Level Net Manager Configuration Data
    Error in Reporting Lower Level Net Manager Performance Data
    Northbound Interface Link Interruption
    Failure in Reporting Configuration Data to NMS
    Error in Report Alarm Information to NMS
    Failure in Reporting Performance Data to NMS
Data Backup and Recovery
  Log Files Backup
    NetNumen Log Files Backup in Windows OS
    NetNumen Log Files Backup in Solaris OS
  CM Data Backup and Recovery
    NetNumen CM Data Backup and Recovery
  Database Tables Backup and Recovery
  All Users Database Backup and Recovery
  Fault Management database backup
Emergency Maintenance
  Emergency Maintenance Purpose
  Principle of Emergency Maintenance
  Emergency Maintenance Flow
  Service Check
  Fault Record
  Locating Fault
  Emergency Recourse
  Service Recovery
  Information Record
  Information Collection
  Emergency Maintenance Tables
    Abnormality Record Table
    Equipment Emergency Maintenance Requisition
    Troubleshooting Record Table
Figures
Tables
Glossary
Preface

Purpose
This manual provides procedures and guidelines that support the maintenance operation on NetNumen M31(RAN) Mobile Network Element Management System.
Intended Audience
This manual is intended for engineers and technicians who perform maintenance activities on NetNumen M31(RAN) Mobile Network Element Management System.
Prerequisite Skill and Knowledge
To use this document effectively, users should have a general understanding of wireless telecommunications technology. Familiarity with the following is helpful:
• The NetNumen M31(RAN) system and its various components.
• Local operating procedures of the NetNumen M31(RAN) system.

What Is in This Manual

This manual contains the following chapters, as shown in Table 1:

TABLE 1 CHAPTER SUMMARY

Chapter | Summary
Chapter 1, Safety Instructions | Gives safety instructions for the NetNumen M31(RAN) maintenance process.
Chapter 2, Maintenance Overview | Introduces maintenance classifications and common maintenance checks and precautions.
Chapter 3, Daily Routine Maintenance | Introduces the NetNumen M31(RAN) daily maintenance items and the maintenance methods.
Chapter 4, Weekly Routine Maintenance | Introduces the NetNumen M31(RAN) weekly maintenance items and the maintenance methods.
Chapter 5, Monthly Routine Maintenance | Introduces the NetNumen M31(RAN) monthly maintenance items and the maintenance methods.
Chapter 6, Quarterly Routine Maintenance | Introduces the NetNumen M31(RAN) quarterly maintenance items and the maintenance methods.
Chapter 7, Maintenance Record Form | Introduces maintenance forms.
Chapter 8, Common Alarms and Faults Handling | Introduces common alarms of the NetNumen M31(RAN) system and the troubleshooting of its common faults.
Chapter 9, Data Backup and Recovery | Explains how to back up and recover log files and CM data.
Chapter 10, Emergency Maintenance | Introduces the purpose and flow of emergency maintenance and provides the related tables.
Confidential and Proprietary Information of ZTE CORPORATION
Chapter 1

Safety Instructions

Table of Contents
Safety Precautions
Safety Signs
Safety Guidelines
Safety Precautions

The BSS system may be installed, operated, and maintained only by trained and qualified personnel. BSS maintenance personnel must meet the following basic requirements:
• Theoretical knowledge of the BSS system.
• Familiarity with BSS device principles and networking.
• Experience in network optimization.
Observe the local safety specifications and relevant operating procedures during equipment installation, operation and maintenance, to avoid any personal injury or damage to the equipment. The safety precautions in this manual can only be used as a supplement to local safety regulations.
Note: ZTE shall not bear any liabilities incurred by violation of the universal safety operation requirements, or violation of the safety standard for designing, manufacturing and using the equipment.
Safety Signs

Safety Sign | Meaning
No smoking | Smoking is forbidden.
No flammables | Flammable materials are not allowed.
No touching | Do not touch.
Universal alarm | Common safety precautions.
Electric shock | There is a high-voltage risk.
Static protection | The device may be sensitive to static electricity.
Microwave danger | There is a strong electromagnetic field.
Laser danger | There is a strong laser beam.
Safety reminders fall into four severity levels: Danger, Warning, Caution and Attention. The text of the safety prompt appears to the right of the safety sign, and the detailed description appears below the sign. The format is as follows.
Danger: Indicates an imminently hazardous situation, which, if not avoided, will result in death or serious injury. This signal word should be limited to only extreme situations.
Warning: Indicates a potentially hazardous situation, which, if not avoided, could result in death or serious injury.
Caution: Indicates a potentially hazardous situation, which, if not avoided, could result in minor or moderate injury. It may also be used to alert against unsafe practices.
Note: Indicates a potentially hazardous situation, which, if not avoided, could result in injuries, equipment damage or partial interruption of services.
Safety Guidelines

Tools

Electric Shock: Use special-purpose tools, not common tools, for high-voltage and AC operations.

High Voltage

Danger: High voltage is hazardous. Direct contact with high voltage or the mains supply, or indirect contact through a wet object, may result in death. Follow the local safety rules when installing AC power equipment. Do not wear a watch, bracelet, ring, or other conductive objects during operation. Anyone installing AC equipment must be qualified for high-voltage and AC operations. Prevent moisture from entering the equipment when operating in a damp environment.

Power Cable

Danger: Never install or remove power cables while they are live. A live power cable that touches a conductor may cause sparks or an electric arc, which can start a fire or injure the eyes. Switch off the power supply before installing or disconnecting a power cable. Before connecting a cable, make sure that the cable and its label meet the actual installation requirements.
Drilling Holes

Warning: Do not drill the cabinet without permission. Unauthorized drilling may damage wiring and cables inside the cabinet. In addition, metal shavings left inside the cabinet by drilling may short-circuit a circuit board. Before using a drill, put on protective gloves and move the cables inside the cabinet out of the way. Protect your eyes while drilling, because flying debris may injure them. Clean up thoroughly after drilling.

Lightning

Electric Shock: Operations on high-voltage or AC equipment, or on an iron tower or mast, are strictly forbidden during a thunderstorm. A thunderstorm can produce a strong electromagnetic field in the atmosphere, so take lightning protection and grounding measures in good time.

Antistatic

Electrostatic: Static electricity produced by the human body may damage static-sensitive components on circuit boards, such as large-scale Integrated Circuits (ICs). Friction caused by human body activities is the root cause of electrostatic charge accumulation. The static voltage carried by a human body can reach 30 kV in a dry environment and can remain on the body for a long time. An operator carrying a static charge may discharge it through a component, and damage that component, when touching equipment or holding a plug-in board, circuit board, IC chip, or similar part.

Wear an antistatic wrist strap and ground its other end well to prevent static electricity from the body from damaging sensitive components. Connect a resistor of over 1 MΩ in series on the cable between the antistatic wrist strap and the grounding point to protect the operator against accidental electric shock. A resistance over 1 MΩ is still low enough to discharge static voltage. Check the antistatic wrist strap regularly. Do not replace the cable of an antistatic wrist strap with any other cable.

Keep static-sensitive boards away from objects that easily generate static electricity through friction, such as packaging bags, transfer boxes, and transfer belts made of insulating plastic. Such objects may charge the components, and the resulting discharge may damage the components when they touch the human body or the ground. Keep boards in antistatic bags during storage and transportation.
Before using a test device, discharge its static electricity by grounding it first. Keep boards at least 10 cm away from strong DC magnetic fields, such as the cathode-ray tube of a monitor.

Others

Do not perform maintenance or debugging inside the equipment alone, without the help of a qualified person. Replacing or changing any part of the equipment might cause unexpected damage. Do not replace any part or modify the equipment unless authorized. Contact ZTE Corporation if you have any questions.
Chapter 2

Maintenance Overview

Table of Contents
Introduction
Maintenance Classifications
Common Maintenance Checks and Precautions
Introduction

Maintenance is the periodic inspection of equipment to find problems that arise during its operation. Resolve problems as early as possible to avoid potential dangers.
Maintenance Classifications

During routine maintenance, defects must be located by using suitable methods. The nine common maintenance methods are as follows.

• Checking Alarms and Operation Log
When a defect occurs, maintenance personnel first check the alarms and operation logs through the alarm management and operation log interfaces of NetNumen M31 in the BSS operation maintenance center. Alarm message management analyzes the alarm messages that may occur while the system is running, in order to determine its running status and handle the alarms accordingly. While it is running, NetNumen M31 records all error information and running parameters of the system. Analyzing the log files and the alarm database information may point to the root cause of a fault. Using this method, an engineer can analyze potential system problems.

• Analysis using Status Indicators
Equipment usually has status indicators that help the user assess its running condition. For example, a server usually has power and fault indicators, and a HUB has port indicators. The status indicators help identify the faulty part and even the cause.
This method requires maintenance personnel to be familiar with the indicator light states on the various boards and their meanings.

• Analyzing through performance management interface
Performance analysis is carried out through the performance management interface of NetNumen M31 in the BSS Operation Maintenance Subsystem. Through this interface, maintenance personnel can perform performance management and signaling tracing for the BSS system. The user can create performance measurement tasks, produce performance reports, and review the performance indexes of the BSS system. By analyzing this information, maintenance personnel can detect the load distribution of the network in time and adjust network parameters to improve network performance. Through the signaling tracing interface, the user can trace the signaling that involves the BSS (including Gb port signaling). This makes it easier to review signaling flows during system deployment and maintenance, and to detect problems in signaling coordination.

• Analyzing through instruments and meters
One of the most useful troubleshooting methods is to use test instruments and meters that measure the running indexes and environmental indexes of the system. The user can compare the measured indexes with the normal values to find the causes of errors. Maintenance personnel use a test handset, spectrometer, signaling analyzer, error code analyzer, and other auxiliary equipment to analyze, locate, and eliminate faults.

• Plug-out and plug-in correction method
When a board defect is detected, the user may unplug and re-plug the boards or the external interface connectors to fix defects caused by poor connections.

• Comparison method and swap method
The comparison method compares a possibly defective board with a similar board in the system (for example, boards in the same slot position in different modules) in terms of operational status, jumpers, or connection wires. Through the comparison, the user can determine whether the board has failed.
The swap method replaces a potentially damaged board with a new part to see whether the problem disappears. If it does, the source of the problem has been located. The swap method, also called the trial-and-error method, is simple and practical in almost every fault-handling case. However, this operation must be performed at a safe time.
In addition, comparing the status, parameters, log files, and configuration parameters of the potentially damaged parts with the normal values helps find any inconsistency. Modifying and testing those values at a safe time can sometimes solve the problem.
• Isolation method
When some parts of the system are faulty, the user may separate the faulty parts from the related board or rack to judge whether the failure is caused by the interconnection.

• Self-checking method
When the system or a single board is restarted, the user may check for errors through its self-check. Typically, engineering personnel check board performance when the system restarts, during which the indicators flash in a regular pattern. By observing the flashing pattern, the user can determine whether the board has a problem.

• Synthesis method
In actual operation, the above methods are combined. Drawing on the experience of the maintenance personnel, the various kinds of defects encountered during maintenance can be fixed.
Common Maintenance Checks and Precautions

Checks and precautions for daily maintenance are as follows:

• Maintain normal temperature and humidity in the equipment room. Keep the environment tidy and clean, free from dust and moisture. Keep rodents and insects out of the equipment room.

• Ensure that the primary power of the system is stable and reliable, and check the system grounding and the lightning protection ground periodically. Especially before storm seasons and after thunderstorms, check the lightning protection system to make sure that all the facilities are in good condition.

• Establish a comprehensive system for maintaining the equipment room, to govern the routine work of the maintenance personnel. Keep an exhaustive duty log that records system operation, versions, data changes, upgrades, and troubleshooting on a day-to-day basis, for follow-up analysis and troubleshooting in the event of a fault. Prepare a shift record to clearly define everyone's responsibility.

• Do not play games or access the Internet through the terminal. Do not install, run, or copy any software unrelated to the system on the terminal. Do not use the terminal for other purposes.

• Set different NM passwords for different access rights, put them under strict control, and change them periodically. Also, keep the NM passwords accessible only to the maintenance personnel.
• The maintenance personnel should receive pre-job training to learn about the equipment and networks. During maintenance operations, they must follow the instructions in the manuals related to the NetNumen M31 equipment. Before contact with the hardware, they must wear an antistatic wrist strap to prevent static discharge from damaging the system. The maintenance personnel shall be rigorous in their work and proficient in maintenance, and shall improve their proficiency through constant learning.
• Do not reset the equipment, or load or change equipment data, at will. In particular, never change the data in the NM database at random. Back up the data before making any changes. Do not delete the backup data until the equipment has run properly for some time (usually a week) after the change. Record data changes promptly.

• Keep the following commonly used tools, instruments, and meters at hand: screwdriver (cross or flat head), signaling analyzer, network cable clamp, multimeter, AC power used for maintenance, telephone line, and network cables. Test and calibrate the instruments and meters at regular intervals to ensure their accuracy.

• Check the spare parts and components regularly. Maintain a sufficient inventory of common spare parts and components, and verify that they are in good condition and free of moisture and mould. Keep the spare parts and components separate from defective ones replaced during maintenance, and label them for easy identification. Replenish commonly used spare parts and components promptly when they are used up.

• Keep the software and documentation that may be needed during maintenance at hand in a fixed place, so that they are easy to reach when necessary.

• Make sure that the lighting in the equipment room is good enough for maintenance. Repair damaged lighting equipment promptly to avoid any inconvenience during maintenance.

• Resolve any fault as soon as it is detected. If a problem cannot be resolved, contact the local ZTE office.

• Put the contact information of the local ZTE office in a prominent place, and make it known to all the maintenance personnel, so that ZTE can be contacted for support in time. Update the contact information whenever it changes.

• Do not run the signaling trace program without permission, especially in the daytime when the traffic load is heavy. Signaling trace may be performed only with permission from the local ZTE office, and only during periods of low traffic.
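The backup precaution in the list above (keep the backup until the equipment has run properly for some time, usually a week, after a data change) can be expressed as a simple date check. The following Python sketch is illustrative only: the function name and the fixed seven-day period are assumptions made for this example, and the actual retention period should follow local maintenance policy.

```python
from datetime import date, timedelta

# Assumption: "some time (usually a week)" is modelled as exactly 7 days.
STABLE_RUN_PERIOD = timedelta(days=7)


def backup_may_be_deleted(change_date: date, today: date,
                          running_properly: bool) -> bool:
    """Return True only if the equipment has run properly for the
    whole stable-run period since the data change."""
    if not running_properly:
        # Never discard the backup while the equipment shows problems.
        return False
    return (today - change_date) >= STABLE_RUN_PERIOD


if __name__ == "__main__":
    changed = date(2009, 7, 1)
    print(backup_may_be_deleted(changed, date(2009, 7, 5), True))
    print(backup_may_be_deleted(changed, date(2009, 7, 8), True))
```

The check deliberately returns False whenever the equipment is not running properly, regardless of how much time has passed.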
Chapter 3

Daily Routine Maintenance

Table of Contents
Daily Routine Maintenance Items
Checking Equipment Room Temperature
Checking Equipment Room Humidity
Checking Running Status of Air-conditioning
Checking Communication Between Server and Client
Checking Communication Between Server and Low-Level
Checking Hardware
Checking Server
Checking Dual Server System Operation Status
Checking Alarms
Checking Operation Logs
Checking Disk Array
Backing up Data
Checking System Load
Checking Virus Monitoring Result in Real Time
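Daily Routine Maintenance Items

Table 3 lists the items to check every day.

TABLE 3 LIST OF DAILY ROUTINE MAINTENANCE ITEMS

Daily maintenance items | Check items
Environment monitoring maintenance | Temperature and humidity in equipment room
Environment monitoring maintenance | Air-conditioning running status
Maintenance of running status of main devices | Communication between server and client
Maintenance of running status of main devices | Communication between server and lower level NM
Maintenance of running status of main devices | Hardware
Maintenance of running status of main devices | Server
Maintenance of running status of main devices | Dual Server Operational status
Maintenance of running status of main devices | Alarms
Maintenance of running status of main devices | Disk Array operational status
Maintenance of running status of main devices | Operation logs
Maintenance of running status of main devices | Hardware of disk array
Maintenance of running status of main devices | Data backup
Maintenance of running status of main devices | System Load
Maintenance of running status of main devices | Virus Monitoring Result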
Checking Equipment Room Temperature

Purpose
The equipment runs normally only within an appropriate temperature range. Checking the equipment room temperature helps prevent an excessively high or low temperature from adversely affecting system operation.

Method
Take the temperature reading from a thermometer.

Reference standard
Refer to Table 4 for the temperature requirements of the NetNumen M31 (RAN) system.

TABLE 4 NETNUMEN M31 SYSTEM TEMPERATURE REQUIREMENTS

Equipment Type | Temperature, long-term working conditions | Temperature, short-term working conditions
NetNumen M31 (RAN) Element Management system | 0 ℃ to 40 ℃ | -5 ℃ to 45 ℃
Note: The temperature and humidity of the operating environment inside the equipment room are measured at the spot that is 1.5m above the floor and 0.4m before the rack when there is no protective plate in front or back of the equipment rack.
Note: Short-term working conditions mean that the successive operating time is not more than 48 hours and the accumulated operating time per year is not more than 5 days.
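As an illustration of how a thermometer reading can be compared with the limits in Table 4, the following Python sketch classifies a reading as acceptable for long-term operation, acceptable only for short-term operation, or out of range. The names `Range` and `classify_temperature` are hypothetical and are not part of NetNumen M31; only the numeric limits come from Table 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high]."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Limits copied from Table 4, in degrees Celsius.
LONG_TERM_TEMP = Range(0.0, 40.0)
SHORT_TERM_TEMP = Range(-5.0, 45.0)


def classify_temperature(celsius: float) -> str:
    """Classify a thermometer reading against the Table 4 limits."""
    if LONG_TERM_TEMP.contains(celsius):
        return "normal"
    if SHORT_TERM_TEMP.contains(celsius):
        return "short-term only"
    return "out of range"


if __name__ == "__main__":
    for reading in (22.5, 43.0, 50.0):
        print(reading, classify_temperature(reading))
```

A reading classified as "short-term only" is tolerable only within the duration limits given in the note on short-term working conditions.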
Checking Equipment Room Humidity

Purpose
The equipment runs normally only within an appropriate humidity range. Checking the equipment room humidity helps prevent an excessively high or low humidity from adversely affecting system operation.

Method
Read the humidity level from a hygrometer.

Reference standard
Refer to Table 5 for the humidity requirements of the NetNumen M31 system.

TABLE 5 NETNUMEN M31 SYSTEM HUMIDITY REQUIREMENTS

Equipment Type | Humidity, long-term working conditions | Humidity, short-term working conditions
NetNumen M31 Element Management system | 20 % to 90 % | 5 % to 95 %
Note: The temperature and humidity of the operating environment inside the equipment room are measured at the spot that is 1.5m above the floor and 0.4m before the rack when there is no protective plate in front or back of the equipment rack.
Note: Short-term working conditions mean that the successive operating time is not more than 48 hours and the accumulated operating time per year is not more than 5 days.
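The limits in Table 4 and Table 5 can be applied to a reading with a simple range check. The following Python sketch is illustrative only and is not part of the NetNumen M31 (RAN) software; the name classify_reading and the returned labels are hypothetical.

```python
# Illustrative sketch only: the limits are copied from Table 4 and Table 5.
# The names below are hypothetical and not part of NetNumen M31 (RAN).

LIMITS = {
    # quantity: ((long-term min, max), (short-term min, max))
    "temperature_c": ((0, 40), (-5, 45)),
    "humidity_pct": ((20, 90), (5, 95)),
}


def classify_reading(quantity, value):
    """Return 'long-term', 'short-term only' or 'out of range'."""
    (long_lo, long_hi), (short_lo, short_hi) = LIMITS[quantity]
    if long_lo <= value <= long_hi:
        return "long-term"
    if short_lo <= value <= short_hi:
        return "short-term only"
    return "out of range"
```

A reading classified as "short-term only" is acceptable only within the 48-hour and 5-days-per-year limits given in the note above.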
Checking Running Status of Air-conditioning

Purpose
Perform this procedure to ensure that the air-conditioning runs normally and keeps the humidity and temperature of the equipment room within the normal range.
Procedure
Check the running status of air-conditioning. Repair or replace it if there are any abnormalities.
Result
The air-conditioning runs normally.
Checking Communication Between Server and Client

Context
Perform this procedure to ensure that communication between the NetNumen M31 (RAN) server and client is normal.

Steps
1. On the NetNumen M31 (RAN) client, click Start > Run and ping the IP address of the server to check whether communication between them is normal.
2. On the server, start the NetNumen M31 (RAN) server console and check whether the processes start, as shown in Figure 1. These include the FTP process and the NetNumen M31 (RAN) network management process.

FIGURE 1 INSPECT COMMUNICATION BETWEEN NETWORK MANAGEMENT SERVER AND CLIENT - I
Note: If NetNumen M31 needs to connect to an NMS, open the CORBA notification service process tab before the FTP process and the NetNumen M31 (RAN) network management process start automatically, and click the start button to start the CORBA service manually. Also check that the CORBA notification service starts normally.

3. On the client computer, start the NetNumen M31 client and enter the user name and password, as shown in Figure 2. Check whether the login succeeds.

FIGURE 2 INSPECT COMMUNICATION BETWEEN NETWORK MANAGEMENT SERVER AND CLIENT - II

4. After a successful login, the main interface appears, as shown in Figure 3.
FIGURE 3 INSPECT COMMUNICATION BETWEEN NETWORK MANAGEMENT SERVER AND CLIENT - III

END OF STEPS

Result
Expected Result: The response of the server is normal.
Checking Communication Between Server and Low-Level NM

Context
Perform the following procedure to ensure that communication between the server and the low-level NM of the NetNumen M31 (RAN) system is normal.

Steps
1. On the NetNumen M31 server, click Start > Run and ping the IP address of the low-level NM machine to check whether communication between them is normal.
2. Log in to the NetNumen M31 (RAN) client and enter the Topology Management view.
3. Inspect the topology map showing the relation between the server and the OMMs, as shown in Figure 4.
FIGURE 4 INSPECT COMMUNICATION BETWEEN NETWORK MANAGEMENT SERVER AND LOWER LEVEL NETWORK MANAGEMENT (1)

4. If no link line is shown between network elements in the map, right-click the topology root node NetNumen and click EXPAND ONE LAYER in the pop-up menu, as shown in Figure 5.

FIGURE 5 INSPECT COMMUNICATION BETWEEN NETWORK MANAGEMENT SERVER AND LOWER LEVEL NETWORK MANAGEMENT (2)

5. Inspect the lines between NetNumen and each OMM. A green line means that communication is normal, a black line means that the OMM is not managed, and a red line means that the link is interrupted.
6. Check NetNumen and the OMM for interrupted links and their alarms.
7. On NetNumen, run the ftp <OMM IP address> <FTP port> command to check whether the FTP link to the OMM works, as shown in Figure 6.

FIGURE 6 CHECK COMMUNICATION BETWEEN NM SERVER AND LOWER LEVEL NM (3)

Note: Here the OMM is the FTP server, uep is the default FTP user, and 10.61.100.113 is the IP address of the OMM. 21 is the default FTP port; enter the value that applies to the actual environment.

END OF STEPS

Result
Expected Result: The status of communication link between the server and the low level NM is normal.
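The FTP link check in the last step can also be scripted. The following Python sketch is illustrative only: it uses the standard ftplib module, and the function name ftp_link_ok, the empty default password and the 10-second timeout are assumptions rather than values from the NetNumen M31 (RAN) system.

```python
# Illustrative sketch only: tests an FTP login with Python's standard ftplib.
# ftp_link_ok, the empty password and the timeout are assumptions.
import ftplib


def ftp_link_ok(host, port=21, user="uep", password="", timeout=10):
    """Return True if an FTP login to the OMM succeeds, otherwise False."""
    ftp = ftplib.FTP()
    try:
        ftp.connect(host, port, timeout=timeout)
        ftp.login(user, password)
        ftp.quit()
        return True
    except ftplib.all_errors:  # connection refused, timeout, bad login, ...
        return False
    finally:
        ftp.close()  # safe to call even if the connection never opened
```

For example, ftp_link_ok("10.61.100.113", 21, "uep", password) would test the OMM address used in the note above, with the real password supplied by the operator.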
Checking Hardware

Context
Perform the following steps to ensure that the hardware works normally.

Steps
1. Check whether the indicators of the routers and switches are normal.
2. Check the link alarms in the NetNumen M31 (RAN) system to see whether the link between server and client and the link between server and interface machine are normal.
3. Check whether there are any fault indications related to the hard disks of the dual-computer server and the disk array in the NetNumen M31 (RAN) system. A flashing yellow indicator on the lower part of a disk indicates that the disk is faulty.

END OF STEPS

Result
Expected Results:
- Indicators of routers and switches are normal.
- The link between server and client as well as the link between server and interface machine are normal.
- There are no fault indications related to the hard disks of the dual-computer server and disk array.
Checking Server

Context
Perform this procedure to ensure the server of NetNumen M31 (RAN) system runs normally.
Steps
1. Check whether any abnormal information appears on the service interface of the application server while the NetNumen M31 (RAN) system is running, as shown in Figure 7.

FIGURE 7 INSPECT SERVER OPERATION STATUS
Note: If NetNumen M31 requires an NMS connection, also check the CORBA notification service process.

2. On the server, run the # ps -ef | grep java command to check the operation status of the server.

END OF STEPS

Result
Expected Result: There is no abnormal information such as interrupted communication link.
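The ps -ef | grep java check in step 2 can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the check merely confirms that at least one Java process exists, not that the NetNumen application is healthy.

```python
# Illustrative sketch only: filters "ps -ef" output for Java processes.
# java_processes and server_is_running are hypothetical helper names.
import subprocess


def java_processes(ps_output):
    """Return the lines of `ps -ef` output whose command mentions java."""
    return [
        line
        for line in ps_output.splitlines()
        if "java" in line and "grep" not in line  # ignore the grep itself
    ]


def server_is_running():
    """Run `ps -ef` and report whether at least one Java process exists."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return bool(java_processes(result.stdout))
```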
Checking Dual Server System Operation Status

Short Description
The system may need to be restarted or shut down for a particular reason. The method of restarting or shutting down depends on the node. A backup node can be restarted or shut down directly. For the master node, switch the global resource group to the backup system before restarting or shutting down the master node. This ensures that the NetNumen application provides non-stop service, because a manual shutdown or restart does not switch the global resource group automatically.
Context
Perform the following checks to check dual server system operation status:
Steps
1. Open the server cabinet and inspect the front panel of each server for red alarm indicators.
2. Inspect each server monitor for any abnormal alarm.
3. On the server, execute # ps -ef | grep java to check the server operation status.
4. Execute the # hagui & command to open the VCS GUI and check that login to VCS works properly.
5. On the server terminal, execute the # vxprint -htr command to check the general system status.
6. On the server terminal, execute the # vradmin -g dgnetnumen1 repstatus rvgnetnumen1 command to check the data copy status.
7. On the server terminal, execute the # vxdisk list command to check the hard disk status.
8. On the server terminal, execute the # vxprint -lP command to check the link status.
9. On the server terminal, execute the # vxdisk -o alldgs list command to check the disk group status.
10. On the server terminal, execute the # vxprint -l lvomc command to check the volume status.

Note: For more dual-system operations, refer to the NetNumen M31 (RAN) (V3.10.410) Mobile Network Element Management System High Availability Feature Description Manual.

END OF STEPS
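The status commands in the steps above can be collected into a small checklist runner. The following Python sketch is illustrative only: the commands are taken from the procedure (with -g dgnetnumen1 written as two arguments and vxdisk -o alldgs list written without trailing words, both assumptions about the intended syntax), and run_checks with its injectable runner argument is a hypothetical helper.

```python
# Illustrative sketch only: runs the status commands from the steps above.
# run_checks and its "runner" argument are hypothetical helpers.
import subprocess

HA_CHECKS = [
    ("system general status", ["vxprint", "-htr"]),
    ("data copy status",
     ["vradmin", "-g", "dgnetnumen1", "repstatus", "rvgnetnumen1"]),
    ("hard disk status", ["vxdisk", "list"]),
    ("link status", ["vxprint", "-lP"]),
    ("disk group status", ["vxdisk", "-o", "alldgs", "list"]),
    ("volume status", ["vxprint", "-l", "lvomc"]),
]


def _run(argv):
    """Run one command and return (exit code, combined output)."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:  # for example, the command is not installed
        return 127, str(exc)
    return done.returncode, done.stdout + done.stderr


def run_checks(checks=HA_CHECKS, runner=_run):
    """Return a list of (purpose, exit code, output), one per command."""
    report = []
    for purpose, argv in checks:
        code, output = runner(argv)
        report.append((purpose, code, output))
    return report
```

The output of each command still has to be read by the operator; the sketch only gathers it in one place.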
Checking Alarms

Context
Check the alarm information of the managed network elements in the management system to determine the running status of the system, as follows:

Steps
1. Check the system alarms in Fault Management to see whether there are any alarms or fault indications.
   i. After logging in to the NetNumen M31 (RAN) client, click View > Fault Management. The Fault Management window appears, as shown in Figure 8. By default, the Fault Management interface shows the Realtime Alarm Monitor window.

FIGURE 8 REAL TIME ALARM MONITOR WINDOW

   ii. To view the Realtime Notification Monitor, click View > Realtime Notification Monitor in the Fault Management interface.
2. Check the operation logs of the NetNumen M31 system for abnormal records related to operations.
3. Check the real-time alarms, notifications and current alarms to find the causes of the alarms, and then solve the problems. Remember to back up the alarms as well. To view the details of an alarm, double-click it; the NetNumen application opens the Details window, as shown in Figure 9.

FIGURE 9 SHOWING ALARM DETAILS

Note: For details about how to back up alarms, refer to the chapter Data Backup and Recovery.

END OF STEPS

Result
Expected Results:
- There are no alarms or other abnormal indications in fault management.
- There are no abnormal records in the operation logs.
- The cause of each alarm is found and the alarm backup is completed.
Checking Operation Logs

Context
Check the operation log, security log and system log of NetNumen M31 (RAN) to obtain the operating status of the system.
- Operation log: This log records user operation information, including log ID, operation level, user name, operation name, host address, command function/detailed info, operation result, cause of failure, access method, operation object, operation start time, operation end time and related log information.
- Security log: This log records user login information, including log ID, user name, host address, log name, operation time, access method and detailed information.
- System log: This log records the server status of scheduled tasks, including log ID, level, source, log name, detailed information, host address and operation start time.

Steps
1. Click the View > Log Management menu to enter the log management view.
2. Select Operation Log in the left pane of the Log Management window. All logged operations are displayed in the right pane, as shown in Figure 10. Check for abnormal records in the Operation Log window.

FIGURE 10 OPERATION LOG WINDOW

3. Follow the same procedure to check the Security Log and System Log.

END OF STEPS
Result
Expected Result: There are no abnormal operation records and no unauthorized operations, which may adversely influence the system running.
Checking Disk Array

Context
Perform this procedure to ensure that the disk array works normally and to check whether there is enough free disk space.

Steps
1. Check that the hard disks of the disk array have adequate free space, so that the collection of alarm and performance data by network management and the generation of reports are not interrupted.
2. Check whether the hard disks have any read or write faults. Contact ZTE for help if there are any faults.
3. Hard disk checking procedure:
   i. Check the indicators.
   ii. Run the format command on the NetNumen server as the root user to display the hard disks that are currently working normally.

END OF STEPS

Result
- The disk array must have enough free space. If the free space is not enough, add more disks, or delete alarm and performance data after backing it up.
- The hard disks have no read or write problems. If there are any, contact ZTE for help.
Backing up Data

Context
Back up the data to ensure quick system recovery in case of a system crash. Data backup is performed for NM configuration data. It is implemented with a tool under the database management system.

Data backup conditions and period:
- Be sure to back up the data before any data change.
- Back up the data after configuration is performed.
- If no data is modified, it is recommended to back up the data on a weekly basis.
- In addition to the hard disk, backed-up data should be stored on a magneto-optical (MO) disk.

Data backup modes include immediate backup and periodic backup (weekly).
- Immediate backup: Immediate backup refers to the backup performed with the data backup tool in the NetNumen M31 (RAN) system. This tool backs up the configuration data in the database, so that the system can be quickly restored to its state before a change.
- Periodic backup (weekly): For more details, refer to Weekly Routine Maintenance Items.

Note: Alarm data backup includes history alarms and notification alarms. Performance data backup includes primary data, hourly data and weekly data. Log data backup includes the operation log, security log and system log. These data are backed up manually or automatically by a table group task; automatic backup is performed periodically. When defining a table group task, the user may also define how the data is cleaned.
Steps
1. For detailed information about how to back up alarm data, performance data and log data, refer to Database Tables Backup and Recovery.
2. For configuration data backup, refer to CM Data Backup and Recovery.

END OF STEPS
Result
- Check the system log in log management for the task execution status. For detailed operation, refer to the specific operation.
- Check the save path of the data backup for the backed-up data.
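The backup conditions listed under Context (before any data change, after configuration, otherwise weekly) can be expressed as a small decision function. The following Python sketch is illustrative only; backup_needed and its parameter names are hypothetical and do not correspond to any NetNumen M31 (RAN) tool.

```python
# Illustrative sketch only: encodes the backup conditions listed above.
# backup_needed and its parameters are hypothetical names.
from datetime import datetime, timedelta


def backup_needed(last_backup, now, change_planned=False, config_changed=False):
    """Return True when the stated conditions call for a data backup."""
    if change_planned or config_changed:
        # Back up before any data change and after configuration.
        return True
    # With no data modified, a weekly backup is recommended.
    return now - last_backup >= timedelta(weeks=1)
```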
Checking System Load

Context
Check the system load to see whether it exceeds the threshold.

Steps
1. Log in to the OMC client and check whether there are any over-threshold alarms.
2. In System Management, click Application Server > Server performance. The Server performance window appears, as shown in Figure 11.
FIGURE 11 SERVER PERFORMANCE WINDOW
END OF STEPS

Result
Expected Result: Server performance is good with low usage level.
Checking Virus Monitoring Result in Real Time

Context
Perform the following procedure to check whether the OS is infected with viruses.

Steps
1. Check the real-time monitoring system of the antivirus software on the NM server and NM client.
2. For files infected with a virus, remove the virus with the antivirus software.
3. A Solaris system does not need antivirus software. It is usually not infected when the SAMBA service is not running. With the default installation configuration, a Solaris system does not run the SAMBA service. Execute the ps -ef | grep samba command to check whether SAMBA is started. If the SAMBA service is running, execute the /etc/init.d/samba stop command as the root user to stop it.

Note: On Unix systems, SAMBA is a software package that shares Unix file and print services between networked computers using the Server Message Block (SMB) protocol.

END OF STEPS

Result
Expected Result: There is no virus in NM server and NM client.
Chapter 4
Weekly Routine Maintenance

Table of Contents
Weekly Routine Maintenance Items
Checking History Alarm
Checking Scheduled Query Task
Calibrating System Time
Checking Database Space
Checking Shared Folders on Client
Updating Antivirus Software
Checking Server Progress Status
Backing up Data
Checking Server Error Logs
Checking Server Disk Storage Capacity
Checking Client Hard Disk Space
Weekly Routine Maintenance Items
The weekly maintenance items of the NetNumen M31 (RAN) Mobile Network Element Management system are shown in Table 6.

TABLE 6 WEEKLY ROUTINES FOR MAINTENANCE CHECKLIST
Category: Running Status of Main Equipment
Test Items:
- Check historical alarm.
- Check server progress status.
- Calibrate system time.
- Check disk space of the server.
- Check shared directory on client computer.
- Update antivirus software.
- Periodic data backup.
- Operation and maintenance log backup.
- Check client hard disk space.
- Check server error log.
- Check server disk capacity.
Checking History Alarm

Context
Check the history alarms to analyze the causes of alarms and solve the problems. The procedure for checking history alarms is as follows:

Steps
1. In the main window of the NetNumen client, click View > Fault Management to open the Fault Management window.
2. In the left pane of the Fault Management window, right-click a Network Element (NE) and select Show History Alarms from the pop-up menu.
3. The History Alarm window appears, showing the history alarm list of that NE.

END OF STEPS
Checking Scheduled Query Task

Context
Check whether each scheduled query task has been executed successfully, and whether it has expired or been suspended.

Note: The performance management sub-system measures selected performance parameters and service data to present the performance indexes of the system. Through a scheduled query task, the system periodically collects the performance of the specified objects as scheduled.

The procedure for checking scheduled query tasks is as follows:

Steps
1. Click View > Performance Management from the main menu of NetNumen to enter the Performance Management window.
2. In the Performance Management window, click Performance Management > Timer Query Task Management.
3. The Timer Query Task Management window pops up and shows the scheduled query tasks, so that the user can check for the necessary tasks.
4. If a required task does not exist, create a new scheduled query task and configure its items and periods as required.

Note: For more information, refer to the NetNumen M31 (RAN) Mobile Network Element Management System Performance Management Operation Guide.

5. If the required task exists, check the validity period of the task and modify the date if it is about to expire.
6. Select a task in the Timer Query Task Management list. Its execution log is shown in the Timer Query Task Log List below. Check the execution result, as shown in Figure 12.

FIGURE 12 TIMER QUERY TASK

7. A survey report file is generated after the task completes successfully.

END OF STEPS
Calibrating System Time

Context
Perform this procedure to make sure that the system time of the OMM is consistent with that of the NetNumen server.
Steps
1. Run the date command to check the time on the NetNumen server.
2. Check the time on the OMM.
3. Check whether the OMM time is consistent with the NetNumen server time.
4. If they are inconsistent, refer to the topic Chaotic Performance Alarm Data Caused By Synchronization Error between Upper and Lower Level.

END OF STEPS
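The comparison in step 3 amounts to checking the difference between two clock readings. The following Python sketch is illustrative only; this guide does not state an allowed difference, so the 60-second default tolerance and the name clocks_consistent are assumptions.

```python
# Illustrative sketch only: compares two clock readings.
# The 60-second default tolerance is an assumption, not a ZTE figure.
from datetime import datetime


def clocks_consistent(server_time, omm_time, tolerance_s=60):
    """Return (consistent, skew in seconds) for two datetime readings."""
    skew = abs((server_time - omm_time).total_seconds())
    return skew <= tolerance_s, skew
```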
Checking Database Space

Context
Check the database storage space to prevent insufficient storage space from interfering with normal system operation. Perform the following procedure to check the database space.

Steps
1. Log in to the NetNumen M31 client and click View > System Management.
2. The System Management window pops up. In System Management, click Database > Database Login.
3. The Database Login window pops up. Enter the password and click OK, as shown in Figure 13.

FIGURE 13 CHECK DATABASE SPACE (1)

4. When the system prompts that the database login is successful, click OK.
5. Select the database server, and then click the menu DatabaseServer > View Database Resources. The View Database Resources window pops up, as shown in Figure 14.
FIGURE 14 CHECKING DATABASE SPACE (2)
6. Click Tablespace Information and check the Free percentage.

END OF STEPS

Result
Expected result: The free disk space of the NetNumen M31 server is no less than 800 MB, and the free space of each disk is at least 20% of its total space. Otherwise, delete unused files.
Checking Shared Folders on Client

Short Description
Unnecessary shared folders place an additional burden on the system and are a likely source of security problems. Check shared folders regularly and cancel the sharing that is not needed.

Context
Perform the following steps to cancel the shared folders:

Steps
1. Click Start > My Computer > Shared documents to open the Shared Document window.
2. Check for shared directories and cancel any unnecessary sharing.

END OF STEPS
Result
No unnecessarily shared folder exists.
Updating Antivirus Software

Short Description
This topic describes the procedure for updating the antivirus software as part of weekly routine maintenance.

Prerequisites
McAfee antivirus software is installed on the system.

Context
To update the antivirus software, perform the following steps.

Steps
1. Right-click the antivirus software icon on the Windows taskbar and then click About in the pop-up menu.
2. Check Last update time and then click OK in the ePolicy Orchestrator Agent dialog box, as shown in Figure 15.

FIGURE 15 VIRUS UPDATE INFORMATION

If the last update time is within a week, no update is needed.
If the last update time is not within a week, proceed with step 3.

3. Right-click the icon and then click Update Now in the pop-up menu on the Windows taskbar. The McAfee AutoUpdate window pops up, as shown in Figure 16.

FIGURE 16 MCAFEE AUTO UPDATE DIALOG BOX

Note: The latest virus definition files and the McAfee Enterprise antivirus software can be downloaded from the website: http://www.mcafeesecurity.com/cn/downloads/default.asp.

END OF STEPS
Checking Server Progress Status

Context
Perform this procedure to ensure normal operation of the NetNumen M31 (RAN) management system server.

Steps
1. Open the Unified network management system console of the NetNumen M31 (RAN) server and check for abnormal messages, such as an abnormal process restart, as shown in Figure 17.
FIGURE 17 CHECK SERVER OPERATION STATUS
Note: If NetNumen M31 requires an NMS (Network Management System) connection, also check the CORBA notification service process.

2. Execute the # ps -ef | grep java command on the server to check the server operation status.

END OF STEPS

Result
The server processes start without error.
Backing up Data

Context
Data backup in NetNumen M31 mainly covers alarm data (notification alarms and history alarms), performance data, log data and configuration data. When the alarm data, performance data and log data reach a certain amount, they occupy too much database table space and hard disk space, which affects normal operation of the system. Backing up data serves the following purposes:
- Quick restoration for inspection and review.
- Clearing data after it has been backed up, to release storage space.

Note: The alarm data being backed up includes history alarms and alarm notifications. The performance data being backed up includes raw data, hourly data and weekly data. The log data being backed up includes the operation log, security log and system log. These data are backed up manually or automatically by a table collection task; automatic backup runs periodically. Besides defining the table collection task, the user may define a scheme for cleaning data.

Perform the following procedure to back up the data:

Steps
1. For detailed information about how to back up alarm data, performance data and log data, refer to Database Tables Backup and Recovery.
2. For configuration data backup, refer to CM Data Backup and Recovery.

END OF STEPS

Result
Backup completes without error.
- Check the system log in log management for the task execution status. For detailed operation, refer to the specific operation.
- Check the save path of the data backup for the backed-up data.
Checking Server Error Logs

Context
Check the server error logs so that system running faults are found in time. The information recorded in the error logs is used in system operation analysis. Perform the following steps to check the server error logs:

Steps
1. Server error records are saved in the server log files (installation directory/ums-svr/log).
2. If a dual system is used with a backup server, also back up the log files on the backup server.
3. Use an FTP tool to download the backed-up log files to the local client.
4. In addition to the dual-computer server, check whether there are any recent error log files in the log directory of the NM server.
5. Send the error log files (if any) to the local ZTE office immediately.

END OF STEPS

Result
Inspection Criteria: The error log files are dealt with properly.
Checking Server Disk Storage Capacity

Context
Perform this procedure to ensure that the NetNumen M31 server has enough disk space and to check the space usage of the server file systems.

Steps
1. In System Management, click Application Server > Server performance. The Server performance window appears and shows the real-time CPU, memory and hard disk usage percentages, as shown in Figure 18.

FIGURE 18 CHECK SERVER DISK STORAGE CAPACITY (1)

2. Alternatively, execute the # df -k command on the server to check disk usage.
3. If a dual-system server is used, perform the same operation on the other computer to check disk usage. Free disk space on a dual-system server must not be less than 500 MB. If free space is less than 500 MB, back up the related files and then clear the files that have been backed up.

END OF STEPS
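The df -k check can be automated by parsing the command output. The following Python sketch is illustrative only: the function name is hypothetical, "500M" is interpreted as 500 * 1024 KB, and the parser assumes the usual six-column df -k layout with the available space in the fourth column.

```python
# Illustrative sketch only: parses `df -k` output and flags file systems
# with less than 500 MB free. Names and the column layout are assumptions.

MIN_FREE_KB = 500 * 1024


def low_space_filesystems(df_output, min_free_kb=MIN_FREE_KB):
    """Return [(mount point, free KB)] for file systems below the limit."""
    low = []
    pending = []
    for line in df_output.splitlines()[1:]:  # skip the header line
        fields = pending + line.split()
        if len(fields) < 6:  # long device names wrap onto the next line
            pending = fields
            continue
        pending = []
        avail_kb = int(fields[3])
        if avail_kb < min_free_kb:
            low.append((fields[5], avail_kb))
    return low
```

Pseudo file systems that report zero available space would also be listed, so the result still needs a quick review by the operator.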
Checking Client Hard Disk Space

Context
Check the free space on the client terminal hard disks to ensure normal operation of the system. The steps for checking disk space on a Windows operating system are explained below.

Steps
1. Open My Computer, right-click each hard disk drive and select Properties from the pop-up menu. The Properties window shows the total space and free space of the hard disk.
2. Check each disk for unused files and folders to delete. The operating system disk and the application program disk should each have at least 800 MB of free space. Other disks must have free space of at least 20% of the disk space. If they do not, delete unused files.

END OF STEPS
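The free-space rules in step 2 can be written as a short check. The following Python sketch is illustrative only; disk_ok and check_drive are hypothetical helpers, "800 M" is interpreted as 800 * 1024 * 1024 bytes, and shutil.disk_usage is the standard-library call used to read a drive.

```python
# Illustrative sketch only: applies the free-space rules stated above.
# disk_ok and check_drive are hypothetical helpers.
import shutil

MB = 1024 * 1024


def disk_ok(total_bytes, free_bytes, system_disk):
    """Apply the 800 MB rule to system/application disks, 20% to others."""
    if system_disk:
        return free_bytes >= 800 * MB
    return free_bytes * 5 >= total_bytes  # free space is at least 20%


def check_drive(path, system_disk=False):
    """Check a mounted drive, for example the C: drive of a Windows client."""
    usage = shutil.disk_usage(path)
    return disk_ok(usage.total, usage.free, system_disk)
```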
Chapter 5
Monthly Routine Maintenance

Table of Contents
Monthly Routine Maintenance Items
Checking Power Supply Voltage
Cleaning Dust on Cabinet
Checking Table Space Usage
Checking Performance Statistic Result of Operation Network
Checking Dual System Operation Status
Monitoring Test
Monthly Routine Maintenance Items
The monthly routine maintenance items of the NetNumen M31 (RAN) system are shown in Table 8.

TABLE 8 MONTHLY ROUTINE MAINTENANCE CHECKLIST
Category: Environment monitoring maintenance
Test Items:
- Power supply voltage.
- Clean equipment.

Category: Running Status of Main Equipment
Test Items:
- Monitoring test, dual-computer running status.
- Check table space usage.
- Check data secondary backup.
- Check performance statistic result of operation network.
- Check operational status of the fans.
Checking Power Supply Voltage

Context
Perform this procedure to ensure that the power supply is in good condition.

Steps
1. Check the voltage and frequency values of the power supply equipment.

END OF STEPS
Result
The nominal input power is 220 V, single-phase AC. The allowed input voltage ranges from 176 V to 264 V AC, and the frequency ranges from 45 Hz to 65 Hz. The power supply range of the NetNumen M31 (RAN) system is shown in Table 9.

TABLE 9 POWER SUPPLY RANGE OF NETNUMEN M31 SYSTEM
Parameter: Specific Index
Operating power supply: 220 V, 50 Hz
Voltage range: 176 V ~ 264 V
Voltage frequency range: 45 Hz ~ 65 Hz
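The ranges in Table 9 can be checked against measured values as follows. This Python sketch is illustrative only; power_supply_ok is a hypothetical helper name, and the limits are copied from the table.

```python
# Illustrative sketch only: the limits are copied from Table 9.
# power_supply_ok is a hypothetical helper name.

VOLTAGE_RANGE_V = (176, 264)
FREQUENCY_RANGE_HZ = (45, 65)


def power_supply_ok(voltage_v, frequency_hz):
    """Return True if both readings fall inside the allowed ranges."""
    v_lo, v_hi = VOLTAGE_RANGE_V
    f_lo, f_hi = FREQUENCY_RANGE_HZ
    return v_lo <= voltage_v <= v_hi and f_lo <= frequency_hz <= f_hi
```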
Cleaning Dust on Cabinet

Short Description
Dust has a serious negative effect on computers and PCBs, because excessive dust accumulation can short-circuit electronic components. This topic describes the procedure for checking dust on the cabinet as part of monthly routine maintenance.

Prerequisites
Check that an antistatic wrist strap is available in the equipment room.

Context
To check dust on the cabinet, perform the following steps.

Steps
1. Wear an antistatic wrist strap.
2. Check for dust on the surfaces of the server and cabinet.
3. Check for dust in the gaps between PCB boards, disk array and cabinet.
4. If the cabinet, server or surrounding environment is dusty, clean them thoroughly.

END OF STEPS

Postrequisite
If dust is present on the cabinet, use a cleaner or a soft brush to remove it.

Note: Clean the equipment on a regular basis.
Checking Table Space Usage

Short Description
Check the database space to prevent insufficient database table space from jeopardizing system operation.

Context
Perform the following steps to check table space usage:

Steps
1. Log in to the NetNumen M31 (RAN) client, and in the main window of NetNumen click View > System Management.
2. The System Management window opens. Right-click the database server and select Database Login from the pop-up menu.
3. The system pops up the Database Login dialogue box. Enter the password and click the OK button, as shown in Figure 19.

FIGURE 19 DATABASE LOGIN WINDOW

4. The system prompts that the database login is successful. Click OK to confirm.
5. Select the target database server and click DatabaseServer > View Database Resource from the main menu. The system pops up the View Database Resource dialogue box, as shown in Figure 20.
FIGURE 20 VIEW DATABASE RESOURCE(2)
6. Click the Table Space Information tab to view the free space percentage. The usage percentage of any table space must not be higher than 90%. If table space is insufficient, increase the space assigned to the data file, or back up and delete unused data records.

END OF STEPS
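The 90% rule in step 6 can be applied to the free percentages read from the Table Space Information tab. The following Python sketch is illustrative only; the function name is hypothetical, and the table space names passed in are whatever the operator reads from the dialogue box.

```python
# Illustrative sketch only: flags table spaces above the 90% usage limit.
# overfull_tablespaces is a hypothetical helper name.

MAX_USAGE_PCT = 90


def overfull_tablespaces(free_pct_by_name, max_usage_pct=MAX_USAGE_PCT):
    """Return {name: usage %} for table spaces above the usage limit."""
    result = {}
    for name, free_pct in free_pct_by_name.items():
        usage_pct = 100 - free_pct
        if usage_pct > max_usage_pct:
            result[name] = usage_pct
    return result
```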
Checking Performance Statistic Result of Operation Network

Context
Check the files generated by the timer query tasks, check the relevant performance, and back up the files as required.

Steps
1. Check the expiry date of each task, and check the task execution when an error occurs. For the procedure and more details, refer to the Checking Scheduled Query Task topic.
2. Check the detailed information of the generated file, such as file name, start time and end time, as shown in Figure 21.
FIGURE 21 CHECK PERFORMANCE STATISTIC RESULT OF OPERATION NETWORK
3. The generated report file is saved in the path ..\ZXM-INOS\ums-svr\deploy\runtime\platform\csp-pm-osf\querytask. Analyze the generated performance statistic report to determine the system operation status.
4. (Optional) Create a performance statistic backup directory on the hard disk of the net manager client to back up the monthly data. If necessary, also back up to other media (such as an MO disk).
END OF STEPS
Checking Dual System Operation Status
Short Description
Users may need to restart or shut down the system for particular purposes. The method depends on the node. A backup node can be restarted or shut down directly. Before the master node is restarted or shut down, the global resource group must be switched over to the backup system so that the NetNumen application provides non-stop service. A manual shutdown or restart does not switch over the global resource group automatically.
Context
Procedure for checking dual system operation status:
Steps
1. Open the server cabinet and inspect the front panel of each server for red alarm indicators.
2. Check each server monitor for abnormal application alarms.
3. On the server, execute #ps -ef|grep java to check the server operation status.
4. Execute the #hagui & command and log in to the VCS GUI to check that login to VCS works properly.
5. On the server terminal, execute the # vxprint -htr command to check the general system status.
6. On the server terminal, execute the # vradmin -g dgnetnumen1 repstatus rvgnetnumen command to check the data replication status.
7. On the server terminal, execute the # vxdisk list command to check the hard disk status.
8. On the server terminal, execute the # vxprint -lP command to check the link status.
9. On the server terminal, execute the # vxdisk -o alldgs list command to check the disk group status.
10. On the server terminal, execute the # vxprint -l lvomc command to check the volume status.
Note: For more details about dual system operations, see NetNumenM31 (RAN) (V3.10.410) Mobile Network Element Management System High Availability Feature Description Manual.
END OF STEPS
Result
Expected result: The dual system is working normally and there are no alarms.
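The check in step 3 (#ps -ef|grep java) can be illustrated on captured command output. The sketch below is an assumption-laden example, not part of the product: the function name and the sample listing (including the class name example.Main) are hypothetical, and the text is assumed to have been saved from a terminal session.

```python
def java_processes(ps_output):
    """Return the `ps -ef` lines that belong to java processes.

    The line of the `grep java` command itself is skipped, because it
    always appears in the output of `ps -ef | grep java`.
    """
    matches = []
    for line in ps_output.splitlines():
        if "java" in line and "grep java" not in line:
            matches.append(line.strip())
    return matches


# Hypothetical listing captured from a terminal session.
sample = """\
     UID   PID  PPID   C    STIME TTY         TIME CMD
    root  1234     1   0 10:00:01 ?           5:12 /usr/java/bin/java -server example.Main
    root  2345  2001   0 10:05:00 pts/1       0:00 grep java
"""

running = java_processes(sample)
if running:
    print(f"{len(running)} java process(es) found")
else:
    print("No java process found; check the server operation status")
```

An empty result suggests that the server application is not running and that the remaining steps deserve closer attention.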
Monitoring Test
Short Description
The monitoring test is a crucial job in daily operation and maintenance. Through the monitoring test system, maintenance operators can check whether NetNumen M31 runs normally, locate and process faults in time, and ensure reliable running of NetNumen M31. The monitoring test provides instant checks of all processing procedures of NetNumen M31, which helps to prevent potential troubles.
Context
To perform the monitoring test, perform the following steps:
Steps
1. Select Help > About on the menu bar of the NetNumen M31 client. The About dialog box appears. Switch to the Information tab, as shown in Figure 22.
FIGURE 22 ABOUT INTERFACE
2. Click the Memory monitor button. The Memory Monitor dialog box appears, as shown in Figure 23. Click the Force GC button to force reclamation of memory occupied by running Java programs. GC stands for garbage collection.
FIGURE 23 MEMORY MONITOR
3. Recommendations before performing this test:
i. Because the monitoring test adds load to the NetNumen M31 system, schedule it at about 2:00 a.m. once a month to avoid traffic peaks.
ii. Analyze the result of the monitoring test. If faults or abnormalities cannot be located, save the monitoring results and contact the local ZTE office for technical support.
END OF STEPS
Result
Expected result: The monitoring results show no abnormality. If any abnormality is found, handle it as soon as possible to remove hidden troubles and faults.
Chapter 6
Quarterly Maintenance
Table of Contents
Quarterly Maintenance Items
Checking Dual Server System Switching
Checking Ground Resistance
Clearing History Alarm
Deleting Performance Database
Checking Unauthorized Access of Server
Checking Firewall
Checking LAN Equipment
Quarterly Maintenance Items
Table 10 shows the quarterly maintenance items of NetNumen M31(RAN).
TABLE 10 LIST OF QUARTERLY MAINTENANCE ITEMS
Category: Environment monitoring & maintenance
Check items:
Check grounding resistance.
Check the dual-system switch.
Check that the history alarm automatic cleaning function operates normally.
Check that the performance data automatic cleaning function operates normally.
Check client for unauthorized access.
Check server for unauthorized access.
Check the LAN hardware.
Check firewall installation and configuration.
Checking Dual Server System Switching
Short Description
Occasionally users need to shut down or reboot the system. A backup node can be shut down or rebooted directly. Before the master node is shut down or rebooted, the Global Resource Group must be switched to the backup system to keep NetNumen applications running smoothly, because a manual shutdown or reboot does not trigger an automatic switchover of the Global Resource Group. It is therefore very important to confirm that the dual system can switch normally and that operations are normal after switching. It is recommended to perform this check at around 1:00 a.m.
Context
Perform the following steps to check dual server system switching:
Steps
1. Check that the configuration files and operation files of the backup server and master server are consistent. Back up the current router configuration. As a key check point, confirm that routing to the lower-level net manager is normal: the lower-level net manager should respond to ping.
2. Shut down the main server node and switch to the backup node as explained below. For more details about dual system operations, refer to NetNumenM31 (RAN) (V3.10.410) Mobile Network Element Management System High Availability Feature Description Manual.
i. Right-click the global resource of NetNumen and select Switch to > Remote Switch from the pop-up menu, as shown in Figure 24.
FIGURE 24 CHECK DUAL-SYSTEM SWITCHING (1)
3. The Switch global group window appears. Select the target Cluster and System from the respective drop-down lists, as shown in Figure 25. Click OK.
FIGURE 25 CHECK DUAL-SYSTEM SWITCHING (2)
4. A Question window appears. Click Yes to confirm, as shown in Figure 26.
FIGURE 26 CHECK DUAL-SYSTEM SWITCHING (3)
5. On the Status tab of the Cluster Administrator window, observe the cluster status. Figure 27 shows the Global Resource Group netnumen in Cluster clsA1a going offline, and Figure 28 shows it offline.
FIGURE 27 CHECK DUAL-SYSTEM SWITCHING (4)
FIGURE 28 CHECK DUAL-SYSTEM SWITCHING (5)
6. After the netnumen resource group in Cluster clsA1a is completely offline, the netnumen resource group in Cluster clsB1b starts coming online. The interface after completion is shown in Figure 29.
FIGURE 29 CHECK DUAL-SYSTEM SWITCHING (6)
7. Observe for 30 minutes.
8. If the test result is normal, reboot the backup node EMSRANB1b and check that the server starts normally.
9. Use a similar method to switch back to the original master node EMSRANA1a.
10. Inspection Criteria:
i. Switching from the master node to the backup node can be performed properly, and services can be taken over between the dual servers properly.
ii. Routing to the lower-level net manager is correct, and the lower-level net manager can be pinged.
iii. The original client (with the link server modified to the new IP address) can log in to the server after switching.
Exception Processing
If the system cannot switch normally, promptly analyze the cause, check the log records to find the resource that cannot switch, and then inform the local ZTE office.
END OF STEPS
Checking Ground Resistance
Context
Perform this procedure to ensure that ground resistance is within the safe range.
Steps
1. Test the connection resistance of the rack.
2. Test the ground resistance in the equipment room.
3. Check the protective grounding wires.
4. Check the space between the protection ground and surrounding signals as well as the space between the power supply and the ground.
END OF STEPS
Result
The expected results are as follows:
• The connection resistance range of the rack must be 0.1 Ω ~ 0.3 Ω.
• The ground resistance in the equipment room must be less than 1 Ω.
• If conditions permit, the PGND is connected to the ground through an independent grounding post.
• Sufficient space must be reserved between the PGND and surrounding signals and between the power supply and the ground, preventing any damage to boards due to high voltage.
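The two numeric criteria above can be applied to recorded measurements as follows. This is a minimal sketch: the function name and the sample readings are hypothetical, and the measurements are assumed to be taken with a suitable meter and entered by hand.

```python
# Limits taken from the expected results above.
RACK_MIN_OHM = 0.1
RACK_MAX_OHM = 0.3
ROOM_MAX_OHM = 1.0


def check_grounding(rack_ohm, room_ohm):
    """Return a list of problems found; an empty list means both values pass."""
    problems = []
    if not (RACK_MIN_OHM <= rack_ohm <= RACK_MAX_OHM):
        problems.append(
            f"rack connection resistance {rack_ohm} ohm is outside "
            f"{RACK_MIN_OHM}-{RACK_MAX_OHM} ohm"
        )
    if not room_ohm < ROOM_MAX_OHM:
        problems.append(
            f"equipment room ground resistance {room_ohm} ohm is not below "
            f"{ROOM_MAX_OHM} ohm"
        )
    return problems


# Hypothetical readings for one quarterly inspection.
for problem in check_grounding(rack_ohm=0.5, room_ohm=0.8):
    print("Abnormal:", problem)
```

The remaining two criteria (independent grounding post, sufficient spacing) are visual checks and are not covered by this sketch.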
Clearing History Alarm
Short Description
Clear historical alarms generated in the last quarter to reduce the database load.
Steps
1. In the database Table Collection Operation window, find the table collection task to be checked, as shown in Figure 30. The window shows the task name and the execution interval.
FIGURE 30 TABLE COLLECTION OPERATION WINDOW
2. Click View > Log Management to open the Log Management window.
3. In the left pane of the Log Management window, right-click system log and select query system log from the pop-up menu.
4. The Query system log window appears.
5. In the LogName field, select database table collection task from the drop-down menu, as shown in Figure 31.
FIGURE 31 QUERY SYSTEM LOG
6. Click OK. The query result is displayed in the log list, as shown in Figure 32.
FIGURE 32 LOG RECORD
7. Read the task execution period from the figure and compare it with the table collection task period noted in step 1 to verify that they match.
8. Double-click a log entry to examine its details and check for abnormal records. If there are no abnormal records, the task was executed successfully.
END OF STEPS
Result
Inspection Criteria: The history alarm automatic cleaning task is executed normally.
Deleting Performance Database
Short Description
Regularly deleting performance data from the database may improve database operation speed.
Steps
1. In the database Table Collection Operation window, find the table collection task to be checked, as shown in Figure 33. The window shows the task name (performance management data backup - RNC NE - Raw data table collection) and the execution interval.
FIGURE 33 TABLE COLLECTION OPERATION WINDOW
2. Click View > Log Management to open the Log Management window.
3. In the left pane of the Log Management window, right-click system log and select query system log from the pop-up menu.
4. The Query system log window appears.
5. In the LogName field, select performance management data backup - RNC NE - raw data table from the drop-down menu, as shown in Figure 34.
FIGURE 34 QUERY SYSTEM LOG
6. Click OK. The query result is displayed in the log list, as shown in Figure 35.
FIGURE 35 LOG RECORD WINDOW
7. Read the task execution period from the figure and compare it with the table collection task period noted in step 1 to verify that they match.
8. Double-click a log entry to examine its details and check for abnormal records. If there are no abnormal records, the task was executed successfully.
END OF STEPS
Result
Expected result: The performance data generated in the last quarter is cleared.
Checking Unauthorized Access of Server
Short Description
Check the server for unauthorized access.
Steps
1. Execute the following command as the root user.
#prstat
2. Review the displayed information about active processes in the system and look for abnormal processes. By default, the output is sorted by CPU usage.
END OF STEPS
Result
Inspection Criteria: No abnormal processes.
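Looking for abnormal processes in step 2 amounts to comparing the process names in the prstat output against the names that are expected on the server. The sketch below illustrates this on captured text. Everything specific in it is an assumption: the ten-column layout reflects the usual default prstat display, and the function name, the list of expected names and the sample rows are hypothetical.

```python
def unexpected_processes(prstat_output, expected):
    """Return (pid, user, name) for processes whose name is not expected.

    Assumes the usual default prstat layout of ten columns:
    PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
    """
    found = []
    for line in prstat_output.splitlines():
        fields = line.split()
        # Data rows have ten columns and start with a numeric PID;
        # the header and the "Total:" summary line are skipped.
        if len(fields) != 10 or not fields[0].isdigit():
            continue
        name = fields[9].split("/")[0]
        if name not in expected:
            found.append((int(fields[0]), fields[1], name))
    return found


# Hypothetical output captured from a terminal session.
sample = """\
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  1234 root      512M  400M sleep   59    0   1:02:03 2.1% java/85
  2201 oracle    900M  700M sleep   59    0   0:40:10 1.3% oracle/1
  3456 nobody     12M    8M run     10    0   0:00:42 0.9% unknownd/4
Total: 50 processes, 200 lwps, load averages: 0.10, 0.12, 0.11
"""

# Hypothetical list of process names that are expected on this server.
for pid, user, name in unexpected_processes(sample, {"java", "oracle"}):
    print(f"Review process {name} (PID {pid}, user {user})")
```

A name that is missing from the expected list is only a prompt for manual review; it does not by itself prove unauthorized access.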
Checking Firewall
Short Description
Check firewall installation and configuration to prevent unauthorized access.
Context
The checking procedure is as follows:
Steps
1. Check whether the firewall meets the ZTE standard.
2. Check whether the firewall satisfies current network security needs. For example, all intranet addresses that pass to other networks must go through NAT conversion, and all unused ports must be closed.
END OF STEPS
Result
Inspection Criteria: The firewall is properly installed and configured.
Checking LAN Equipment
Short Description
This topic describes the procedure for checking LAN equipment during routine maintenance.
Context
To check LAN equipment, perform the following steps:
Steps
1. Check that the indicators on the switch and router are normal.
2. Check that the network settings are correct.
3. Check the network cable connections to the server and maintenance console.
4. Check whether the hard disks of the dual servers and the disk array in the cabinet show fault indications. (A flashing red light indicates a hard disk fault.)
END OF STEPS
Chapter 7
Maintenance Record Form
Table of Contents
Daily Maintenance Form
Weekly Maintenance Form
Monthly Maintenance Form
Quarterly Maintenance Form
Daily Maintenance Form
The recommended format of the daily maintenance form is shown in Table 11.
TABLE 11 DAILY MAINTENANCE RECORD FORM
Date: (MM-DD-YY)
Office Name On-duty time: Till o’clock
Off-duty person:
On-duty person:
Basic check items
1. Check that the board indicators are normal. Open the front door of the NetNumen M31 (RAN) system and check whether any red alarm indicator is lit on the boards (check each module).
[ ] Normal [ ] Abnormal
Abnormal descriptions:
2. Check that the server indicators are normal. Open the front door of the server cabinet and check whether any alarm indication is present on the server. Switch between the servers and check whether any program displays an abnormal alarm prompt box on the screen.
[ ] Normal [ ] Abnormal
Abnormal descriptions:
3. Query the module alarm information. Enter Alarm management at the maintenance terminal and check current alarms and history alarms for abnormalities.
[ ] Normal [ ] Abnormal
Abnormal descriptions:
4. Check the backup of the NetNumen M31(RAN) data configuration. The data must be backed up before any data operation.
[ ] Normal [ ] Abnormal
Abnormal descriptions:
5. Check the performance statistic data. Check whether the CPU utilization (%) of each module and the memory (MEM) utilization (%) are normal.
[ ] Normal [ ] Abnormal
Abnormal descriptions:
Equipment room environment
1. Temperature (normal: 15℃~25℃) [ ] Normal [ ] Abnormal
2. Humidity (normal: 30%~70%) [ ] Normal [ ] Abnormal
3. Dustproof conditions (good, bad): [ ] Normal [ ] Abnormal
Unresolved problems:
Checked by monitor:
Weekly Maintenance Form
The recommended format of the weekly maintenance form is shown in Table 12.
TABLE 12 WEEKLY MAINTENANCE RECORD FORM
Office name:
Date: (MM-DD-YY)
Maintenance project
1. History alarm backup: Because the system automatically clears the alarm log once the alarm log database grows to a certain size, it is recommended to back up the alarm record database periodically. The backups provide a reference for troubleshooting and for understanding how the network is running.
Maintenance result:
Maintenance personnel: ............................................... Time:
2. O & M system log backup: Back up the Operation & Maintenance log of NetNumen M31 (RAN) every week. A date, such as 20041201, can be used to name the backup file. After the backup, also clear the operation and maintenance records in the log database.
Maintenance result:
Maintenance personnel: ............................................... Time:
3. Periodical data backup:
Maintenance result:
Maintenance personnel: ............................................... Time:
4. Check whether the static data and statistical data are set correctly. The maintenance personnel should check whether the performance statistic task has expired.
Maintenance result:
Maintenance personnel: ............................................... Time:
5. Check for viruses on the background server. Make sure to select Check instead of killing the virus directly; otherwise a file may be deleted, causing a system fault and interrupting normal services.
Maintenance result:
Maintenance personnel: ............................................... Time:
6. Check the remaining space of each database on the database server, for example, the available database space and the available log space.
Maintenance result:
Maintenance personnel: ............................................... Time:
7. Adjust system time: Calibrate the system time on the Net Manager server against the clock source.
Maintenance result:
Maintenance personnel: ............................................... Time:
Problems and solutions: Unresolved problems:
Monthly Maintenance Form
The recommended format of the monthly maintenance form is shown in Table 13.
TABLE 13 MONTHLY MAINTENANCE RECORD FORM
Office name:
Date: (MM-DD-YY)
Maintenance project
1. Check power supply voltage: Check the power supply voltage every month. The nominal input is 220 V single-phase AC, with an allowed input voltage range of 176 V ~ 264 V AC and a frequency range of 45 Hz ~ 65 Hz.
Maintenance result:
Maintenance personnel: ............................................... Time:
2. Clean the PC surface.
Maintenance result:
Maintenance personnel: ............................................... Time:
3. Performance statistic: A statistic report is generated every month. The maintenance personnel should back up the statistic data every month.
Maintenance result:
Maintenance personnel: ............................................... Time:
4. Check dual-computer switchover. The maintenance personnel should check dual-system switchover every month.
Maintenance result:
Maintenance personnel: ............................................... Time:
Problems and solutions:
Unresolved problems:
Quarterly Maintenance Form
The recommended format of the quarterly maintenance form is shown in Table 14.
TABLE 14 QUARTERLY MAINTENANCE RECORD FORM
Office name:
Date: (MM-DD-YY)
Maintenance project
1. Ground resistance check: Check the grounding resistance on a quarterly basis. The equipment room grounding resistance should be within 1 Ω. There should be sufficient spacing between the protective ground and surrounding signals, and between the power source and the ground.
Maintenance result:
Maintenance personnel: ............................................... Time:
2. History alarms handling.
Maintenance result:
Maintenance personnel: ............................................... Time:
3. Clear the performance statistics library every three months.
Maintenance result:
Maintenance personnel: ............................................... Time:
4. Test each basic function of the system.
Maintenance result:
Maintenance personnel: ............................................... Time:
Problems and solutions:
Unresolved problems:
Chapter 8
Common Alarms and Faults Handling
Table of Contents
Overview
Equipment Alarms
Common Faults Handling
Overview
There are four recommended alarm levels. The four levels, which are displayed in the Severity column of the alarm template, are listed in descending order of severity: Critical, Major, Minor and Warning.
• Critical: A critical alarm affects all services and all NE resources.
• Major: A major alarm affects most services and resources.
• Minor: A minor alarm hinders the normal operation of some services.
• Warning: A warning alarm affects a certain service or resource.
There is no level classification for notification.
Caution: Because ZTE products are updated frequently and technologies develop quickly, faults in the actual version may differ from those described in this manual. If you find a fault that is not listed in this manual, contact the local ZTE office. This chapter also describes troubleshooting for common faults of the NetNumen M31 (RAN) system.
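The descending order of the four alarm levels can be used to decide which alarms to handle first. The following sketch is illustrative only: the class and function names are hypothetical, and the severities attached to the sample alarms are invented for the example rather than taken from the product.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Alarm levels in descending order of severity (1 is most severe)."""
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4


def most_severe_first(alarms):
    """Sort (alarm name, level name) pairs so the most severe come first."""
    return sorted(alarms, key=lambda alarm: Severity[alarm[1].upper()])


# Hypothetical alarm list; the levels shown are assumptions for illustration.
sample = [
    ("DB space insufficient", "Minor"),
    ("AMO link broken", "Critical"),
    ("CPU occupancy above threshold", "Warning"),
    ("Log space insufficient", "Major"),
]

for name, level in most_severe_first(sample):
    print(f"{level:<8} {name}")
```

Notifications are not part of this ordering, because they carry no level classification.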
Equipment Alarms
Communication Alarms with Lower-level NM
One communication alarm is described: Access Managed Object (AMO) link broken.
Symptom
The broken AMO link interrupts communication.
Related Part
The NM server, the lower-level NM interface machine and the connecting links are involved.
Cause
The network cable may be disconnected; the SWITCH, the interface machine or the network adapter of the NM server may have failed; or the interface machine may be shutting down or starting.
Handling
Check whether the interface machine is running properly, whether the network cable and the SWITCH work normally, and whether the network adapter of the interface machine is normal.
Verification
Communication links work normally.
DB Space Insufficient
Symptom
Database space for storing NetNumen M31 alarm and performance statistical data is insufficient.
Related Part
Database server.
Cause
Causes of insufficient database space are as follows:
• Data has not been backed up for a long time, and previous backup files have not been deleted.
• The configured disk capacity is too low because the traffic load was underestimated.
Handling
To handle the problem, perform the following steps:
1. Check whether periodic data backup is set and whether backed-up files are deleted. To free disk space immediately, back up the files and delete them manually.
2. Contact the local ZTE office to upgrade the disk to a higher capacity.
Verification
The alarms disappear.
Precautions
When the alarm occurs, the system can still run, but system performance is affected. It is recommended to handle the problem in a timely manner. The alarms disappear once disk space is freed.
CPU Occupancy of Application Server above Threshold
Symptom
CPU occupancy of the application server exceeds the threshold.
Related Part
NetNumen M31 application server.
Cause
Causes of the fault are as follows:
• An alarm burst may be in progress, so the volume of data to be processed is very large.
• Performance data collection coincides with alarm peaks, pushing the server's processing load over the threshold.
• Traffic was underestimated, so the CPU configuration is insufficient.
Handling
To handle the problem, perform the following steps:
1. Check whether the performance data collection schedule is reasonable and feasible. If not, adjust the collection time to avoid alarm peaks.
2. Check whether an alarm burst is in progress.
3. Contact the local ZTE office to expand or upgrade the system configuration.
Verification
The alarm disappears.
Precautions
When the alarm occurs, the system can still run, but system performance is affected.
Memory Occupancy of Application Server over Threshold
Symptom
Memory occupancy of the application server exceeds the preset threshold.
Related Part
Application server.
Cause
Causes of the fault are as follows:
• Processing too many services may exhaust the application server's memory.
• The JVM setting of the server is too low.
• The configured memory is too small because the traffic load was underestimated.
Handling
To handle the problem, perform the following steps:
1. Check the alarm and performance data processed by the system.
2. Modify the JVM configuration.
3. Contact the local ZTE office to upgrade the application server's memory.
Verification
The alarm disappears.
Precautions
When the alarm occurs, the system can still run, but system performance is affected.
Space for Application Server Logs Insufficient
Symptom
Space for application server logs is insufficient.
Related Part
Application server.
Cause
Causes of the fault are as follows:
• Logs have not been backed up and cleaned for a long time.
• The log space is set too small.
Handling
To handle the problem, perform the following steps:
1. Back up logs regularly and promptly, and remember to delete the log files afterwards.
2. Enlarge the log space.
3. Enable the auto-deletion mechanism so that logs are automatically deleted or discarded once they exceed the preset retention time.
Verification
The alarm disappears.
Precautions
It is recommended to back up logs and then delete them, because the auto-deletion mechanism may delete critical logs that cannot be retrieved afterwards. With auto-deletion, only logs within the preset retention period are kept.
Catalog Capacity over Threshold
Symptom
The space used by the monitored catalog (directory) exceeds the preset capacity.
Related Part
Application server.
Cause
Causes of the fault are as follows:
• The space used by the monitored catalog grows quickly.
• The space preset for the monitored catalog is too small.
Handling
To handle the problem, perform the following steps:
1. Reset the capacity of the monitored catalog and increase the threshold.
2. Back up the monitored catalog and delete the backed-up files.
Verification
The alarm disappears.
Precautions
When the alarm occurs, the system can still run, but system performance is affected. It is recommended to handle the problem promptly.
Common Faults Handling
AMO Startup Error
Symptom
After AMO installation, the system reports an error during startup, so the lower-level network cannot be managed.
Cause
“AMO cannot be started up” error.
Related Part
AMO management in Topology Management.
Fault Analysis and Location
The property information of the AMO is inconsistent with that provided by the interface machine.
Troubleshooting
To handle the fault, perform the following steps:
1. Obtain the information about the lower-level NM interface, such as the IP address, configuration port number, alarm port number, performance port number, and the FTP user name and password used to connect to the interface machine.
2. Compare it with the property information of the current AMO and make sure they are consistent.
3. Restart the AMO.
Alarm Forwarding Failure
Symptom
Alarms cannot be forwarded to the corresponding mailbox or mobile phone for processing.
Cause
The alarm forwarding service is set incorrectly.
Related Part
Fault management.
Fault Analysis and Location
Analyze and locate the fault by performing the following procedure:
• Mail forwarding
Set the following file: ums-svr\platform\psl\uep-psl-forwardinfo.par\mailserver.properties, where “mailServerAddress=” is used to configure the SMTP server.
i. If forwarding fails, check whether the IP address, user name and password of the SMTP server are configured correctly.
ii. If they are correct and forwarding still fails, install winmail on a new server to act as a mail server, and test again.
• SMS forwarding
Set the following file: serialgsmodem.properties, where “serials=” refers to the serial port used for forwarding.
Hard Disk Space Insufficiency
Symptom
Hard disk space is insufficient.
Cause
“Hard disk space insufficient” alarm.
Related Parts
Related parts of the fault are as follows:
• Fault Management
• Performance Management
Fault Analysis and Location
Analyze and locate the fault by checking for the following conditions:
• Alarm files have not been backed up, and backup files have not been deleted, for a long time.
• The volume of alarm and performance data was underestimated, so the configured hard disk is too small.
Troubleshooting Method
To handle the fault, perform the following steps:
• Check whether periodic backup of alarm files is set and whether the backup files are deleted. To free disk space immediately, back up and delete the alarm files manually.
• Contact the local ZTE office to upgrade the disk capacity.
Client Fails to Connect with Server
Symptom
The client fails to connect with the server.
Possible Cause
Failure of the client to log in to the server may be caused by the following:
• The client is not connected to the server.
• The client version is inconsistent with the server version.
• The language setting of the client is inconsistent with that of the server.
• The hard disk is out of free space.
Troubleshooting Method
1. Check the network connection between the client and the server.
2. Check that the client and server versions are consistent.
3. Check that the language configuration in deploy-150minos.properties and deploy-010muep.properties on the client is consistent with that on the server.
Chinese version: ums.locale=zh_CN
English version: ums.locale=en_US
4. Check the hard disk free space and delete files that are not in use.
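The language consistency check in step 3 compares the ums.locale value on the client with the value on the server. The sketch below shows one way to compare the contents of two properties files. It is a simplified illustration: the function names and the sample file contents (including the second key) are hypothetical, and the parser handles only plain key=value lines, not the full properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def locale_matches(client_text, server_text, key="ums.locale"):
    """Return True only if both sides define the key with the same value."""
    client = parse_properties(client_text).get(key)
    server = parse_properties(server_text).get(key)
    return client is not None and client == server


# Hypothetical file contents; only the ums.locale key comes from this guide.
client_file = "# client settings\nums.locale=zh_CN\n"
server_file = "ums.locale=en_US\nexample.other.key=1\n"

if not locale_matches(client_file, server_file):
    print("Language settings of client and server are inconsistent")
```

In practice the text would be read from the deploy-150minos.properties and deploy-010muep.properties files on each side, and both files would be checked in the same way.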
Server Unable to Start
Symptom
The NetNumen M31(RAN) server is unable to start.
Possible Cause
Failure of NetNumen M31(RAN) to start may be caused by the following:
• The Oracle database application is not started correctly.
• The Java virtual machine is not started correctly.
• The server configuration file is not correct.
• The server starting file does not have execution permission.
Troubleshooting Method
1. Run the sqlplus sys/password@SID as sysdba command at the command prompt to check the connection with the Oracle database. If the connection fails, check the database instance installation or the database listener to determine whether the database is operating normally.
2. Check that all files in the jdk-solaris directory have been completely decompressed.
3. Check the settings in the server configuration files. Check the server IP settings in deploy-150minos.properties and deploy-010muep.properties.
4. Grant execution permission to the server starting file. Execute the following command as the root user.
# chmod -R 755 <installation directory>/ZXM-INOS/ums-svr/bin
Database Unable to Start
Symptom
The database is unable to start.
Possible Cause
Failure of the database to start might be caused by the following:
• The database is already in operation.
• The database is not properly configured, which leaves insufficient resources for the database to start.
• A system data file required in the starting process is corrupted.
• A user data file was corrupted while the database was starting.
• The hard disk has no free space.
Troubleshooting (Oracle Database)
1. Run sqlplus sys/password@SID as sysdba at the command prompt and check the connection with Oracle.
2. If the database does not start properly, the result is shown on the screen.
If the system prompts that the database is in operation, stop the current database before starting a new one.
If the system prompts that a system data file is corrupted, reinstall the database instance.
If the system prompts to set a system parameter, modify the parameter in $ORACLE_HOME/dbs/initSID.ora. Usually, the shared pool setting is the one to modify. Save the modification and restart the database.
3. Run the df -k command to check the disk free space. If the disk has no free space, delete useless files or move them to other locations. For example, a large file on the data file disk could be moved to another hard disk.
Confidential and Proprietary Information of ZTE CORPORATION
71
NetNumen M31(RAN) Maintenance Guide
4. If database starts normally, but the database can not be connected, please check database listener for starting of the process. Run lsnrctl start to start database listener process.
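The disk space check in step 3 can be automated by parsing the df -k output. The sketch below assumes the usual six-column layout (filesystem, kbytes, used, avail, capacity, mounted on). The sample output and the 1 GB threshold are illustrative assumptions, not values from this guide.

```python
def low_space_filesystems(df_output, min_free_kb):
    """Return (mount point, available KB) pairs below min_free_kb.

    Expects `df -k` style columns:
    filesystem, kbytes, used, avail, capacity, mounted on.
    """
    low = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue  # wrapped or malformed line
        avail_kb = int(fields[3])
        if avail_kb < min_free_kb:
            low.append((fields[5], avail_kb))
    return low


# Made-up sample output for demonstration.
SAMPLE = """\
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0   20174761 9511040 10461974    48%    /
/dev/dsk/c0t0d0s7   51650439 51598000    52439   100%    /export/home
"""
print(low_space_filesystems(SAMPLE, min_free_kb=1024 * 1024))
```

Lines with fewer than six fields are skipped, so long device names that wrap onto a second line would need extra handling.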
Data Table Space Full

Symptom
The database table space is full. For example, in performance management all tables are empty and no data can be obtained.

Possible Cause
The performance table space or performance index table space of the NetNumen M31(RAN) server is too small, so the volume of reported performance data exceeds the maximum capacity of the current table space.

Troubleshooting Method
1. Log in to the NetNumen M31 client and click View > System Management on the standard menu bar.
2. The System Management window pops up. Select the particular database in the left pane and click DatabaseServer > Database Login on the main menu bar.
3. The Database Login box pops up, as shown in Figure 36. Enter the password (usually “oracle”) and click OK.

FIGURE 36 DATABASE LOGIN WINDOW

4. When the system reports that the database login is successful, click OK.
5. Select the target database server, then click DatabaseServer > Table Collection Operation. The Table Collection Operation dialog box pops up, as shown in Figure 37.

FIGURE 37 TABLE COLLECTION OPERATION WINDOW

6. Select a scheme in the left pane of Table Collection Operation, such as Wrnc Table Collection of Performance Management - Raw Data. The scheme is also shown in the right pane. Double-click it to see its details, as shown above.
7. Right-click the scheme and click Manually Execute on the shortcut menu.
8. Manually delete performance data in the Table Collection Operation, and adjust the automatic clearing settings. The recommended retention period is 7 days.
Alarm Box Unable to Display Audio-Visual Prompt

Symptom
The alarm box is unable to display audio-visual prompts.

Possible Cause
Alarm box connection failure.

Troubleshooting Method
1. First check the physical network connection of the alarm box with the ping command. This reveals network cable problems or IP address conflicts.
2. Change the port number of the alarm box to avoid conflicts with ports used for other purposes on the NetNumen M31 server.
3. Check the alarm box settings on the NetNumen M31 server to ensure that they are consistent with the settings on the alarm box. If the alarm box is not configured on the NetNumen M31 server, the connection indicator shows the disconnected status.
4. If the alarm box and the server are physically connected, and the alarm box is properly configured and activated on the NetNumen M31 server, the alarm box communication status window on NetNumen M31 shows normal, and the connection indicator on the alarm box shows the connected status.
5. If the alarm box shows the connected status but the NetNumen M31 interface shows that the alarm box communication status is not normal, check whether other NetNumen M31 servers are configured for this alarm box and whether their communication status is normal.
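A port conflict of the kind described in step 2 can be probed by trying to bind a listener to the candidate port. This is a rough sketch using only the standard socket module. The function name port_is_free is hypothetical, and the demonstration uses an ephemeral loopback port rather than any real alarm box port.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP listener can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


# Demonstration: occupy an ephemeral port, then probe it.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as holder:
    holder.bind(("127.0.0.1", 0))  # the OS picks a free port
    holder.listen(1)
    busy_port = holder.getsockname()[1]
    busy_result = port_is_free(busy_port)
print(busy_result)
```

The probe only reflects the moment it runs, so another process may still claim the port afterwards.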
Performance Report Problem Caused by Incorrect Time Zone Settings

Symptom
The date and time in performance reports are inconsistent with the current time, and performance data is displayed in two lines.

Fault Analysis
The time zone setting on the client computer is inconsistent with that of the NetNumen(TM) M31 server. For example, the server time zone is (GMT+05:30) and the client time zone is (GMT+08:00). The system adjusts the time in performance reports according to the time zone settings. As a result, the report time is inconsistent with the current time, and performance data is displayed in two lines.

Troubleshooting
Modify the time zone setting so that the client and the server use the same time zone. After the time zone modification, performance data is displayed normally.
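The size of the shift in the example above can be worked out with a short calculation. The sketch uses the two offsets named in the fault analysis. The sample timestamp is an arbitrary illustrative value.

```python
from datetime import datetime, timedelta, timezone

SERVER_TZ = timezone(timedelta(hours=5, minutes=30))  # GMT+05:30
CLIENT_TZ = timezone(timedelta(hours=8))              # GMT+08:00

# One collection instant, stamped in server local time (illustrative value).
sample = datetime(2009, 7, 31, 8, 0, tzinfo=SERVER_TZ)

# The same instant as a client in GMT+08:00 would render it.
as_seen_on_client = sample.astimezone(CLIENT_TZ)
shift = as_seen_on_client.utcoffset() - sample.utcoffset()

print(sample.strftime("%H:%M"), "on the server")
print(as_seen_on_client.strftime("%H:%M"), "on the client")
print(shift)
```

With these offsets, a record stamped 08:00 on the server appears as 10:30 on the client, a shift of two and a half hours.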
Chaotic Performance Alarm Data Caused by Synchronization Error between Upper and Lower Level

Principle of Synchronization
NTP clock synchronization periodically retrieves clock data from the clock source and adjusts the local time. The lower level net manager sets the higher level net manager server as its clock source.

Symptom
A performance data delay alarm appears. The alarm recovery time is earlier than the alarm occurrence time.

Possible Cause
• Lower level time is later than higher level time.
• Lower level time being later than higher level time caused by

Troubleshooting Method
1. Check the clock service status on the higher level net manager. Modify the deploy-default.properties file in the /ums-svr/deploy directory under the installation directory on the NetNumen M31(RAN) server to set the NTP service port number, as shown in Figure 38. The recommended setting is 21124.

FIGURE 38 DEPLOY DEFAULT PROPERTIES - I

2. Check that the lower level net manager is configured to use the higher level net manager as its clock source. Modify the deploy-default.properties file in the /ums-svr/deploy directory under the OMM client installation directory, as shown in Figure 39.

FIGURE 39 DEPLOY DEFAULT PROPERTIES - II
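The symptom described above, a recovery time earlier than the occurrence time, is easy to detect in exported alarm records. The following sketch assumes each record is a dictionary with hypothetical fields id, raised, and recovered. The real alarm export format is not described in this guide.

```python
from datetime import datetime


def inverted_alarms(alarms):
    """Return IDs of alarms whose recovery time precedes their raise time."""
    return [
        alarm["id"]
        for alarm in alarms
        if alarm["recovered"] is not None and alarm["recovered"] < alarm["raised"]
    ]


# Hypothetical records: field names and values are illustrative only.
records = [
    {"id": 1, "raised": datetime(2009, 7, 31, 10, 5), "recovered": datetime(2009, 7, 31, 10, 20)},
    {"id": 2, "raised": datetime(2009, 7, 31, 10, 5), "recovered": datetime(2009, 7, 31, 10, 2)},
    {"id": 3, "raised": datetime(2009, 7, 31, 11, 0), "recovered": None},
]
print(inverted_alarms(records))
```

Any ID returned here points to a pair of timestamps taken from unsynchronized clocks, which is a cue to check the NTP settings in steps 1 and 2.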
Error When Reporting Lower Level Net Manager Configuration Data

Symptom
NetNumen M31 has no configuration data. As a result, the OMM NE measurement object is null when a performance task is created.

Possible Cause
The error might be caused by either of the following:
1. The communication link between the OMM and the NetNumen M31 server is broken.
2. After a restart, the NetNumen M31 server is still synchronizing with the subordinate net manager, or a manual synchronization is in progress.

Troubleshooting Method
1. Check the network link between the OMM and the NetNumen server.
2. Wait a moment for the synchronization operation to complete. Reporting then continues.
Error in Reporting Lower Level Net Manager Performance Data

Symptom
NetNumen M31 alarm information cannot be reported.

Possible Cause
The failure might be caused by either of the following:
• The communication link between the OMM and the NetNumen M31 server is broken.
• After a restart, the NetNumen M31 server is synchronizing with the lower level, or is performing alarm synchronization.

Troubleshooting Method
1. Check the network connection between the OMM and the server.
2. Wait a moment. When the synchronization operation completes, alarm reporting continues.
Northbound Interface Link Interruption

Symptom
The NMS cannot receive any notification.

Cause
CORBA notification service or network.

Related Part
Northbound alarm, northbound performance, northbound configuration, and northbound file transmission interfaces.

Fault Analysis and Location
1. The network may be interrupted. Use the ping command to check network communication.
2. The port may be occupied. Use the telnet or netstat command to check port occupation.
3. CORBA may not have started properly. Execute the ps -ef | grep corba command to check the CORBA process.
4. CORBA may have started later than NetNumen. In this case the CORBA process exists, but no northbound operation is supported.
5. The NMS may have a failure. If causes 1 to 4 are ruled out, search the log files for the NMS object reference, which usually has a format similar to “IOR:3546576587…”. Examine whether this object is in the “not active” status in the log file.

Troubleshooting Method
1. Check the physical equipment (network cable, interface), the network protocol, and so on in turn, to ensure that network communication is free of errors.
2. Release the related port.
3. Start the system, making sure that the CORBA process is started before the NetNumen M31 process.
4. Restart the NMS or let the NMS perform a self-check.
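The process check in the fault analysis (ps -ef | grep corba) can be reproduced in a script that filters captured ps output. This is a minimal sketch. The sample output, process name, and PIDs are made up, and unlike plain grep the match here is case-insensitive.

```python
def corba_processes(ps_output):
    """Return `ps -ef` lines mentioning corba, excluding the grep itself."""
    matches = []
    for line in ps_output.splitlines():
        lowered = line.lower()
        if "corba" in lowered and "grep" not in lowered:
            matches.append(line.strip())
    return matches


# Illustrative `ps -ef` output; process names and PIDs are made up.
SAMPLE = """\
     UID   PID  PPID  C    STIME TTY      TIME CMD
    root  1201     1  0 09:12:44 ?        0:07 /opt/example/bin/corba_notify
    root  1530  1498  0 09:30:02 pts/2    0:00 grep corba
"""
found = corba_processes(SAMPLE)
print(len(found))
```

An empty result suggests that the CORBA process is not running. A non-empty result still leaves cause 4 open, since the start order cannot be read from the process list alone.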
Failure in Reporting Configuration Data to NMS

Symptom
• No file is generated after a batch upload of configuration objects.
• The batch upload of configuration data takes a very long time, and the session remains in the running status for a very long time without ending.
• The batch upload of configuration data finishes, but the session displays an operation failure.

Cause
1. Read-write permission of the target FTP.
2. Scale of the configuration data itself.
3. The configuration data has a problem that causes an error in data conversion.

Related Part
FTP server, northbound configuration module.

Fault Analysis, Location and Handling
1. If the target FTP is NetNumen M31, set FtpServer.user.nmsftpuser.write to true in the file ums-svr\tools\ftpserver\conf\naf-app-ftpuser.properties. If the target FTP is not NetNumen M31, provide an FTP with write permission.
2. Check that the NetNumen M31 directory ums-svr\tmp\ftp\naf\undercm has sufficient storage space, because batch configuration data generates temporary files in this location. If this directory holds a large number of temporary files, clear them to release disk space.
3. Divide the upload of configuration data into multiple parts, for instance by uploading the configuration data of only one subnet per operation. Record the ID of each subnet whose upload fails, to identify the subnet with the data conversion error.
4. Check the configuration data on the OMM, for example whether the ID field of an MO object is filled with Chinese characters.
5. If the upload of a single subnet also takes a long time without returning (for instance over 2 hours), retrieve the log files from the ums-svr\log directory, pack them, and send them to development personnel for analysis.
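The data check in step 4 can be approximated by scanning identifiers for characters outside the ASCII range. This is a sketch under the assumption that the MO IDs are available as plain strings. The function name non_ascii_ids and the sample identifiers are hypothetical.

```python
def non_ascii_ids(managed_objects):
    """Return the IDs that contain characters outside the ASCII range."""
    return [mo_id for mo_id in managed_objects if not mo_id.isascii()]


# Hypothetical MO identifiers; the second contains Chinese characters.
sample_ids = ["RNC-101", "基站-7", "NodeB_42"]
print(non_ascii_ids(sample_ids))
```

The check flags any non-ASCII character, not only Chinese ones, which is usually what a conversion problem of this kind calls for.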
Error in Reporting Alarm Information to NMS

Symptom
The NMS cannot receive any alarm information.

Cause
CORBA notification service, NetNumen M31 alarm management system.

Related Part
Northbound alarm, public, NetNumen M31 alarm.

Fault Analysis and Location
• Check whether the northbound link is working normally.
• Check whether the EMS reports alarms normally and whether filtration rules have been set up.
• CORBA may not have started properly. Execute the ps -ef | grep corba command to check the CORBA process.

Troubleshooting Method
1. Refer to Northbound Interface Link Interruption to ensure that the northbound link is in normal operation.
2. Clear the filtration rules to ensure normal alarm reporting.
3. Reboot the system, making sure that the CORBA process is started before the NetNumen M31 process.
Failure in Reporting Performance Data to NMS

Symptom
Performance data fails to be reported to the NMS.

Cause
• The NMS has not created the related performance task, or the performance task has been deleted or suspended.
• The lower level has not reported the related performance data. There is no data, so nothing is reported.
• The corresponding data storage space is full (divided into the Oracle performance table space and the XML file storage space).

Related Part
Performance Management

Fault Analysis and Location
1. Check whether the related task created by the northbound interface exists in the performance management interface.
2. In the data check interface of performance management, check whether the data of the related task has been reported to NetNumen M31.
3. Check the Oracle table space and the disk space.

Troubleshooting Method
1. Create the related NMS performance task or activate it.
2. Check the OMM for the related task, and determine whether it can report to NetNumen M31 in time. If it cannot, refer to OMM fault handling to solve the problem. If it can, back up the log and send it to development personnel for analysis.
3. There are three methods to solve insufficient Oracle table space and disk space:
   • Increase the table space.
   • Increase the disk storage capacity.
   • Back up and delete old data and reduce the file preservation time. See Database Tables Backup and Recovery.
Chapter 9
Data Backup and Recovery

Table of Contents
Log Files Backup
CM Data Backup and Recovery
Database Tables Backup and Recovery
All Users Database Backup and Recovery
Fault Management Database Backup
Log Files Backup

Logs come in different forms. In NetNumen, a log file is composed entirely of lines of text. NetNumen creates log files that record actions such as system startup, user operations, and fault operations. The current log file is moved to the Zip folder automatically after a specific duration has passed or a file size has been reached.

The downside of programs that provide useful or verbose logging output is the amount of disk space this output can consume. This is a concern for all operating systems. To save disk space in the NetNumen system, the user can back up the log files and then delete all log files except the current log file. Backup log files are useful after a system crash and for troubleshooting. The user can back up log files periodically.
NetNumen Log Files Backup in Windows OS

Context
Perform the following steps to back up log files from the server and the client:

Steps
1. Back up log files from the NetNumen server:
   i. Go to the log directory under the directory where the NetNumen system is installed, as shown in Figure 40.

   FIGURE 40 NAVIGATING TO LOG FOLDER

   ii. Select all files except the current log files and zip them. Back up the zip file.
   iii. After the log files are backed up, delete all files, including the zip file, from the log directory, except the current log files, as shown in Figure 41.

   FIGURE 41 AFTER LOG FILES DELETION FROM LOG DIRECTORY

2. Recover NetNumen server log files:
   • When the system crashes, the user can restore the log files from the backup disk.
   • During system troubleshooting, technical personnel can refer to the log files on the backup disk.
3. The procedure for log file backup and recovery on the NetNumen client is the same as on the NetNumen server.

END OF STEPS
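The backup steps above (zip everything except the current log files, then delete the archived files) could be scripted roughly as follows. This is a sketch only: the guide does not say how the current log files are named, so the caller passes their names explicitly. The function name archive_old_logs and all file names are hypothetical. Unlike the manual procedure, the archive is written directly to a separate backup location.

```python
import os
import tempfile
import zipfile


def archive_old_logs(log_dir, archive_path, current_logs):
    """Zip every file in log_dir except current_logs, then delete the originals.

    Returns the sorted names that were archived.
    """
    names = sorted(
        name for name in os.listdir(log_dir)
        if os.path.isfile(os.path.join(log_dir, name)) and name not in current_logs
    )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            archive.write(os.path.join(log_dir, name), arcname=name)
    # Delete only after the archive has been written and closed.
    for name in names:
        os.remove(os.path.join(log_dir, name))
    return names


# Demonstration with made-up file names in temporary directories.
with tempfile.TemporaryDirectory() as log_dir, tempfile.TemporaryDirectory() as backup_dir:
    for name in ("server.log", "server-old-1.log", "server-old-2.log"):
        with open(os.path.join(log_dir, name), "w") as handle:
            handle.write("log line\n")
    archived = archive_old_logs(
        log_dir, os.path.join(backup_dir, "logs.zip"), current_logs={"server.log"}
    )
    remaining = os.listdir(log_dir)
print(archived, remaining)
```

Deleting only after the archive is closed keeps the original files in place if the zip step fails.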
NetNumen Log Files Backup in Solaris OS

Context
Perform the following steps to back up log files from the server and the client:

Steps
1. Log in to the Solaris system as administrator. Change to the /ZXM-INO/us-ser/ directory using the cd command, as shown in Figure 42.

FIGURE 42 CHANGING THE DIRECTORY IN SOLARIS SYSTEM

2. Archive the files in the log directory into a tar file called log.tar, as shown in Figure 43.

FIGURE 43 ARCHIVING FILES IN TAR FORMAT

3. After the tar command completes, the log.tar file can be seen in the ums-svr directory, as shown in Figure 44.

FIGURE 44 LOG.TAR FILE IN UMS-SVR DIRECTORY

4. Compress the log.tar file, as shown in Figure 45.

FIGURE 45 ZIP LOG.TAR FILE

5. Back up the log.tar.gz file. After the backup, delete all files, including the log.tar.gz file, except the current log files.

END OF STEPS
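The archive and compress steps above produce a log.tar.gz file. For reference, the same result can be obtained in a single step with the Python tarfile module. This is a sketch rather than the procedure shown in the figures, and the function name make_log_tarball and the demonstration file names are hypothetical.

```python
import os
import tarfile
import tempfile


def make_log_tarball(log_dir, tarball_path):
    """Write log_dir into a gzip-compressed tar archive and list its members."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        tar.add(log_dir, arcname="log")
    with tarfile.open(tarball_path, "r:gz") as tar:
        return sorted(tar.getnames())


# Demonstration in a temporary working directory.
with tempfile.TemporaryDirectory() as work_dir:
    log_dir = os.path.join(work_dir, "log")
    os.mkdir(log_dir)
    with open(os.path.join(log_dir, "server.log"), "w") as handle:
        handle.write("log line\n")
    members = make_log_tarball(log_dir, os.path.join(work_dir, "log.tar.gz"))
print(members)
```

Reading the archive back and listing its members is a cheap way to confirm that the backup is readable before any log files are deleted.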
CM Data Backup and Recovery

Configuration management data backup is used to recover from a system crash. In the case of a single site failure, the user can recover the configuration management data from the backup disk and bring the system back into service in a short time. In the case of an entire OMM failure, the user can recover the entire OMM configuration management data on a different OMM and divert all operations to that OMM, to avoid a long interruption of system operations.
NetNumen CM Data Backup and Recovery

Context
This procedure backs up NetNumen Configuration Management (CM) data from the client and recovers it through the client. Perform the following steps to back up CM data from the client and to restore CM data to a particular node or OMM through the client.

Steps
1. Back up CM data from the NetNumen client.
   i. In the NetNumen Topology Management client, click Operation > NE Management > Configuration Management, as shown in Figure 46.

   FIGURE 46 NAVIGATING TO CONFIGURATION MANAGEMENT

   ii. In Configuration Management, click Management > Data Management > Data Backup, as shown in Figure 47.

   FIGURE 47 NAVIGATING TO DATA BACKUP WINDOW

   iii. Select the destination path for CM data storage. To choose the storage path, click the Select button in the Data Backup window, as shown in Figure 48. The Open window appears. Select the CM folder and click the Open button, as shown in Figure 49. Click the Open button again, as shown in Figure 50.

   FIGURE 48 DATA BACKUP WINDOW

   FIGURE 49 SELECTING CM DIRECTORY

   FIGURE 50 SELECTING PATH FOR CM DATA STORAGE

   iv. Enter the file name in Filename prefix and a remark in Backup remarks. Select the particular managed element or the entire OMM to back up, as shown in Figure 51, and click OK.

   FIGURE 51 SELECTING NODES FOR CM DATA BACKUP

   v. After a successful backup, the system shows the result in the Data Backup Result window, as shown in Figure 52. Click Close.

   FIGURE 52 DATA BACKUP RESULT WINDOW

   vi. The backup data file is stored in the ums-svr/backup/sysmanager/cm path, as shown in Figure 53.

   FIGURE 53 BACKUP FILE STORAGE LOCATION
2. Recover NetNumen CM data through the client:
   i. In the Topology Management main window, click Operation > NE Management > Configuration Management, as shown in Figure 54.

   FIGURE 54 NAVIGATING TO CONFIGURATION MANAGEMENT

   ii. In the Configuration Management window, click Management > Data Management > Data Recover.

   FIGURE 55 NAVIGATING TO DATA RECOVER

   iii. The Data Recovery window appears, as shown in Figure 56. Click the Select button to select the file from its path. The Open window appears, as shown in Figure 57. Select the file from the CM folder and click the Open button, as shown in Figure 58.

   FIGURE 56 DATA RECOVERY WINDOW

   FIGURE 57 OPEN WINDOW

   FIGURE 58 SELECTING CM DATA FILE FROM CM FOLDER

   iv. Select the node to be recovered and click OK, as shown in Figure 59.

   FIGURE 59 DATA RECOVERY WINDOW

   v. After a successful recovery, the system displays the result in the Data Recover Result window, as shown in Figure 60. To activate the particular site, click the Set activate configset… button. To end the operation, click Close.

   FIGURE 60 DATA RECOVERY RESULT WINDOW

END OF STEPS
Database Tables Backup and Recovery

Short Description
The NetNumen system automatically collects specified performance parameters, all alarms generated by the system, user log information, and security management information in the form of tables called database tables. If a particular NetNumen system breaks down, the user can quickly divert all operations of the faulty NetNumen system to another live NetNumen system by restoring the performance management, fault management, log management, and security management database tables. This reduces the recovery time of the system after a breakdown.

Context
Perform the following steps to back up and recover database tables:

Steps
1. From the main menu, click View > System Management, as shown in Figure 61.

FIGURE 61 NAVIGATING TO SYSTEM MANAGEMENT

2. To access database tables, the user needs to log in to the Oracle database. Click DatabaseServer > Database Login in the System Management main window, as shown in Figure 62.

FIGURE 62 NAVIGATING TO DATABASE LOGIN

3. Enter the UserName and Password of the Oracle database in the Database Login window, as shown in Figure 63. Click OK.

FIGURE 63 LOGIN DATABASE

4. The system displays a successful login message in the Message window, as shown in Figure 64.

FIGURE 64 DATABASE LOGIN SUCCESSFUL MESSAGE

5. After logging in to the database, click DatabaseServer > Table Collection Operations in the System Management window, as shown in Figure 65.

FIGURE 65 NAVIGATING TO TABLE COLLECTION OPERATION WINDOW
6. The Table Collection Operations dialog box appears. It contains the Network Management Performance Database, Network Management Fault Database, and Network Management Common Database tables, as shown in Figure 66.

FIGURE 66 TABLE COLLECTIONS OPERATIONS DIALOG BOX

7. To back up these tables, first select the tables from any of the three databases, right-click them, and select the Create option from the popup menu, as shown in Figure 67.

FIGURE 67 NAVIGATING TO TAKE BACKUP OF TABLES

8. In the Create Table Collection window, click in the Name box and enter a file name of up to 50 characters. Characters may include Chinese, English, and some allowed symbols. Restricted symbols are not displayed in the text box. After entering the name, click the NextStep button, as shown in Figure 68.

FIGURE 68 CREATE TABLE COLLECTION TASK STEP 1 OF 4

9. In the second step of the Create Table Collection Task window, set Action to Export and select the desired Store Type, as shown in Figure 69. The user needs to choose the location in which to store the export file. Click the Choose.. button. The Open dialog box appears, as shown in Figure 70. Select the folder in which to store the database tables and click the Open button. In the Time Filter pane, select the Filtrate Type from the drop-down list and, based on the filtrate type, enter TTime and DTime. For example, for the Last 2 Day(s) filtrate type, enter 2 in the TTime column, as shown in Figure 24. (T and D stand for a number of months or days, such as 2 days or 3 months. Enter the number of days or months in place of T or D.) Click the Next Step button.

FIGURE 69 CREATE TABLE COLLECTION TASK STEP 2 OF 4

FIGURE 70 CHOOSING PATH IN OPEN DIALOG BOX WINDOW
10. To delete the tables after they are backed up, select the Clear option in the Basic Setting pane of the Create Table Collection Task Step 3 of 4 window, as shown in Figure 71.

FIGURE 71 CREATE TABLE COLLECTION TASK STEP 3 OF 4

11. In the Period model and Duration panes of the Create Table Collection Task Step 4 of 4 window, enter how often the task is to be executed and for what duration, respectively, as shown in Figure 72.

FIGURE 72 CREATE TABLE COLLECTION TASK STEP 4 OF 4

12. The Table Collection Operations window shows the tasks. A task executes automatically according to its duration and interval settings. Backup files are stored in the system backup folder. The user can also execute a task manually by right-clicking the task and selecting Manually Execute from the popup menu, as shown in Figure 73.

FIGURE 73 NAVIGATING TO EXECUTE TABLES MANUALLY WINDOW

13. In the Table Collection Manually Execute Step 1 of 3 window, edit the file name and click the Next Step button, as shown in Figure 74.

FIGURE 74 TABLE COLLECTION MANUALLY EXECUTE STEP 1 OF 3

14. In the Table Collection Manually Execute Step 2 of 3 window, set the options in the Basic setting and Time filter panes as explained earlier and click the Next Step button, as shown in Figure 75.

FIGURE 75 TABLE COLLECTION MANUALLY EXECUTE STEP 2 OF 3

15. In the Table Collection Manually Execute Step 3 of 3 window, set the options in the Basic setting and Time filter panes as explained earlier and click the Execute button, as shown in Figure 76.

FIGURE 76 TABLE COLLECTION MANUALLY EXECUTE STEP 3 OF 3
16. Before executing the task, the system asks for confirmation, as shown in Figure 77. Click OK.

FIGURE 77 CONFIRM MESSAGE BEFORE EXECUTING TASK

17. The task is executed, as shown in Figure 78.

FIGURE 78 EXECUTE TASK RESULT WINDOW

END OF STEPS
All Users Database Backup and Recovery

Short Description
This procedure backs up and recovers the entire NetNumen database. An all-users database backup is commonly taken before a software upgrade.

Context
Perform the following steps to back up the entire database:

Steps
1. Log in to the Oracle database through the Solaris OS.
2. Enter the following command to back up the entire database:
   $ exp system/oracle@SID file=a.dmp log=a.log owner=(<database user name 1>,<database user name 2>,...,<database user name n>)
3. Recover the database. Only one user's database can be recovered at a time. Enter the following commands, one per user, to recover the entire database:
   $ imp system/oracle@SID file=a.dmp log=a.log fromuser=<database user name 1> touser=<user database 1>
   $ imp system/oracle@SID file=a.dmp log=a.log fromuser=<database user name 2> touser=<user database 2>
   ...
   $ imp system/oracle@SID file=a.dmp log=a.log fromuser=<database user name n> touser=<user database n>

END OF STEPS
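Because one imp command is needed per database user, the command lines can be generated from a list of user names. The sketch below only builds the strings and does not run them. The user names are hypothetical, and it assumes that each user is restored into a target user of the same name, which the procedure above does not require.

```python
def export_command(users, dump="a.dmp", log="a.log", connect="system/oracle@SID"):
    """Build one exp command line covering every listed database user."""
    return f"exp {connect} file={dump} log={log} owner=({','.join(users)})"


def import_commands(users, dump="a.dmp", log="a.log", connect="system/oracle@SID"):
    """Build one imp command line per user, since each run restores one user."""
    return [
        f"imp {connect} file={dump} log={log} fromuser={user} touser={user}"
        for user in users
    ]


# Hypothetical user names, for illustration only.
users = ["user_a", "user_b"]
print(export_command(users))
for line in import_commands(users):
    print(line)
```

When such a line is pasted into a Unix shell, the parentheses in the owner list may need quoting.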
Fault Management Database Backup

Short Description
For a Fault Management database backup, the user needs to back up only history alarms and notifications. Maintenance personnel use this backup data to find the reason for a particular alarm and to solve the problem.

Context
Perform the following steps to back up the database:

Steps
1. To enter the Fault Management system, click View > Fault Management on the main menu, as shown in Figure 79.

FIGURE 79 NAVIGATING TO FAULT MANAGEMENT

2. To back up alarms, the user first needs to query them. In the Fault Management window, select Query > View History Alarms, as shown in Figure 80.

FIGURE 80 NAVIGATING TO QUERY ALARMS

3. The History Alarm Query Conditions interface pops up, as shown in Figure 81.

FIGURE 81 HISTORY ALARM QUERY CONDITION INTERFACE

4. Click the button on the History Alarm Query Conditions interface toolbar to open the Query History Alarm dialog box, as shown in Figure 82. Set the query conditions and click OK to execute the query.

FIGURE 82 QUERYING HISTORY ALARM INTERFACE

5. The system displays all alarms that meet the set conditions, as shown in Figure 83.

FIGURE 83 HISTORY ALARMS QUERY RESULT

6. Click the button on the real-time current alarms toolbar. The Save dialog box appears. In the Save dialog box, choose the path for the export file, as shown in Figure 84.

FIGURE 84 SAVE WINDOW

7. A message confirming the successful save appears, as shown in Figure 85.

FIGURE 85 CONFIRMATION MESSAGE

8. To export notifications, follow the same steps as for history alarms.

END OF STEPS
Chapter 10
Emergency Maintenance

Table of Contents
Emergency Maintenance Purpose
Principle of Emergency Maintenance
Emergency Maintenance Flow
Service Check
Fault Record
Locating Fault
Emergency Recourse
Service Recovery
Information Record
Information Collection
Emergency Maintenance Tables
Emergency Maintenance Purpose

An emergency fault is a fault that causes a failure to provide basic services, leaves the system unable to work for more than 30 minutes, or poses a human safety hazard, or any other problem that the carrier requests to be solved as an emergency. The main purposes of emergency troubleshooting are as follows:
• Handle the faults as soon as possible.
• Restore interrupted services as soon as possible, thereby avoiding or reducing the loss caused by the faults.

While the NetNumen M31 system is running, faults may occur in some parts. When a fault occurs, the maintenance personnel should locate and then handle it as soon as possible. When an emergency fault occurs, inform ZTE maintenance personnel and prepare the remote maintenance software (such as PCANYWHERE), the necessary telephone lines, a modem, and other maintenance tools.
Principle of Emergency Maintenance

Restoring interrupted service is the guiding principle of emergency fault handling. Once emergency faults (Level 1 faults) on the equipment are reported or found, clear the faults and restore the system as soon as possible to minimize damage or loss. Handle emergency faults according to the procedure for emergency fault location and analysis. Meanwhile, contact the local ZTE office for technical support.

According to statistical data, system faults may comprise complete or partial power failure, network failure, database faults, and other faults. It is recommended to troubleshoot these key aspects first. If it is confirmed that power and communication are OK, use the Alarm Management System to locate the node where the problem possibly lies.
Emergency Maintenance Flow
The emergency maintenance flow involves the following steps:
1. Service check
2. Record abnormalities
3. Make initial location and analysis of faults
4. Launch the emergency aid
5. Service recovery
6. Service observation
7. Information record
The emergency maintenance flow chart is as shown in Figure 86.
Chapter 10 Emergency Maintenance
FIGURE 86 EMERGENCY MAINTENANCE FLOW
Service Check
Judge whether the reported faults are related to this network management system or to the lower-level network management. Check whether there are abnormalities in the equipment environment (temperature, humidity or power supply). The maintenance personnel should then record the fault information.
Fault Record
Before or while the emergency recovery plan or the fault recovery is carried out, record the running version and the fault phenomena in the Abnormality Record Table (Table 15). Back up the OMC configuration data properly.
Note: The abnormality record is very useful for emergency aid and for the subsequent problem analysis and summary. Therefore, be sure to fill in a complete abnormality record.
Locating Fault
After the fault information is obtained, analyze it to locate the fault. Fault location is the procedure of identifying the direct cause among multiple possible causes: analyze and compare the possible causes, exclude the impossible ones by appropriate means and methods, and finally determine the cause of the fault. In terms of equipment, the main fault causes are as follows:
• Hardware faults: including transmission faults, power faults, LAN faults or component damage.
• Software faults: including foreground software faults, database software faults or dual-computer management software failure.
Analyze and judge the fault causes with the help of the fault management and signaling trace maintenance tools. The maintenance personnel should usually locate the fault by elimination, giving full consideration to onsite conditions.
Emergency Recourse
If the interrupted services cannot be restored after troubleshooting, or a vital system fault occurs, collect the necessary fault information and call the ZTE 24-hour service hotline for help. ZTE provides three emergency recourse channels: the 24-hour service hotline, remote technical support and onsite technical support, as shown in Figure 87.
FIGURE 87 EMERGENCY RECOURSE
1. Service hotline
ZTE provides 7*24-hour technical support services. When an emergency fault occurs, dial the ZTE Customer Support Center hotline: 8008301118, 4008301118 or 0755-26770800. It is best to provide the onsite abnormality record table, so that ZTE maintenance personnel can understand and then locate the fault.
2. Remote support
Based on the information provided through the service hotline, the technical engineer can log in to the abnormal office remotely. Common faults can be solved by the customer under the technical engineer's guidance. For complex faults, technical engineers are assigned to the site to provide onsite technical support.
3. Onsite technical support
When the maintenance engineers arrive at the site, they take the necessary maintenance measures to restore communication as soon as possible.
Service Recovery
If the methods provided in this manual and the remote emergency aid cannot locate the fault and recover the service, it is necessary to perform a forced handover of the dual-computer system.
Information Record
After an emergency fault has been handled and the service runs normally, fill in the troubleshooting record table. If you asked the emergency service hotline for help, report the result to the local ZTE office, so that ZTE can provide better after-sales service for carriers.
Information Collection
When the service is restored, collect the statistics reports, running logs and history alarms, and feed back the fault handling results to the ZTE Customer Support Center. The data are then sent to the customer support department.
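Once the reports, logs and alarm exports have been saved to disk, they can be bundled into a single archive before being sent. The following Python sketch uses only the standard library. The function name and any file names passed to it are hypothetical, and the sketch assumes the files have already been exported by the maintainer. It does not call any NetNumen M31 interface.

```python
import tarfile
from pathlib import Path
from typing import Iterable, List


def bundle_fault_data(files: Iterable[str], archive_path: str) -> List[str]:
    """Pack existing files into a gzip-compressed tar archive.

    Returns the names of the files that were actually added; paths that do
    not exist are skipped so that a missing export does not abort the bundle.
    """
    added = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in files:
            path = Path(name)
            if path.is_file():
                # Store only the base name so the archive has a flat layout.
                tar.add(str(path), arcname=path.name)
                added.append(path.name)
    return added
```

Comparing the returned list with the list of expected files shows at a glance whether a statistics report, running log or history alarm export is still missing from the bundle.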
Emergency Maintenance Tables
Abnormality Record Table
Table 15 below serves as an example only. Adapt it to the actual NetNumen M31 maintenance items.
TABLE 15 ABNORMALITY RECORD TABLE
Equipment name: | Equipment No.:
Item | Abnormality description
Fault occurrence time |
Fault occurrence scope |
Serious alarm item reported by OMC |
Operation log information of OMC |
Project information | For example, the environment of the equipment room: temperature and humidity change. Record it if any.
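Because a complete abnormality record is important for later analysis, the fields of Table 15 can be checked for completeness before the record is handed over. The following Python sketch is one possible representation: the class and attribute names are hypothetical, and treating project information as optional is an assumption based on the "record it if any" wording.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AbnormalityRecord:
    """Hypothetical in-memory form of the Abnormality Record Table."""
    equipment_name: str = ""
    equipment_no: str = ""
    fault_occurrence_time: str = ""
    fault_occurrence_scope: str = ""
    serious_alarms: str = ""      # serious alarm items reported by OMC
    operation_log_info: str = ""  # operation log information of OMC
    project_info: str = ""        # e.g. temperature or humidity changes

    def missing_fields(self) -> List[str]:
        """Return the names of mandatory fields that are still empty."""
        optional = {"project_info"}  # assumption: recorded only if present
        return [f.name for f in fields(self)
                if f.name not in optional
                and not getattr(self, f.name).strip()]
```

A record is ready to be attached to a support request when `missing_fields()` returns an empty list.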
Equipment Emergency Maintenance Requisition
This requisition notifies the ZTE technical support center when the carrier cannot solve the fault on its own. It is best to attach the on-site abnormality record (Table 15) to the faxed requisition, so that ZTE personnel can locate and eliminate the faults more easily.
TABLE 16 EQUIPMENT EMERGENCY MAINTENANCE REQUISITION
The user should fill in the following fields.
Equipment name: | Equipment No.:
Complaint time (HH-MM-DD-YY): | Software version:
Complaint company or organization: | Complainant:
Telephone: | Whether in the warranty period: ( ) Y ( ) N
Abnormality Record Table (please attach it in the blank below):
Details of the handling process (as detailed as possible):
Reviewed by: | Stamp of the department:
ZTE personnel should fill in the following fields:
Solution: O Guide through telephone O Remote maintenance O On-site support
Time of settlement (HH-MM-DD-YY):
Handling result:
Handled by: | Stamp of the department:
Unresolved problems:
Troubleshooting Record Table
The troubleshooting record table is shown in Table 17.
TABLE 17 TROUBLESHOOTING RECORD TABLE
Equipment name: | Equipment No.:
Fault occurrence time (HH-MM-DD-YY): | Fault elimination time (HH-MM-DD-YY):
Fault type: hardware equipment fault, power fault, transmission network fault, data modification and other faults.
Fault source: user complaint, alarm system, fault found during maintenance and other sources.
Fault phenomena:
Solution:
Summary:
Signature of the attendant: | Signature of the handling person:
Figures
Figure 1 Inspect Communication Between Network Management Server And Client - I
Figure 2 Inspect Communication Between Network Management Server And Client - II
Figure 3 Inspect Communication Between Network Management Server And Client - III
Figure 4 Inspect Communication Between Network Management Server And Lower Level Network Management (1)
Figure 5 Inspect Communication Between Network Management Server And Lower Level Network Management (2)
Figure 6 Check Communication Between Nm Server And Lower Level Nm (3)
Figure 7 Inspect Server Operation Status
Figure 8 Real Time Alarm Monitor Window
Figure 9 Showing Alarm Details
Figure 10 Operation Log Window
Figure 11 Server Performance Window
Figure 12 Timer Query Task
Figure 13 Check Database Space (1)
Figure 14 Checking Database Space (2)
Figure 15 Virus Update Information
Figure 16 Mcafee Auto Update Dialog Box
Figure 17 Check Server Operation Status
Figure 18 Check Server Disk Storage Capacity (1)
Figure 19 Database Login Window
Figure 20 View Database Resource (2)
Figure 21 Check Performance Statistic Result of Operation Network
Figure 22 About Interface
Figure 23 Memory Monitor
Figure 24 Check Dual-system Switching (1)
Figure 25 Check Dual-system Switching (2)
Figure 26 Check Dual-system Switching (3)
Figure 27 Check Dual-system Switching (4)
Figure 28 Check Dual-system Switching (5)
Figure 29 Check Dual-system Switching (6)
Figure 30 Table Collection Operation Window
Figure 31 Query System Log
Figure 32 Log Record
Figure 33 Table Collection Operation Window
Figure 34 Query System Log
Figure 35 Log Record Window
Figure 36 Database Login Window
Figure 37 Table Collection Operation Window
Figure 38 Deploy Default Properties - I
Figure 39 Deploy Default Properties - II
Figure 40 Navigating To Log Folder
Figure 41 After Long Files Deletion From Log Directory
Figure 42 Changing The Directory In Solaris System
Figure 43 Arching Files In Tar Format
Figure 44 Log.Tar File In Ums-Svr Directory
Figure 45 Zip Log.Tar File
Figure 46 Navigating To Configuration Management
Figure 47 Navigating To Data Backup Window
Figure 48 Data Backup Window
Figure 49 Selecting Cm Directory
Figure 50 Selecting Path For Cm Data Storage
Figure 51 Selecting Nodes For Cm Data Backup
Figure 52 Data Backup Result Window
Figure 53 Backup File Storage Location
Figure 54 Navigating To Configuration Management
Figure 55 Navigating to Data Recover
Figure 56 Data Recovery Window
Figure 57 Open Window
Figure 58 Selecting Cm Data File From Cm Folder
Figure 59 Data Recovery Window
Figure 60 Data Recovery Result Window
Figure 61 Navigating To System Management
Figure 62 Navigating To Database Login
Figure 63 Login Database
Figure 64 Database Login Successful Message
Figure 65 Navigating To Table Collection Operation Window
Figure 66 Table Collections Operations Dialog Box
Figure 67 Navigating To Take Backup Of Tables
Figure 68 Create Table Collection Task Step 1 Of 4
Figure 69 Create Table Collection Task Step 2 Of 4
Figure 70 Choosing Path In Open Dialog Box Window
Figure 71 Create Table Collection Task Step 3 Of 4
Figure 72 Create Table Collection Task Step 4 Of 4
Figure 73 Navigating To Execute Tables Manually Window
Figure 74 Table Collection Manually Execute Step 1 Of 3
Figure 75 Table Collection Manually Execute Step 2 Of 3
Figure 76 Table Collection Manually Execute Step 3 Of 3
Figure 77 Confirm Message Before Executing Task
Figure 78 Execute Task Result Window
Figure 79 Navigating To Fault Management
Figure 80 Navigating To Query Alarms
Figure 81 History Alarm Query Condition Interface
Figure 82 Querying History Alarm Interface
Figure 83 History Alarms Query Result
Figure 84 Save Window
Figure 85 Confirmation Message
Figure 86 Emergency Maintenance Flow
Figure 87 Emergency Recourse
Tables
Table 1 Chapter Summary
Table 3 List of Daily Routine Maintenance Items
Table 4 NetNumen M31 System Temperature Requirements
Table 5 NetNumen M31 System Temperature Requirements
Table 6 Weekly Routines For Maintenance Checklist
Table 8 Monthly Routine Maintenance Checklist
Table 9 Power Supply Range Of NetNumen M31 System
Table 10 List of Weekly Maintenance Items
Table 11 Daily Maintenance Record Form
Table 12 Weekly Maintenance Record Form
Table 13 Monthly Maintenance Record Form
Table 14 Quarterly Maintenance Record Form
Table 15 Abnormality Record Table
Table 16 Equipment Emergency Maintenance Requisition
Table 17 Troubleshooting Record Table
Glossary
BSS - Base Station Subsystem
CPU - Central Processing Unit
FTP - File Transfer Protocol
IP - Internet Protocol
LAN - Local Area Network
MO - Management Object
NAT - Network Address Translation
NM - Network Management
NMS - Network Management System
OMC - Operation & Maintenance Center
OMM - Operation & Maintenance Module
PCB - Printed Circuit Board
PGND - Protection Ground
RAN - Radio Access Network
SMB - Server Message Block
VCS - Veritas Cluster Server