Huawei Optical Network Maintenance Reference - WDM ASON
Issue: 02
Date: 2014-08-26
Huawei Technologies Co., Ltd.
Huawei Technologies Co., Ltd. provides customers with comprehensive technical support and service. Contact our local office or company headquarters.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: [email protected]
Copyright © Huawei Technologies Co., Ltd. 2011. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, expressed or implied.
Acknowledgement
This document was prepared and reviewed jointly by the ASON R&D Maintenance Team, Information Development Dept, Customer Support Dept, and Technical Support Dept.
Editors: Feng Junjie, Zhu Fei, Feng Haoyu, Wang Chaokai, Feng Chao, Liu Yuan, Zhang Meng, Zhou Yuxing, Zheng Fan, Li Weiping
Others:
ASON R&D Maintenance Team: Jiang Yi, Bai Zhongqiang, Li Qingsong
Information Development Dept: Fan Xiaoke, Pei Xin
Technical Support Dept: Dou Yongtan, Xie Bing
Customer Support Dept: Zhang Junguang, Fu Ming, Ma Qingquan
Quality Assurance Dept: Xue Xiuhua
Special acknowledgements to Jin Yuzhi, Mu Jianhong, Feng Zhigang, Wu Gang, Niu Shouchang, and Chen Bin.
Issue 02 (2014-08-26)
Huawei Proprietary and Confidential Copyright © Huawei Technologies Co., Ltd
About this Document
Overview
This document covers four aspects: preventive maintenance, quick service recovery, fault diagnosis, and maintenance cases. It consists of the following chapters:
SOP for Maintaining NG WDM ASON Devices: This chapter provides the inspection items and operation methods for preventive maintenance of NG WDM ASON devices. The standard operating procedure (SOP) helps maintenance personnel discover and eliminate network risks during preventive maintenance, thereby ensuring network stability and security.
Quick Recovery Guide for NG WDM ASON Services: This chapter provides handling measures for quickly recovering ASON services on NG WDM devices.
Fault Diagnosis for ASON Service Interruptions on NG WDM Devices: This chapter provides the major methods and measures for diagnosing ASON service faults on NG WDM devices.
Typical ASON Service Troubleshooting Cases: This chapter provides typical ASON service troubleshooting cases as a reference to guide maintenance personnel through troubleshooting of NG WDM devices.
FAQs: This chapter provides frequently asked questions about NG WDM ASON operation and maintenance.
Change History
Issue | Date | Description
02 | 2014-08-26 | Added the “Precautions” section to “1 SOP for Maintaining NG WDM ASON Devices”.
01 | 2011-09-29 | This issue is the first official release.
Contents
About this Document
1 SOP for Maintaining NG WDM ASON Devices
  1.1 Introduction to SOP for ASON Maintenance
  1.2 ASON Device SOP Checklist
2 Quick Recovery Guide for NG WDM ASON Services
  2.1 Overview
  2.2 ASON Service Recovery Process
    2.2.1 Intended Audience
    2.2.2 Standard Operations for Routine Maintenance
    2.2.3 Classification of Fault Causes
    2.2.4 Rerouting and Non-rerouting Conditions
    2.2.5 ASON Alarms
    2.2.6 General Principles for Recovering ASON Services
    2.2.7 Key Information for Recovering ASON Services
    2.2.8 General Flowchart for ASON Service Recovery
  2.3 Quick Recovery Process for ASON Services
    2.3.1 Quick Recovery Process for ASON Services (Optical-Layer Service)
    2.3.2 Quick Recovery Process for ASON Services (OTN Electrical-Layer Service)
    2.3.3 Quick Recovery Process for ASON Services (VC-4 Services)
  2.4 Troubleshooting Processes for Typical ASON Service Faults
    2.4.1 Troubleshooting Process for Route Computation Failures
    2.4.2 Troubleshooting Process for Rerouting Failures
    2.4.3 Troubleshooting Process for Unreachable NEs
    2.4.4 Troubleshooting Process for Interruption of a Single-Wavelength ASON Service
    2.4.5 Troubleshooting Process for Interruption of Multi-Wavelength Services
    2.4.6 Troubleshooting Process for Malfunctioning SCC Boards
    2.4.7 Troubleshooting Process for a Failure to Recover Interrupted ASON Services Because of Add/Drop Channel Faults
  2.5 Appendix: ASON Node Troubleshooting Process
    2.5.1 General Principles for Handling ASON Node Faults
    2.5.2 Fault Description
    2.5.3 Procedure for Restoring Databases Using the Real-Time Database Backup Function
    2.5.4 Restoring the Configurations Manually
    2.5.5 Restoration Process If No Backup Database Is Available
3 Fault Diagnosis for ASON Service Interruptions on NG WDM Devices
  3.1 General Fault Diagnosing Procedure
    3.1.1 Overview
    3.1.2 Flowchart
    3.1.3 Troubleshooting Procedure
    3.1.4 Exception Handling Process
  3.2 Diagnosing the Fault that a Service Is Unavailable After Being Successfully Rerouted
    3.2.1 Overview
    3.2.2 Flowchart
    3.2.3 Troubleshooting Procedure
    3.2.4 Exception Handling Process
  3.3 Diagnosing the Fault that Service Rerouting Fails
    3.3.1 Overview
    3.3.2 Flowchart
    3.3.3 Troubleshooting Procedure
    3.3.4 Exception Handling Process
  3.4 Diagnosing the Fault that a Service Is Not Rerouted
    3.4.1 Overview
    3.4.2 Flowchart
    3.4.3 Troubleshooting Procedure
    3.4.4 Exception Handling Process
  3.5 Common Operations Involved in Fault Diagnosis
    3.5.1 Diagnosing OPA Adjust Failures
    3.5.2 Diagnosing Incorrect Attenuation Delivery of OPA
    3.5.3 Confirming Information About a Faulty Service
    3.5.4 Obtaining Performance Data
  3.6 Information to Be Collected
  3.7 References
4 Typical ASON Service Troubleshooting Cases
  4.1 ASON-Specific Operations and Configurations
    4.1.1 Case 1: Creating an ASON Service Fails Because the Wavelength for the Service Is Reserved
    4.1.2 Case 2: ASON NEs Reset Because of a Duplicated OSPF IP Address
    4.1.3 Case 3: An Attempt to Start the TE Link Management Window Times Out
    4.1.4 Case 4: Service Protection Level Fails to Change After a Line Board Participating in the Protection Is Moved to Another Slot
    4.1.5 Case 5: Deleting Inter-NE Fiber Connections Fails When Separate Optical and Electrical NEs Are Configured
    4.1.6 Case 6: Route Computing Fails During Creation, Optimization, or Rerouting of an ASON Service or During an Upgrade of a Static Service to an ASON Service
    4.1.7 Case 7: A Newly Created ASON Service Fails to Traverse a Node or an Existing ASON Service Fails to Traverse a Node During Trail Optimization or Rerouting
    4.1.8 Case 8: LMP Protocol Check Fails Due to DCN Errors and Consequently Service Deployment Fails
    4.1.9 Case 9: ASON Services Fail to Be Deployed Because Line Attenuation Is Excessively High
    4.1.10 Case 10: An Error Message Is Displayed When Users Attempt to Create Virtual TE Links
    4.1.11 Case 11: Route Computation Fails After Explicit Resources Are Specified to Create or Optimize an ASON Service
  4.2 ASON Service Restoration
    4.2.1 Case 1: An ASON Service Is Interrupted Because OPA Fails
    4.2.2 Case 2: An ASON Service Is Interrupted Because Protection Switching Fails After a Second Fiber Cut
    4.2.3 Case 3: An ASON Service Is Interrupted After Being Rerouted Because of Incorrect Fiber Connections
    4.2.4 Case 4: An ASON OCh Trail in a Slave Subrack Is Interrupted but Not Rerouted After the Slave Subrack Is Powered Off
    4.2.5 Case 5: An ASON Service Is Frequently Rerouted Among Multiple Trails
    4.2.6 Case 6: An Optical ASON Service Fails to Be Automatically Restored in Case of a Wavelength-Level Fault
    4.2.7 Case 7: An ASON Service Enabled with Scheduled Reversion Fails to Be Reverted to Its Original Trail After the Scheduled Reversion Time Elapses
    4.2.8 Case 8: Route Computation Fails When an ASON OCh Service Traverses a Regeneration Board
    4.2.9 Case 9: An ASON OCh Service Is Interrupted but Not Rerouted
5 FAQs
  5.1 Operations on the NMS
    5.1.1 How to Distinguish Between ASON Services and Traditional Services on the NMS?
    5.1.2 How to Identify the First and Last Nodes of an ASON Service?
    5.1.3 How to Identify the Original Trail and Preset Restoration Trail of an ASON Service?
    5.1.4 How to Manually Optimize an ASON Service?
    5.1.5 How to Manually Optimize ASON Services in Batches?
    5.1.6 How to Change a Wavelength to Optimize Trails?
    5.1.7 How to Obtain Fiber Connections of Boards that an ASON Service Traverses?
    5.1.8 How to Quickly Locate the Board Where an OCH_SER_INT Alarm Is Generated?
    5.1.9 How to Quickly Query the Current Trail or Preset Restoration Trail of an ASON Service that Traverses a Specific NE or Board?
    5.1.10 How to Check Whether a Preset Restoration Trail Is Available?
    5.1.11 How to Check Whether an Optical Cross-Connection Is Successfully Created for an ASON Service?
    5.1.12 How to Quickly Restore an Interrupted Service?
    5.1.13 How to Quickly Create Fiber Connections Between Sites?
  5.2 Configuration Rules
    5.2.1 What Are Common Attributes and Recommended Configurations for ASON Services?
    5.2.2 What Are the Risks if ODU0, ODU1, and ODU2 ASON Services Are Concurrently Configured?
    5.2.3 What Are the Basic Rules for Configuring Preset Restoration Trails?
    5.2.4 What Are the Recommended Configurations for the Preset Restoration Trails and Revertive Attributes of SDH/OTN/WDM ASON Services?
    5.2.5 What Are the Restrictions on Regeneration Boards When Optical NEs and Electrical NEs Are Separated and Why?
    5.2.6 How to Ensure the Rerouting Function When No OA Board Is Configured Between an FIU Board and a WSS Board?
    5.2.7 Why Are TN52SCC Boards Instead of TN51SCC or TN11SCC Boards Recommended for ASON NEs?
    5.2.8 What Are the Rules for Configuring Node IDs for ASON NEs?
    5.2.9 Why Must the Node ID and IP Address of an NE Be in Different Network Segments?
    5.2.10 Why Does the LMP or OSPF Protocol Need to Be Disabled on Electrical Links on OTU Boards Adding or Dropping WDM ASON Services?
    5.2.11 What Are the Application Scenarios and Configuration Method for Resource Reservation?
    5.2.12 Why Does the LMP Protocol Need to Be Disabled for Optical Ports on Tributary Boards?
    5.2.13 How to Disable the LMP Protocol for the Optical Ports that Are Not Used by ASON Services?
    5.2.14 How to Disable the OSPF Protocol for the Optical Ports that Are Not Used by ASON Services?
    5.2.15 How to Split a Large DCN Subnet into Smaller DCN Subnets?
    5.2.16 Do Diamond, Gold, Silver, and Copper ASON Services Support Hitless Conversion?
  5.3 ASON Principles
    5.3.1 What Overheads Are Used by Control Channels on the Control Plane?
    5.3.2 What Is the Difference Between the Menu Items "Revert To Port" and "Revert to Channel"?
    5.3.3 What Is the Difference Between Trail Overlap and Trail Sharing?
    5.3.4 What Is Associated Sharing?
    5.3.5 What Is the Relationship Between SRLGs and Associated Services?
    5.3.6 Why Is a Revertive Service Reverted to the Original Trail 5 Minutes After Rerouting and How to Revert the Service to the Original Trail Within 5 Minutes?
    5.3.7 Does an OPA Adjust Failure Affect Rerouting of ASON Services?
    5.3.8 Why Cannot Revertive Services Be Downgraded After Rerouting?
    5.3.9 What Is the Difference Between the Function of Downgrading ASON Services in an NE Explorer and the Function of Downgrading ASON Services in the Trail Management Window?
    5.3.10 Must Service Optimization Be Performed at the First Node?
    5.3.11 Why Do Services Fail to Be Reverted to the Original Trail?
    5.3.12 Why Does Synchronization Between the NE and NMS Need to Be Performed During Each Query of ASON Service Information?
    5.3.13 Does a CPW_XXX_INT Alarm on the Control Plane Mean Service Interruption?
    5.3.14 What Is the Difference in Database Backup and Restoration Between an ASON Network and a Non-ASON Network?
    5.3.15 Why Do I Need to Periodically Check Whether Rerouted ASON Services Are Reverted to the Original Trails?
    5.3.16 What Are Residual Cross-Connections and How to Delete Residual Cross-Connections?
    5.3.17 What Do the CPW_OCH_SER_INT and CPW_ODUk_SER_INT Alarms Mean?
1 SOP for Maintaining NG WDM ASON Devices
Precautions
ASON networks must be appropriately planned and designed so that the network is robust and services can survive multiple fiber cuts. For the detailed process, see Figure 1-1.
The expansion of links, subracks, wavelengths, or ODUk channels of an ASON network must be planned, designed, and deployed according to the related creation procedure (see Figure 1-1). Otherwise, the robustness of new services cannot be ensured, and the security of existing services may even be affected.
Normal operation of an ASON network and the security of its services depend on meticulous routine maintenance and on periodic comprehensive assessment and optimization.
Figure 1-1 ASON network delivery and maintenance process
Purpose
This chapter provides the SOP for preventive maintenance personnel so that they can discover and eliminate network risks, thereby ensuring network stability and security.
Intended Audience
System maintenance personnel
Application
Device maintenance personnel perform the standard operations and activities for preventive maintenance at the suggested intervals.
1.1 Introduction to SOP for ASON Maintenance
Table 1-1 lists the SOP for preventive maintenance of ASON.
Table 1-1 SOP for preventive maintenance of ASON

Item: ASON databases
Sub-item: Checking backup databases for ASON NEs
Time required: 5 min/NE
Frequency: NE databases must be backed up whenever network changes occur.
Inspection method: Manual
Procedure:
1. Verify that the NMS can successfully back up NE databases at specified intervals. It is recommended that the NMS start backing up databases for network-wide NEs at 2 a.m. every day.
2. After a network change occurs (for example, many services are deployed or rerouting occurs), request the customer to arrange a time window for manually backing up databases for all ASON NEs on the network.
Priority: Minor
Purpose: To ensure that users can download the NE database to recover the node once a node fault occurs on the network.

Item: ASON resources
Sub-item: Checking the status of control links
Time required: 5 min/100 links
Frequency: Monthly
Inspection method: Manual
Procedure:
1. On the NMS client, navigate to the ASON control link management window and synchronize the control link information network-wide.
2. Check whether isolated NEs are present in the control link topology view. If there are isolated NEs, pinpoint the cause and restore the NEs to normal state. In addition, check for alarms on the isolated NEs and clear them one by one.
3. Check whether abnormal alarms are generated on the control links. If there are abnormal alarms, locate the boards that report the alarms and clear the alarms one by one.
4. Export the control link information into an Excel file. The next time you perform preventive maintenance, compare the newly exported control link information with this file and check whether the control links are the same in the two inspections.
Priority: Major
Purpose: To ensure that control links are in normal state. If the control link topology is incorrect, you may not be able to create, optimize, delete, or reroute an ASON service.

Item: ASON resources
Sub-item: Checking virtual TE links
Time required: 2 min/one link
Frequency: Monthly
Inspection method: Manual
Procedure:
1. On the NMS client, navigate to the ASON TE link management window and synchronize the TE link information network-wide.
2. Check the value of Extend Type of the TE links. If the value is not Automatically Discovered, check for alarms on the source and sink boards of the TE links. If alarms have been generated on the source and sink boards, clear the alarms according to the NMS online help.
Priority: Major
Purpose: To ensure that virtual TE links are available.

Item: ASON resources
Sub-item: Checking status of TE links
Time required: 5 min/100 links
Frequency: Monthly
Inspection method: Manual
Procedure:
1. On the NMS client, navigate to the ASON TE link management window and synchronize the TE link information network-wide.
2. Check whether abnormal alarms are generated on the TE links. If there are abnormal alarms, locate the boards that report the alarms and clear the alarms one by one.
3. Check the value of Link Status of the TE links. If the value is not Up, check for alarms on the boards where the TE links are down. Then clear the alarms.
4. Export the TE link information into an Excel file. The next time you perform preventive maintenance, compare the newly exported TE link information with this file and check whether the TE links are the same in the two inspections.
Priority: Major
Purpose: To ensure that TE links are up. If alarms are generated on TE links, recovery of ASON services will be affected.
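The export-and-compare step in the control link and TE link checks could be scripted once the exported sheets are saved as CSV. The following is a minimal sketch under that assumption. The column names (`Source NE`, `Sink NE`, `Link Status`), the sample rows, and the helper functions `load_links` and `diff_links` are hypothetical and are not taken from the NMS; adjust them to the columns of the actual export.

```python
import csv
import io


def load_links(csv_text, key_fields):
    """Map each link's key tuple to its full row (one dict per link)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[f] for f in key_fields): row for row in reader}


def diff_links(previous, current):
    """Return links added, removed, or changed between two inspections."""
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(
        key for key in set(previous) & set(current)
        if previous[key] != current[key]
    )
    return added, removed, changed


# Hypothetical column names that identify a link in the export.
KEY = ("Source NE", "Sink NE")

# Illustrative data only: export saved at the previous inspection.
last_inspection = """Source NE,Sink NE,Link Status
NE1,NE2,Up
NE2,NE3,Up
NE3,NE4,Up
"""

# Illustrative data only: export saved at the current inspection.
this_inspection = """Source NE,Sink NE,Link Status
NE1,NE2,Up
NE2,NE3,Down
NE4,NE5,Up
"""

added, removed, changed = diff_links(
    load_links(last_inspection, KEY),
    load_links(this_inspection, KEY),
)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Any link reported as removed or changed is a candidate for the alarm checks described in the procedure; an empty result for all three lists means the links are the same in the two inspections.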
Item: ASON services
Sub-item: Checking for residual cross-connections
Time required: 5 min/one cross-connection
Frequency: Monthly
Inspection method: Manual
Procedure: Check whether a CPW_XXXX_TEL_PATHMIS alarm is generated throughout the network. If there is a CPW_XXXX_TEL_PATHMIS alarm, clear it according to the NMS online help.
Priority: Major
Purpose: To ensure that there are no residual cross-connections. The CPW_XXXX_TEL_PATHMIS alarm may affect recovery of ASON services.
Item: ASON services
Sub-item: Checking ASON services
Time required: 10 min/network
Frequency: Daily
Inspection method: Manual
Procedure:
1. On the NMS client, navigate to the ASON trail management window; then synchronize ASON trail information.
2. Check whether ASON trails are activated. If an ASON trail is displayed as Inactive, check whether a client service is sent to the ASON trail. If no client service is sent to the ASON trail, no further action is required. If a client service is sent to the ASON trail, contact the customer for further confirmation and record the confirmation result.
3. Check whether rerouting lockout is disabled for ASON trails. If it is disabled, contact the customer to check why it is disabled and record the reason.
4. Check whether alarms are generated on the ASON trails. If yes, clear them according to the NMS online help.
5. Export the ASON information into an Excel file and save it for future reference.
Priority: Critical
Purpose: To ensure that ASON services are normal.

Item: Alarms
Sub-item: Clearing control plane alarms
Time required: 5 min/one alarm
Frequency: Monthly
Inspection method: Manual
Procedure: Check whether an alarm starting with "CP" or "CPW" is generated throughout the network. If there is such an alarm, clear it according to the NMS online help.
Priority: Major
Purpose: To ensure that there is no control plane alarm. A control plane alarm may prevent ASON services from running normally and may even directly affect ASON trails.
Preventive Maintenance Item Item
Sub-Item
ASON events
Checking for abnormal ASON events
1 SOP for Maintaining NG WDM ASON Devices
Time Require d
Frequenc y
Inspection Method
Procedure
Priority
Purpose
20 min/50 ASON services
Monthly
Manual
1. Check whether an ASON service rerouting failure has occurred lately. (To verify this information, click new events icon in the main topology of the NMS, then browse the events in the Browse Events Logs –[New Events] window.)
Critical
To monitor network operation. If an ASON rerouting or re-creation failure is reported frequently during a time period, identify the cause and take correspondi ng measures.
2. Check whether an ASON service re-creation failure has occurred lately. (To verify this information, click new events icon in the main topology of the NMS, then browse the events in the Browse Events Logs –[New Events] window.)
Issue 02 (2014-08-26)
Huawei Proprietary and Confidential Copyright © Huawei Technologies Co., Ltd
10
Huawei Optical Network Maintenance Reference WDM ASON
Preventive Maintenance Item
1 SOP for Maintaining NG WDM ASON Devices
Time Require d
Frequenc y
Inspection Method
Procedure
Priority
Purpose
Monthly
Manual
Download pre-warning notices from the http://support. huawei.com website and perform the workarounds, preventive measures, or solutions provided in the notices.
Major
To remove potential risks according to officially released pre-warning notices.
Monthly
Tool + manual
1. Perform preventive maintenance inspection (PMI) of ASON networks using a PMI tool (download the latest tool from the http://support. huawei.com website) and provide the PMI result to Huawei HQ for filing.
Major
To ensure that the ASON resources and services on an ASON NE are in good condition.
Item
Sub-Item
Inspection of potential risks according to pre-warning notices
Checking potential risks based on pre-warni ng notices
30 min
Preventive maintenanc e inspection
Preventive maintenan ce inspection
60 min/NE
/network
2. Analyze the PMI result according to the PMI guide. If there are any problems, rectify them immediately.
Issue 02 (2014-08-26)
Huawei Proprietary and Confidential Copyright © Huawei Technologies Co., Ltd
11
Huawei Optical Network Maintenance Reference WDM ASON
1 SOP for Maintaining NG WDM ASON Devices
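The TE link check in the SOP asks for the TE link information to be exported and compared with the export from the previous inspection. The following is a minimal sketch of that comparison, assuming the Excel export has been saved as CSV. The column names `Link Name` and `Link Status` and the sample rows are hypothetical; the actual layout of an NMS export may differ.

```python
import csv
import io


def load_te_links(csv_text, key="Link Name", status="Link Status"):
    """Map each TE link name to its status (column names are assumptions)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row[status] for row in reader}


def compare_te_links(previous, current):
    """Report links that disappeared, appeared, or changed status."""
    common = set(previous) & set(current)
    return {
        "missing": sorted(set(previous) - set(current)),
        "added": sorted(set(current) - set(previous)),
        "changed": sorted(n for n in common if previous[n] != current[n]),
    }


# Illustrative data only, not real NMS output.
PREVIOUS = "Link Name,Link Status\nNE1-NE2,Up\nNE2-NE3,Up\n"
CURRENT = "Link Name,Link Status\nNE1-NE2,Up\nNE2-NE3,Down\nNE3-NE4,Up\n"

diff = compare_te_links(load_te_links(PREVIOUS), load_te_links(CURRENT))
print(diff)
```

Any entry under `missing` or `changed` is a candidate for the alarm checks described in the same SOP item.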
1.2 ASON Device SOP Checklist
Site: ____________ Version: ___________ Maintenance Owner: __________ Time: _____________

PMI Item | PMI Sub-Item | Result | Remarks
ASON databases | Checking backup databases for ASON NEs | □ OK □ NOK □ POK |
ASON resources | Checking the status of control links | □ OK □ NOK □ POK |
ASON resources | Checking virtual TE links | □ OK □ NOK □ POK |
ASON resources | Checking status of TE links | □ OK □ NOK □ POK |
ASON resources | Checking for residual cross-connections | □ OK □ NOK □ POK |
ASON services | Checking ASON services | □ OK □ NOK □ POK |
Alarms | Clearing control plane alarms | □ OK □ NOK □ POK |
ASON events | Checking for abnormal ASON events | □ OK □ NOK □ POK |
Checking potential risks based on pre-warning notices | Checking potential risks based on pre-warning notices | □ OK □ NOK □ POK |
Preventive maintenance inspection | Preventive maintenance inspection | □ OK □ NOK □ POK |
2 Quick Recovery Guide for NG WDM ASON Services
2.1 Overview
This chapter provides guidelines for recovering ASON services after a service interruption or node fault on an ASON network that is already in operation. It guides network maintenance personnel through the service recovery and fault diagnosis processes. The chapter assumes that the maintenance personnel are skilled in fault diagnosis for traditional WDM services on transport equipment, and therefore focuses on how maintaining ASON-capable WDM transport equipment differs from maintaining traditional WDM transport equipment.
2.2 ASON Service Recovery Process
2.2.1 Intended Audience
This chapter is intended for the following personnel:
Customer personnel: members of the operator's maintenance team. They are responsible for routine maintenance of the ASON and handling of common faults.
Customer service personnel: members of Huawei's GTS team. They provide technical support for the customer, and assist the customer in handling network faults.
R&D personnel: members of Huawei's R&D team. They assist the customer service personnel in handling network faults.
2.2.2 Standard Operations for Routine Maintenance
To ensure high survivability of the ASON, to find potential faults and problems as early as possible, and to recover ASON services promptly after a fault, network maintenance personnel must maintain the ASON routinely according to the Routine Maintenance Operation Guide. In particular, they must at least take the following actions:
1. Perform preventive maintenance regularly using the SmartKit Inspector.
2. Back up the databases regularly.
3. Back up the NMS data regularly.
4. Check the discrete services regularly.
5. Create records for VC12 tunnel services regularly.
6. Create records for software and hardware versions of each network node.
2.2.3 Classification of Fault Causes
The causes of ASON service interruption can be classified as fiber cuts, service board faults, system communication and control (SCC) board faults, configuration errors, and software bugs.
Fiber cuts
Fiber cuts may result in a shortage of network resources, which then causes service interruption. If services are interrupted due to a shortage of network resources, they cannot be recovered through rerouting. To recover the services, users must repair the fibers first.
Service board faults
If services are interrupted due to faults on service add/drop boards at the source or sink NEs, they also cannot be recovered through rerouting. To recover the services, users must first replace the faulty boards.
SCC board faults
A fault on an SCC board does not itself cause a service interruption, but it causes the ASON control plane to fail. If a fiber cut or node power failure occurs while the SCC board is faulty, services will be interrupted and cannot be recovered automatically.
Software bugs (software system faults)
If services are interrupted due to a bug in the ASON software, the services cannot be protected automatically through rerouting even though there are sufficient resources, because the rerouting function itself fails.
Configuration errors
A configuration error, such as an incorrect clock configuration or incorrect port attributes, can also lead to an interruption of ASON services.
2.2.4 Rerouting and Non-rerouting Conditions
The following table lists the conditions that trigger rerouting and the conditions that do not trigger rerouting.

Domain: OCH optical layer
Rerouting Occurs When:
  MUT_LOS or BD_STATUS alarms are generated on FIU boards.
No Rerouting Occurs When:
  LOS alarms are generated only on SC2 boards.
  MUT_LOS alarms are generated only on OA boards.
  LOS, SSF, DEG, or EXC alarms are generated only on OTU boards.
  The service is locked.
  No other trails are available (the fiber is broken) for the affected service.
  TE link check fails (for example, the TE link is degraded or broken).
  There are discrete cross-connections.
  The control link is unreachable and a CL DOWN alarm is reported for the route.
Note: After a service is rerouted, it will not be rerouted again in case of a channel alarm (an alarm on an OA or OTU board).
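The conditions in the table above can be read as a simple predicate: rerouting is expected only when a triggering alarm is present on an FIU board and none of the blocking conditions applies. The following is a minimal sketch of that reading. The function name, parameter names, and the representation of alarms as (board type, alarm name) pairs are assumptions made for illustration, not part of any NMS interface.

```python
# Alarms that trigger rerouting for OCH optical-layer services, per the table.
REROUTE_TRIGGERS = {("FIU", "MUT_LOS"), ("FIU", "BD_STATUS")}


def rerouting_expected(alarms, service_locked=False,
                       alternate_trail_available=True,
                       te_link_check_ok=True,
                       discrete_cross_connections=False,
                       control_link_reachable=True):
    """Return (expected, reasons); reasons lists why rerouting will not occur.

    alarms is an iterable of (board_type, alarm_name) tuples (hypothetical shape).
    """
    reasons = []
    if not any(alarm in REROUTE_TRIGGERS for alarm in alarms):
        reasons.append("no MUT_LOS or BD_STATUS alarm on an FIU board")
    if service_locked:
        reasons.append("service is locked")
    if not alternate_trail_available:
        reasons.append("no other trail is available")
    if not te_link_check_ok:
        reasons.append("TE link check failed")
    if discrete_cross_connections:
        reasons.append("discrete cross-connections exist")
    if not control_link_reachable:
        reasons.append("control link is unreachable")
    return (not reasons, reasons)


print(rerouting_expected([("OTU", "LOS"), ("OA", "MUT_LOS")]))
```

Returning the list of reasons, rather than only a boolean, mirrors how the table is used in practice: when a service was not rerouted, the first question is which condition blocked it.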
2.2.5 ASON Alarms
Among all ASON alarms, the following alarms have a direct impact on ASON services:
1. Service interruption alarms: CPW_OCH_SER_INT and CPW_ODUk_SER_INT
2. Link down alarms: CPW_OCH_TEL_DOWN and CPW_ODUk_TEL_DOWN
When one or more services are abnormal, an OTU board reports an R_LOS, OTUk_SSF, DEG, or EXC alarm, or an FIU board reports a MUT_LOS or BD_STATUS alarm.
The following ASON alarms have no direct impact on ASON services:
3. Link control alarms: CPC_CC_DOWN, CPC_OSPF_CL_DOWN, and CPW_OMS_TEL_DEG. These ASON alarms are generated when an SC2 board reports an R_LOS alarm or the RES overhead of an OTU board is unavailable. When the alarms are generated, verification of control links or TE links fails.
4. Resource alarm: CPW_XXX_TEL_PATHMIS. The alarm is usually generated when there are discrete cross-connections, that is, when a cross-connection is configured at only one end of a fiber link between two sites, or when the cross-connection at one end is in use while the cross-connection at the other end is not.
5. Other ASON alarms (for example, CPW_CLNT_SER_NOTOR, CPW_OCH_SER_SLADEG, CPC_NODE_ID_CONFLICT, and CPC_RSVP_NB_DOWN). Their root causes are not related to traditional alarms and have no direct relationship with interruption of ASON services. For details about the causes and the methods for handling them, refer to the alarm reference manual.
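When triaging an alarm list, the grouping in section 2.2.5 can be applied mechanically by alarm name. The following is a minimal sketch, assuming that the placeholder k in ODUk expands to a short alphanumeric suffix (for example, ODU0 to ODU3, as named in section 2.3.2) and that XXX in CPW_XXX_TEL_PATHMIS stands for a layer name. The category labels and the `classify` function are illustrative, not part of any Huawei tool.

```python
import re

# Patterns follow the alarm names listed in section 2.2.5; the expansion of
# the ODUk and XXX placeholders is an assumption.
ALARM_CLASSES = [
    ("service interruption (direct impact)",
     re.compile(r"^CPW_(OCH|ODU[0-9A-Za-z]+)_SER_INT$")),
    ("link down (direct impact)",
     re.compile(r"^CPW_(OCH|ODU[0-9A-Za-z]+)_TEL_DOWN$")),
    ("link control (no direct impact)",
     re.compile(r"^(CPC_CC_DOWN|CPC_OSPF_CL_DOWN|CPW_OMS_TEL_DEG)$")),
    ("resource (no direct impact)",
     re.compile(r"^CPW_[0-9A-Za-z]+_TEL_PATHMIS$")),
]


def classify(alarm_name):
    """Return the section 2.2.5 category for an ASON alarm name."""
    for label, pattern in ALARM_CLASSES:
        if pattern.match(alarm_name):
            return label
    return "other (see the alarm reference manual)"


for name in ("CPW_ODU2_SER_INT", "CPC_OSPF_CL_DOWN", "CPC_RSVP_NB_DOWN"):
    print(name, "->", classify(name))
```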
2.2.6 General Principles for Recovering ASON Services
The symptoms and causes of ASON service interruptions vary greatly. To reduce downtime and promptly restore an interrupted service, the following steps are generally used:
1. Locate the interrupted service. After confirming that the target trail is available, optimize the service onto the preset restoration trail, revert the service to the original trail, or switch the service to the protection trail.
2. If the preceding step fails, deactivate the service. Then create the service as a traditional service.
3. Check for traditional alarms that have caused the service interruption, and clear them.
The preceding three steps are the essential actions for promptly recovering ASON services. For the detailed service recovery procedure, see the next section.
To determine whether an end-to-end trail is available, check the network topology, fiber interruption symptoms, ASON topology (TE links and control links), and service source or sink node.
NOTE
While restoring an interrupted service, confirm the following information:
Check whether any configurations were changed before or after the service was interrupted. These configurations include protocol parameter settings (OSPF, LMP, and RSVP parameters) that affect the functions of the control plane, port attribute settings (port loopback, FEC/AFEC mode, and port rate mode) that are related to the service, and service configurations (for example, service trail configurations on the client side).
Check whether there is any service-affecting major alarm after the service is interrupted. For example, check whether there is a new hardware damage alarm (HARD_BAD), fiber break alarm (R_LOS and MUT_LOS), traditional service interruption alarm (LOF and AIS), or ASON service interruption alarm (XXX_SER_INT).
2.2.7 Key Information for Recovering ASON Services
After an ASON service is interrupted, troubleshooting personnel must obtain the key information about the service fault, because this information is essential for preliminary fault diagnosis and prompt service recovery. The key information includes:
1. When was the ASON service interrupted? Method: Determine the time either from customer feedback or from the time at which the alarms occurred.
2. Is a fiber broken on the link of the interrupted service? Method: In the main topology or signal flow diagram of the NMS, check whether a fiber is broken. A broken fiber is displayed in red.
3. What alarms (ASON or traditional alarms) are reported for the interrupted service? Method: Check whether there are any new major alarms.
4. Does the fault occur in an ASON service or a traditional service? Method: Check for service alarms directly in the WDM ASON Trail Management and WDM Trail Management windows of the NMS. The trail of the interrupted service is displayed in red.
5. If multiple services are interrupted, do they all carry client services? Which services are the most important? Are all interrupted services in the same direction? Method: Confirm this information directly with the customer.
6. Which services are interrupted (source/sink nodes and boards)? What are the service protection levels? Which trails do the services traverse? Do the interrupted services pass any regenerators? Method: View the information directly in the WDM ASON Trail Management window. Click Original trail and Current trail in the lower left part of the window, and take screenshots of the required information.
7. Has rerouting been triggered for the interrupted service? Did the rerouting succeed or fail? Method: Click Event to view the detailed event information (alternatively, select the specific event and take a screenshot).
8. Were any network configurations modified, were any services added or deleted, or was a service cutover performed before the fault occurred? Method: Confirm this information directly with the customer or maintenance personnel.
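The eight items in section 2.2.7 amount to a fixed record that should be complete before diagnosis starts. The following is a minimal sketch of such a record, assuming a team wants to track which items are still unanswered. The class name `FaultKeyInfo` and all field names are hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class FaultKeyInfo:
    """One field per key-information item in section 2.2.7 (names are illustrative)."""
    interruption_time: Optional[str] = None      # item 1
    fiber_broken: Optional[bool] = None          # item 2
    reported_alarms: Optional[List[str]] = None  # item 3
    service_kind: Optional[str] = None           # item 4: "ASON" or "traditional"
    client_impact: Optional[str] = None          # item 5
    service_details: Optional[str] = None        # item 6
    rerouting_result: Optional[str] = None       # item 7
    recent_changes: Optional[str] = None         # item 8

    def missing(self):
        """Names of the items that have not been collected yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


info = FaultKeyInfo(interruption_time="reported by customer", fiber_broken=True)
print(info.missing())
```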
2.2.8 General Flowchart for ASON Service Recovery
Figure 2-1 General flowchart for ASON service recovery
2.3 Quick Recovery Process for ASON Services
2.3.1 Quick Recovery Process for ASON Services (Optical-Layer Service)
Step 1 Locate the trail for interrupted services.
1. Based on customer feedback, determine when the service interruption occurred, how many services are interrupted, the source and sink nodes of the services, and the fiber status (for example, fiber breaks) of the network.
2. Choose Configuration > WDM ASON > WDM Trail Management to display the WDM ASON trail management window. Identify the interrupted ASON service trail according to the service source or sink node information and ASON service alarms. If the CPW_OCH_SER_INT alarm is generated, clear the alarm to recover the services.
3. If no ASON service interruption alarm is reported, click the icon in the lower right of the Browse Current Alarms window to synchronize alarms. After alarm synchronization is complete, check whether an ASON service interruption alarm is reported. If there is such an alarm, clear the alarm to recover the services.
4. If no service interruption alarm is reported after alarm synchronization, filter the interrupted services according to their source or sink nodes and check whether there are inactive services. If there are, check whether they are the faulty services. If yes, activate the services to recover them.
5. If no ASON service interruption alarm has ever been reported, check for traditional alarms generated on the source or sink NE during the service interruption period specified by the customer. Then determine the affected boards and the corresponding ASON service trails according to the alarm information.
NOTE
For optical-layer services on WDM devices, only BD_STATUS or MUT_LOS alarms on FIU boards can trigger rerouting. In other words, alarms on OTU boards cannot trigger rerouting.
Step 2 Restore the interrupted services.
Route computation for optical-layer services is very complex because it must consider many factors such as optical parameter constraints, regenerators, and optical power. Therefore, Huawei uses preset restoration trails to recover ASON services at the optical layer of NG WDM equipment. A maximum of two preset restoration trails can be configured for each silver service. When a silver service at the optical layer is interrupted, the quickest way to recover it is to optimize the service onto one of the preset restoration trails, or to revert the service to its original trail. The following describes how to quickly recover an interrupted service:
1. In the WDM Trail Management window, select the service and click the relevant icon. Then save the service information (for example, the service attributes, current trail, and original trail) as an Excel file. The information will subsequently be used as a reference for fault diagnosis and service recovery.
2. In the WDM Trail Management window, select the service. Check whether preset restoration trails are configured for the service. If they are configured, take screenshots of the two preset restoration trails, because this information can be used as a reference when reconfiguring preset restoration trails for the service in future fault diagnosis. In addition, check whether the current trail of the service is the same as one of the preset restoration trails. Ensure that the preset restoration trails are explicit trails during service optimization. The following figure shows a screenshot for one of the preset restoration trails.
3. If no restoration trail is configured for the service, optimize the trail for the service. During trail optimization, try to select a trail that shares nodes with the original trail and specify it as the explicit trail for the service. Then check whether the service is recovered. If a failure message is displayed, or if a success message is displayed but the service is not recovered, go to the next step.
4. If the service fails to be optimized or reverted to the original trail, check for configuration errors (such as rerouting lockout, incorrect optical parameter settings, and link verification failure or unavailability of control links not triggered by alarms). Then assess the current network topology, fiber status (for example, fiber breaks), and service source or sink node. If idle trails are available for the service, deactivate the service and then try to recover it as a traditional service by creating a WDM trail for it. If the service still fails to be recovered, go to the next step.
5. Query the traditional alarms corresponding to all ASON service trails. If there are traditional alarms (for example, R_LOS, MUT_LOS, VOADATA_MIS, and OPA_FAILED), clear them according to the traditional alarm handling process. Then check whether the service is recovered. If not, go to the next step.
6. Contact Huawei for troubleshooting.
NOTE
For WDM ASON services, disable the function of computing optical parameters if the optical parameter check fails when you are optimizing the service. Then optimize the service again.
To disable the optical parameters, right-click the service in the ASON trail management window, and choose Disable Optical Parameters from the shortcut menu.
----End
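The restoration steps in section 2.3.1 form an escalation ladder: each action is tried in order, and the next one is attempted only if the service is still down. The following is a minimal sketch of that control flow. It simplifies the procedure (for example, it does not model the precondition that step 3 applies only when no restoration trail is configured), and the lambda stubs stand in for manual NMS operations; none of this calls a real NMS interface.

```python
def recover(steps):
    """Try recovery actions in order; stop at the first one that succeeds.

    steps is a list of (description, action) pairs, where action() returns
    True if the service was confirmed recovered afterwards.
    """
    attempted = []
    for description, action in steps:
        attempted.append(description)
        if action():
            return description, attempted
    return None, attempted


# Stub outcomes for illustration only.
steps = [
    ("optimize onto a preset restoration trail or revert to the original trail",
     lambda: False),
    ("optimize onto an explicit trail that shares nodes with the original trail",
     lambda: False),
    ("deactivate the service and re-create it as a traditional WDM trail",
     lambda: True),
    ("clear traditional alarms on the ASON service trails",
     lambda: True),
]

recovered_by, attempted = recover(steps)
print(recovered_by or "contact Huawei for troubleshooting")
```

Keeping the list of attempted actions is useful when the ladder is exhausted, because it is the first thing support will ask for.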
2.3.2 Quick Recovery Process for ASON Services (OTN Electrical-Layer Service)
Step 1 Identify the interrupted services.
1. Based on customer feedback, determine when the service interruption occurred, how many services are interrupted, the source and sink nodes of the services, and the fiber status (for example, fiber breaks) of the network.
2. Choose Configuration > WDM ASON > WDM Trail Management to display the WDM ASON trail management window. Identify the interrupted ASON service trail according to the service source or sink node information and ASON service alarms. If a CPW_ODU3_SER_INT, CPW_ODU2_SER_INT, CPW_ODU1_SER_INT, or CPW_ODU0_SER_INT alarm is generated, clear the alarm to recover the services.
3. If no ASON service interruption alarm is reported, click the icon in the lower right of the Browse Current Alarms window to synchronize alarms. After alarm synchronization is complete, check whether an ASON service interruption alarm is reported. If there is such an alarm, clear the alarm to recover the services.
4. If no service interruption alarm is reported after alarm synchronization, select and right-click all services and choose Synchronize from the shortcut menu. Check whether there are inactive services. If there are inactive services, the NMS data is inconsistent with the NE data. Activate these services if required. If activating these services fails, refer to Step 2.
5. If no ASON service interruption alarm has ever been reported, ensure that the static cross-connections are configured correctly for the client side of the NE. If alarms have been generated on the client side, ensure that the client-side boards are online and free of faults. Then check for alarms and performance events on OTU boards. If there are alarms, locate the affected ASON services and rectify the faults.
Step 2 Restore the interrupted services.
1. In the WDM Trail Management window, click Maintenance to sequentially refresh the current trail, original trail, and preset restoration trails. Then select the service and click the relevant icon. Save the service information (for example, the service attributes, current trail, and original trail) as an Excel file. The information will subsequently be used as a reference for fault diagnosis and service recovery.
2. In the WDM Trail Management window, select the service. Check whether preset restoration trails are configured for the service. If they are configured, take screenshots of the two preset restoration trails, because this information can be used as a reference when reconfiguring preset restoration trails for the service in future fault diagnosis. In addition, check whether the current trail of the service is the same as one of the preset restoration trails. Ensure that the preset restoration trails are explicit trails during service optimization. The following figure shows a screenshot for one of the preset restoration trails.
3. If no restoration trail is configured for the service, optimize the trail for the service. During trail optimization, try to select a trail that shares nodes with the original trail and specify it as the explicit trail for the service. Then check whether the service is recovered. If a failure message is displayed, or if a success message is displayed but the service is not recovered, go to the next step.
4. If the service fails to be optimized or reverted to the original trail, check for configuration errors (such as rerouting lockout, and link verification failure or unavailability of control links not triggered by alarms). Then assess the current network topology, fiber status (for example, fiber breaks), and service source or sink node. If idle trails are available for the service, deactivate the service and then try to recover it as a traditional service by creating a trail for it. If the service still fails to be recovered, go to the next step.
5. Query the traditional alarms corresponding to all ASON service trails. If there are traditional alarms (for example, R_LOF, R_LOS, LOM, and BUS_ERR), clear them according to the traditional alarm handling process. Then check whether the service is recovered. If not, go to the next step.
6. Try to deactivate and then activate the ASON service. If the service is recovered after the deactivation and activation, reconfigure the service attributes and the preset restoration trails for the service based on the service information and screenshots saved previously.
----End
2.3.3 Quick Recovery Process for ASON Services (VC-4 Services)
Locate the interrupted VC-4 ASON service according to the control plane alarms. Determine whether the alarm is an internal network alarm (CP_SER_INT) or an external network alarm (CP_SRV_INT_OUT).
1. For an external network alarm, locate and handle the service faults on the client side. For example, check whether the service receiving boards are faulty or offline and whether errors have been generated in client signals.
2. For an internal network alarm, handle the alarm as instructed in section 2.3.2 "Quick Recovery Process for ASON Services (OTN Electrical-Layer Service)".
If no control plane alarm indicating service interruption is reported, check for corresponding traditional alarms according to the service interruption time, and clear them to recover the services.
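The VC-4 procedure in section 2.3.3 is a dispatch on which control plane alarm is present. The following is a minimal sketch of that dispatch. The function name and return format are illustrative, and returning one action per alarm when both alarms are present is an assumption, since the text does not cover that case.

```python
def vc4_next_actions(alarms):
    """Map the control plane alarms of a VC-4 ASON service to follow-up actions."""
    actions = []
    if "CP_SRV_INT_OUT" in alarms:
        # External network alarm: the fault is on the client side.
        actions.append("check client-side boards and client signals")
    if "CP_SER_INT" in alarms:
        # Internal network alarm: reuse the electrical-layer procedure.
        actions.append("follow section 2.3.2")
    if not actions:
        actions.append("check traditional alarms around the interruption time")
    return actions


print(vc4_next_actions({"CP_SER_INT"}))
```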
2.4 Troubleshooting Processes for Typical ASON Service Faults This section provides the troubleshooting processes for typical ASON service faults (for example, fiber breaks, device software or hardware faults that cause ASON service interruption) so that maintenance personnel can promptly recover the services by isolating the faults according to the troubleshooting processes.
2.4.1 Troubleshooting Process for Route Computation Failures Fault Description An ASON service (OCH, ODUK, or SDH) is interrupted, and the following alarms are reported for the end-to-end ASON service trail and the port on the WDM side: End-to-end ASON service alarms: CPW_OCH_SER_INT, CPW_ODUk_SER_INT, and CP_SER_INT Port alarms on the WDM side of affected boards: R_LOS, OTUk_LOF, OTUk_SSF, OTUk_LOF, OTUk_LOM, OUTk_DEG, ODUk_PM_SSF, ODUk_PM_OCI, ODUk_PM_DEG, IN_PWR_HIGH, IN_PWR_LOW, R_LOF, AU_AIS, BD_STATUS, MS_AIS End-to-end ASON service event: route computation failure event
Workarounds According to the fault occurrence time and the customer feedback, identify the interrupted services and promptly recover the services by, for example, optimizing the services or reverting them to their original trails. In events of a route computation failure or optical parameter check failure, preferentially check whether there are sufficient network resources. Normally, many factors can cause a shortage of network resources. They include: configuration errors, fiber breaks, residual cross-connections, device hardware faults, node faults, and software bugs. Currently, rerouting protection is provided for optical-layer services using preset restoration trails. If a service is interrupted and cannot be automatically recovered because of a route computation failure, manually recover them using the following methods: 1.
Find the current ASON service trail (including the preset restoration trail) of the service, and deactivate the service.
2.
Create a static service over a traditional WDM trail and recover the service using traditional commissioning methods. For an optical-layer service, try to use the original trail or configure two preset restoration trails. NOTE
If the preceding methods cannot recover the service, isolate the fault according to the symptoms (for example, fiber breaks, residual cross-connections, hardware faults, node faults, and software bugs) and then recover the service. For details about the troubleshooting procedure, refer to XXX Fault Location and Handling Procedure.
Issue 02 (2014-08-26)
Huawei Proprietary and Confidential Copyright © Huawei Technologies Co., Ltd
25
Huawei Optical Network Maintenance Reference WDM ASON
2 Quick Recovery Guide for NG WDM ASON Services
Troubleshooting Procedure
Step 1 Review the ASON service events generated during the period in which the fault occurred, and check whether the "Route computation failed" or "Optical parameter check failed" error message is displayed.
Step 2 If the "Optical parameter check failed" error message is displayed, the preset restoration trails or the optical parameters specified for the optimization trails may be configured incorrectly. In this case, disable the optical parameter check function for the service, and then optimize the service to recover it.
NOTE
Route computation for an optical-layer service may return the optical parameter check failure. The optical parameters that are most often configured incorrectly are the TE link distance, FIU dispersion compensation, dispersion coefficient, PMD coefficient, and rated power of the optical amplifier.
Step 3 If the "Route computation failed" error message is displayed, check whether all TE links are in normal state in the ASON TE link management window. If TE link alarms are reported (for example, CPW_XXX_TEL_DOWN, CPW_XXX_TEL_DEG, CPW_OMS_TEL_DEG, and CPW_OMS_TEL_DOWN), check the fiber connection status on the network. If any fibers are broken, repair them. If no fiber is broken, go to the next step.
Step 4 If all TE links on the network are in normal state, check for node faults. If the NE is unreachable by the NMS or a control plane alarm is reported (for example, CPC_OSPF_NB_DOWN and CPC_RSVP_NB_DOWN), a node fault has occurred. Locate the node fault and rectify it. If no node fault has occurred, go to the next step.
Step 5 If all network nodes are in normal state, check whether network resources are sufficient in the following ways:
1. Check whether there are any reserved resources or residual cross-connections on the preset restoration trails or specified explicit trails.
2. Check whether there are any resource inconsistency alarms (for example, CPW_XXX_TEL_PATHMIS).
If there are reserved resources or residual cross-connections, manually release the reserved resources or delete the residual cross-connections. If there are no residual network resources, go to the next step.
Step 6 If network resources are sufficient, check whether network configurations are correct. For an optical-layer service, check the fiber connections at the optical layer, for example, the fiber connections between OAs, WSSs, OSCs, and FIUs. For an electrical-layer service, check whether the configured cross-connect capacity exceeds the limit permitted by the tributary, line, and cross-connect boards, and check whether any electrical boards are faulty. If the NE configurations are correct, contact Huawei for support.
----End
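The decision order of the preceding steps can be sketched in Python as follows. This is only an illustration: the FaultSnapshot fields and the returned action strings are hypothetical names introduced here, and the boolean inputs are assumed to have been collected manually from the NMS.

```python
from dataclasses import dataclass


@dataclass
class FaultSnapshot:
    """Observations gathered by the engineer (hypothetical structure)."""
    event_message: str                 # text of the ASON service event
    te_link_alarms: bool = False       # any TE link alarm reported
    node_fault: bool = False           # NE unreachable or control plane alarm
    residual_resources: bool = False   # reserved resources or residual cross-connections


def next_action(s: FaultSnapshot) -> str:
    """Return the next action, following Step 2 to Step 6 in order."""
    if s.event_message == "Optical parameter check failed":
        return "disable the optical parameter check, then optimize the service"
    if s.event_message != "Route computation failed":
        return "not a route computation failure"
    if s.te_link_alarms:
        return "check fiber connections and repair broken fibers"
    if s.node_fault:
        return "locate and rectify the node fault"
    if s.residual_resources:
        return "release reserved resources or delete residual cross-connections"
    return "check network configurations; contact Huawei if they are correct"


if __name__ == "__main__":
    print(next_action(FaultSnapshot("Route computation failed", node_fault=True)))
```

The checks are evaluated in the same order as the steps, so an earlier finding (for example, a TE link alarm) is always handled before a later one.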
2.4.2 Troubleshooting Process for Rerouting Failures
Fault Description
An ASON service (OCh, ODUk, or SDH) is interrupted. For details about the service alarms of the end-to-end ASON service trail and WDM port, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
End-to-end ASON service event: rerouting failure event
Workarounds
For "Creation timeout": If the timeout is caused by traffic congestion or a communication failure at the source or sink node, the service cannot be recovered just by optimizing it to another trail. Instead, deactivate and then activate the service, or re-create an end-to-end traditional service.
For "RSVP egress or ingress port is down": Explicitly specify different egress ports to optimize the service onto another trail so as to quickly recover the service.
For "Failure of label allocation": If the faulty node is the source or sink node of the ASON service, attempt to recover the interrupted service by optimizing the tunable wavelength for the optical layer. For non-tunable NE configurations, identify the resources that conflict with the cross-connections or wavelengths of the current service and delete the conflicting resources to recover the interrupted service.
Troubleshooting Procedure
Step 1 Review the ASON service events generated during the period in which the fault occurred, and check whether the "Creation timeout", "RSVP egress or ingress port is down", or "Failure of label allocation" message is displayed.
Step 2 If the "Creation timeout" message is displayed, check whether network nodes communicate successfully with each other. Control plane communication between nodes has failed if the CPC_OSPF_AUTH_ERR, CPC_OSPF_NB_DOWN, CPC_RSVP_AUTH_ERR, or CPC_RSVP_NB_DOWN alarm is generated on the control plane. To restore control plane communication, clear the alarm. If the network nodes communicate successfully with each other, check whether traffic congestion has occurred at the source node, sink node, or an intermediate node of the service using the following methods:
1. Run the nbb-set-debug-mode:open command on the node to enable the debug function of the node.
2. Run the mpls-query-rsvp-tcevent:0.0.0.0,0.0.0.0,0,1,2 command five times at intervals of 3 seconds.
If the same triplet information is displayed in the command output every time, traffic congestion has occurred at the node. If traffic congestion has occurred at the source or sink node, deactivate and then activate the service or perform a soft reset on the node to rectify the fault. If traffic congestion has occurred at an intermediate node, recover the service by optimizing it to an available trail, and then perform a warm reset on the node to rectify the fault. At this point, the "Creation timeout" fault is rectified.
Step 3 If the "RSVP egress or ingress port is down" error message is displayed, first determine the faulty node according to the rerouting failure event. Then check whether the interface control block of the control plane is in normal state and whether the RSVP interface information is complete using the following methods:
1. Run the mpls-get-if:all command to query the interface information. Check whether there is an interface corresponding to the interrupted service and whether the interface information is correct (DEFINED=0x7, ADMIN=UP, OPER=UP; the remote information is not null).
2. If the interface information is incomplete, run the :mpls-debug-command:get_lim_if,0,0,0,0,0; command to recover the interface information or perform a warm reset on the SCC board of the NE.
At this point, the "RSVP egress or ingress port is down" fault is rectified.
Step 4 If the "Failure of label allocation" error message is displayed, check whether there are conflicting resources on the network. Handle the fault using the following methods:
1. Check whether there are any reserved resources or residual cross-connections on the preset restoration trails or specified explicit trails.
2. Check whether there are any resource inconsistency alarms (for example, CPW_XXX_TEL_PATHMIS).
If there are reserved resources or residual cross-connections, manually release the reserved resources or delete the residual cross-connections. At this point, the label allocation failure is resolved.
----End
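The congestion check in Step 2 and the interface check in Step 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: run_query is a hypothetical callable that sends the mpls-query-rsvp-tcevent command to the node and returns its raw text output, the whole output is compared instead of a parsed triplet, and the dictionary keys (including "REMOTE") are hypothetical names for fields read from the mpls-get-if:all output.

```python
import time
from typing import Callable, Dict, List


def is_congested(run_query: Callable[[], str],
                 samples: int = 5,
                 interval_s: float = 3.0,
                 sleep: Callable[[float], None] = time.sleep) -> bool:
    """Query the node `samples` times and report congestion if every
    non-empty output is identical (assumption: identical output means
    the same triplet information is displayed every time)."""
    outputs: List[str] = []
    for i in range(samples):
        outputs.append(run_query().strip())
        if i < samples - 1:
            sleep(interval_s)  # wait between consecutive queries
    return bool(outputs[0]) and all(o == outputs[0] for o in outputs)


def interface_ok(fields: Dict[str, str]) -> bool:
    """Check the values listed in Step 3 (keys are hypothetical)."""
    return (fields.get("DEFINED") == "0x7"
            and fields.get("ADMIN") == "UP"
            and fields.get("OPER") == "UP"
            and bool(fields.get("REMOTE")))


if __name__ == "__main__":
    canned = iter(["1.1.1.1 2.2.2.2 7"] * 5)  # illustrative output only
    print(is_congested(lambda: next(canned), sleep=lambda s: None))
```

Passing the sleep function as a parameter keeps the 3-second interval from the procedure as the default while allowing the check to be exercised offline with canned output.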
2.4.3 Troubleshooting Process for Unreachable NEs
Fault Description
An ASON service (OCh, ODUk, or SDH) is interrupted, the corresponding NE is dimmed in the main topology of the NMS, and an NE unreachable alarm is reported for the NE. As a result, users cannot operate or configure the NE. For details about the possible service alarms generated by the end-to-end ASON service trail and the port on the WDM side, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
Workarounds
After the ASON service is interrupted and the NE becomes unreachable by the NMS, identify the cause. If the NE is unreachable because the gateway NE is faulty, check whether the secondary gateway is available or configure another gateway NE for the unreachable NE. Then, if the NE can be managed by the Navigator, try to recover the service using the Navigator. The recovery steps are as follows:
Step 1 Query alarms on the add/drop boards and the source/sink nodes of the affected services.
Step 2 On the source or sink node of the interrupted service, run a query command and determine the affected ASON service according to the add/drop service board and the ASON service alarm.
Step 3 Recover the interrupted service quickly by manually optimizing the service (or reverting the service to the original trail).
NOTE
Run the following command to query service alarms:
:alm-get-curdata-ext:0,ALL,0
Run the following commands to query ASON services:
:mpls-get-och-smlsp:src (optical layer)
:mpls-get-otn-smlsp:src,all (ODUk electrical layer)
:mpls-get-smlsp:src (SDH)
Run the following commands to optimize ASON services:
:mpls-optim-och-lsp (optical layer); the wavelength tuning command is mpls-adjust-och-lsp
:mpls-optim-otn-lsp (ODUk electrical layer)
:mpls-optim-lsp (SDH)
Run the following commands to revert ASON services:
mpls-restore-och-lsp (manually revert optical layer ASON services), mpls-restore-och-origrt (revert network-wide optical layer ASON services)
mpls-restore-otn-lsp (manually revert ODUk services), mpls-restore-otn-origrt (revert network-wide ODUk services)
mpls-restore-lsp (manually revert SDH services), mpls-restore-original-rt (revert network-wide SDH services)
Recover the service quickly by optimizing the service or reverting the service to the original trail.
Step 4 If neither the NMS nor the Navigator is available but trails are available for the service, you can disable the SC2 and OA laser of the reachable node on an available trail to trigger service switching, or perform a cold reset on the FIU and remove the line fiber to trigger service rerouting, finally restoring the interrupted service. ----End
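The commands listed in the preceding NOTE can be organized per service layer, for example in a lookup table like the following Python sketch. The command strings are copied from the NOTE as they appear there, and no arguments beyond those shown in the NOTE are filled in. The dictionary keys and the commands_for function are hypothetical names used only for illustration.

```python
from typing import Dict

# Command names per service layer, copied from the NOTE (keys are hypothetical).
ASON_COMMANDS: Dict[str, Dict[str, str]] = {
    "och": {
        "query": ":mpls-get-och-smlsp:src",
        "optimize": ":mpls-optim-och-lsp",
        "tune_wavelength": "mpls-adjust-och-lsp",
        "revert": "mpls-restore-och-lsp",
        "revert_network_wide": "mpls-restore-och-origrt",
    },
    "oduk": {
        "query": ":mpls-get-otn-smlsp:src,all",
        "optimize": ":mpls-optim-otn-lsp",
        "revert": "mpls-restore-otn-lsp",
        "revert_network_wide": "mpls-restore-otn-origrt",
    },
    "sdh": {
        "query": ":mpls-get-smlsp:src",
        "optimize": ":mpls-optim-lsp",
        "revert": "mpls-restore-lsp",
        "revert_network_wide": "mpls-restore-original-rt",
    },
}


def commands_for(layer: str) -> Dict[str, str]:
    """Return the command names for a service layer ("OCh", "ODUk", "SDH")."""
    key = layer.strip().lower()
    if key not in ASON_COMMANDS:
        raise ValueError(f"unknown service layer: {layer!r}")
    return ASON_COMMANDS[key]


if __name__ == "__main__":
    print(commands_for("ODUk")["optimize"])
```

A table of this kind only selects the command name for the layer of the interrupted service. The full parameter list of each command must still be taken from the command reference.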
Troubleshooting Procedure
An NE may become unreachable by the NMS for many reasons: IP address conflicts, subnet mask errors, OSC/ESC communication faults between NEs, SCC hardware or software faults of the NE, ECC storms, and faults of the customer's outband DCN. Start by determining whether the NE is unreachable because the gateway NE is faulty. If yes, replace the faulty gateway NE. For other types of faults, refer to the DCN Recovery Guide.
NOTE
To check for network communication faults, run the following commands:
:cm-get-iproute (query the routing table)
:cm-set-ip (modify the IP address)
:cm-set-submask (modify the subnet mask)
:cm-get-gateway (query the gateway NE)
2.4.4 Troubleshooting Process for Interruption of a Single-Wavelength ASON Service
Fault Description
An ASON service (OCh, ODUk, or SDH) is interrupted and alarms are reported. For details about the service alarms of the end-to-end ASON service trail and WDM port, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
In the ASON service management window, only a single ASON service is interrupted. You can determine whether the service has initiated the rerouting process according to the corresponding performance event (rerouting success or failure event) and service alarm (ASON service interruption alarm, "not over the original trail" alarm, and rerouting lockout alarm). The possible symptoms are that the rerouting process is not initiated, or that the ASON service is still interrupted even after the rerouting process is initiated.
Workarounds
For details about the workarounds, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
NOTE
If the interrupted single-wavelength ASON service does not initiate the rerouting process, it is possible that no trail meets the requirement or that a component (for example, a board or fiber) is faulty. The current ASON service only supports rerouting triggered by a line-class fault, not by a channel-class fault.
Troubleshooting Procedure
Step 1 According to the fault information provided by the customer (for example, fault time, faulty node, and current alarms), determine whether a single ASON service or multiple ASON services are interrupted by checking whether the MUT_LOS alarm is reported and whether the R_LOS alarm is reported on multiple boards. If a single ASON service is interrupted, proceed to the next step.
Step 2 If the ASON service has the protection capability (traditional protection plus ASON associated service), check whether the service is forcibly locked to an abnormal channel. If yes, unlock the service and check whether the service is recovered. If a normal channel is available, forcibly switch the service to the normal channel. Alternatively, check whether the switching conditions (SD) are configured correctly. If not, correct the switching condition configurations and check whether the service is recovered. If the service is still not recovered, proceed to the next step.
Step 3 Check the interrupted ASON service. If rerouting is locked for the ASON service, unlock rerouting. The ASON service may still fail to initiate the rerouting process (a channel alarm may have caused the service interruption; if the service has an available trail, optimize the service manually), or the ASON service may still be interrupted even after it initiates the rerouting process (by default, an optical-layer alarm does not trigger service rerouting). In either case, proceed to the next step.
Step 4 According to the interrupted service information, check whether the service trail must traverse a repeater and whether the current trail traverses a repeater. If a repeater is required but the current trail does not traverse one, specify a repeater and optimize the service to restore it. If the service trail does not need to traverse a repeater, or the current trail already traverses a repeater but the ASON service is still interrupted, proceed to the next step.
Step 5 Determine whether a service of the same wavelength causes crosstalk. Disable the OTU at the source or sink node of the interrupted service, or disable the laser of the trunk board, and check section by section whether another service board generates same-wavelength crosstalk. Alternatively, ask the customer's maintenance personnel whether the wavelength of a board has been adjusted at the OTU of the source or sink node, on a trunk board, or at an intermediate site of the service, or whether any service board has been inserted. According to the check results, adjust the conflicting wavelength to recover the interrupted service. If the ASON service is still interrupted, or there is no conflicting wavelength, proceed to the next step.
Step 6 Check whether the OTU configurations at the source node are consistent with those at the sink node and whether the trunk board configurations at the two ends (for example, wavelength, rate, FEC, and TDC) are consistent with each other. If any configurations are inconsistent, correct them immediately. If the configurations are all consistent but the ASON service is still interrupted, proceed to the next step.
Step 7 Check whether a power adjustment alarm is generated for the service trail, for example, an OPA adjustment failure alarm (OPA_FAIL_INDI). If yes, check the corresponding configuration information (including the WSS board, optical amplifier, and preset insertion loss) of the OPA reference section. If there are configuration errors, correct the configurations as instructed by the System Commissioning Guide. If no adjustment failure alarm is generated, or if the corresponding configurations are correct but the ASON service is still interrupted, proceed to the next step.
Step 8 Determine the current trail of the interrupted service. Check the power of the OTU (including the trunk board) at the source or sink node of the single-wavelength service. Then query the current and historical performance of the OTU at the receiving or transmitting end. If the network is configured with an MCA board, use the MCA board to scan the single-wavelength power along the service trail and check whether any power is abnormal. If the network is not configured with an MCA board, check on site whether the receive power of the corresponding port is normal (by performing a hardware loopback). If a faulty fiber or board is identified, replace it. Then, the ASON service is recovered.
NOTE
When checking the faulty trail, note the following alarms and configuration information:
Multiplexer or demultiplexer board alarms and optical amplifier board alarms: R_LOS, MUT_LOS, BD_STATUS, MODULE_ADJUST_FAIL, MOD_COM_FAIL, WAVEDATA_MIS, IN_PWR_LOW, IN_PWR_HIGH, OUT_PWR_HIGH, and OUT_PWR_LOW
Multiplexer or demultiplexer board and optical amplifier board configurations: insertion loss, attenuation, rated power, gain, and dispersion compensation
Rectify a board fault (if any) as instructed by the Board Replacement Guide, and rectify a fiber fault (if any) as instructed by the Fiber Repair Guide.
----End
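The configuration consistency check in Step 6 can be sketched in Python as follows. This is a minimal sketch: the field names mirror the examples given in Step 6 (wavelength, rate, FEC, and TDC), while the dictionary layout and the sample values are assumptions made only for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Fields named in Step 6; the key spelling is a hypothetical choice.
CHECKED_FIELDS = ("wavelength", "rate", "fec", "tdc")


def config_mismatches(
    source: Dict[str, str],
    sink: Dict[str, str],
    fields: Iterable[str] = CHECKED_FIELDS,
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {field: (source value, sink value)} for every field that differs."""
    return {
        f: (source.get(f), sink.get(f))
        for f in fields
        if source.get(f) != sink.get(f)
    }


if __name__ == "__main__":
    # Illustrative values only, not taken from a real network.
    src = {"wavelength": "192.10THz", "rate": "OTU2", "fec": "AFEC", "tdc": "0"}
    snk = {"wavelength": "192.10THz", "rate": "OTU2", "fec": "FEC", "tdc": "0"}
    print(config_mismatches(src, snk))
```

An empty result means that the two ends agree on all checked fields, in which case the procedure continues with Step 7.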
2.4.5 Troubleshooting Process for Interruption of Multi-Wavelength Services
Fault Description
ASON services (OCh, ODUk, or SDH) are interrupted and alarms are reported. For details about the service alarms of the end-to-end ASON service trail and WDM port, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
In the ASON service management window, multiple ASON services are interrupted. You can determine whether the service initiates the rerouting process according to the corresponding performance event (rerouting success or failure event) and service alarm (ASON service interruption alarm, "not over the original trail" alarm, and rerouting lockout alarm). The possible symptom of service interruption is that the rerouting process is not initiated or the ASON service is still interrupted even after the rerouting process is initiated.
Workarounds
For details about the workarounds, see section 2.4.1 "Troubleshooting Process for Route Computation Failures."
Troubleshooting Procedure
Step 1 According to the fault information provided by the customer (for example, fault time, faulty node, and current alarms), determine whether a single ASON service or multiple ASON services are interrupted by checking the related alarms, for example, the MUT_LOS alarm and the R_LOS, IN_PWR_HIGH, or IN_PWR_LOW alarm of multiple OTUs. If multiple ASON services are interrupted, proceed to the next step.
Step 2 Check each interrupted ASON service. If rerouting is locked for an ASON service, unlock rerouting. If the ASON services initiate the rerouting process and are recovered, the fiber on the line side is faulty. In this case, rectify the line fault. If the ASON services still fail to initiate the rerouting process, or are still interrupted even after they initiate the rerouting process, proceed to the next step.
Step 3 If the ASON services are still interrupted even after the rerouting or switching process is initiated, a board on the site is faulty, the line attenuation is extremely large, or a pigtail is faulty. If multiple OTUs generate the R_LOS, IN_PWR_HIGH, or IN_PWR_LOW alarm, the input optical power has changed noticeably compared with the historical performance. Check whether the input optical power of the demultiplexer board upstream of the OTU has changed compared with the historical performance. If the input optical power of the demultiplexer board has not changed, check the pigtail between the demultiplexer board and the OTU. If the pigtail is abnormal, replace it. If the pigtail is normal, check whether the demultiplexer board is faulty, and replace it if it is. Otherwise, proceed to the next step.
Step 4 In the drop direction on the site, check whether the input optical power of the demultiplexer board has changed noticeably. If yes, check whether the input power and output power of the upstream optical amplifier board have changed. If the output power of the optical amplifier board remains stable, clean or replace the pigtail between the optical amplifier board and the demultiplexer board. If the output power of the optical amplifier board has changed but its input power is stable, check whether the gain of the optical amplifier board is configured and whether the laser is disabled. For the OAU, also check whether the connection insertion loss between the TDC and the RDC has changed. If the settings are all normal, the optical amplifier board is faulty. Replace it. Otherwise, proceed to the next step.
Step 5 Check whether the input power of the optical amplifier board is abnormal. If it is, check whether the input power or output power of the FIU has changed. If the output power of the FIU is stable, check the connection insertion loss between the TDC and the RDC, and check whether the adjustable attenuation of the OAU has changed. If the insertion loss and configurations are normal, clean or replace the pigtail between the optical amplifier board and the FIU. If the output power of the FIU has changed, check the input power of the FIU. If the input power is stable, the FIU is faulty. Replace it. Otherwise, proceed to the next step.
Step 6 Check whether the input power of the FIU has changed. If it has, check whether the output power of the upstream FIU has changed. If the output power of the upstream FIU is stable, check whether the line attenuation between the FIUs of the two sites has changed. Otherwise, proceed to the next step.
Step 7 If the output power of the upstream FIU has changed, repeat the preceding steps at the upstream site. Observe the following principles for troubleshooting: Working against the direction of the signal flow, check the power reported by the FIU, optical amplifier board, and multiplexer board. From the power change point, determine the board or pigtail at which the output power begins to change. If a board causes the power change, replace the board. If a pigtail causes the power change, clean or replace the pigtail.
Step 8 If an attenuator board is installed in the line, check whether the attenuator board is configured correctly. If the actual attenuation of the attenuator board does not match the configured attenuation, replace the attenuator board.
----End
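The principle in Step 7 (walk against the signal flow and find the point where the power begins to change) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the readings are entered manually in signal-flow order, the point names are free text, and the 1 dB tolerance that separates "stable" from "changed" is an assumed value, not one given in this document.

```python
from typing import List, NamedTuple, Optional, Tuple


class PowerReading(NamedTuple):
    point: str             # e.g. "OA output" (free-text label)
    historical_dbm: float  # value from historical performance data
    current_dbm: float     # value measured now


def locate_change(
    readings: List[PowerReading], tolerance_db: float = 1.0
) -> Optional[Tuple[Optional[str], str]]:
    """readings must be ordered in signal-flow direction (upstream first).

    Walks from the most downstream point against the signal flow and returns
    (last stable point, first changed point). The suspect board or pigtail
    lies between the two. Returns None if the most downstream point is stable
    or there are no readings. A stable point of None means every point has
    changed, so the check must continue at the upstream site.
    """
    suspect: Optional[str] = None
    for r in reversed(readings):
        if abs(r.current_dbm - r.historical_dbm) <= tolerance_db:
            return (r.point, suspect) if suspect else None
        suspect = r.point
    return (None, suspect) if suspect else None


if __name__ == "__main__":
    # Illustrative values only.
    trail = [
        PowerReading("FIU output", -10.0, -10.0),
        PowerReading("OA input", -12.0, -12.1),
        PowerReading("OA output", 3.0, -2.0),
        PowerReading("Demux input", 2.5, -2.5),
        PowerReading("OTU input", -8.0, -13.0),
    ]
    print(locate_change(trail))
```

In the sample data, the power is stable at the OA input but has changed at the OA output, so the optical amplifier board and its settings are the first items to check, as described in Step 4.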
2.4.6 Troubleshooting Process for Malfunctioning SCC Boards
Fault Description
The ASON service (OCh, ODUk, or SDH) is interrupted, only a single NE is unreachable by the NMS, and the service that uses the NE as its source or sink node cannot be recovered.
Workarounds
If the unreachable node is an intermediate node of the interrupted service and a redundant trail is available for service restoration, recover the interrupted service as instructed in section 2.4.1 "Troubleshooting Process for Route Computation Failures." If the unreachable node is the source node, the sink node, or an essential intermediate node of the interrupted service, you must rectify the node fault in order to restore the interrupted service.
Troubleshooting Procedure
Step 1 If the unreachable NE is an intermediate node of the interrupted service, handle the fault as instructed in "Workarounds". If the unreachable NE is the source node, the sink node, or an essential intermediate node of the interrupted service, proceed to the next step.
Step 2 According to the alarm information of the adjacent NEs, check whether all fibers connected to the NE are broken. If they are, restore the interrupted fiber links first. If all fibers are normal, proceed to the next step.
Step 3 Check whether the DCN configurations of the network or the NE have been modified recently. If yes, check whether the new IP address, mask, gateway, and DCC pass-through settings are correct, and recover the interrupted service according to the correct configurations. If the DCN configurations have not been modified, proceed to the next step.
Step 4 Check whether the NE is unreachable by the NMS only intermittently. An NE may be reset repeatedly because of abnormal software or configurations. If the NE is configured with two SCC boards, attempt to recover the interrupted service by initiating an active/standby switchover of the SCC boards. If the NE is configured with only a single SCC board, or if the ASON service is still interrupted even after the active/standby switchover, proceed to the next step.
Step 5 If you cannot locate and rectify the fault by taking all the preceding steps, check the hardware on site and make preparations for replacing boards and restoring the database. For details, refer to the Appendix: ASON Node Troubleshooting Process. ----End
2.4.7 Troubleshooting Process for a Failure to Recover Interrupted ASON Services Because of Add/Drop Channel Faults
Fault Description
An ASON service (OCh, ODUk, or SDH) on an NE is interrupted, and a control plane alarm or traditional alarm that indicates the service interruption is generated on the NE. The service cannot be recovered by manually optimizing the service trail or by deleting the ASON service and reconfiguring a traditional service on the original channel. In addition, the add and drop channels cannot be configured for the source and sink nodes, or the channels are faulty.
NOTE
If a channel on an intermediate node is unavailable, you can switch the service on the channel to another channel to preferentially recover the service. For details on how to recover the service, see the service quick recovery process. This section only provides the service recovery process for add/drop channel faults at the source or sink node.
Workarounds
Use the following workarounds when an add/drop channel at the source or sink node is unavailable:
If other trails are available on the access side for carrying the service, help the customer switch the service to one of the trails to preferentially recover the service.
If no other trails are available on the access side, replace the board that provides the add/drop channel at the source or sink node and reconfigure an end-to-end trail to first recover the service.
Troubleshooting Procedure
Step 1 Identify the interrupted ASON service according to the fault symptom (including the fault occurrence time, faulty node, and current alarms).
Step 2 Switch the service to another available trail using the trail optimization function of the NMS, or delete the ASON service and then reconfigure an end-to-end trail using the original channel resources. If the control plane alarm or traditional alarm that indicates the service interruption persists, go to the next step.
Step 3 Check for traditional alarms and, according to the traditional alarm handling method, determine whether the faulty channel is used for adding or dropping a service. After identifying the type of the faulty channel, go to the next step.
Step 4 If other trails are available on the access side for carrying the service, help the customer switch the service to one of the trails and ensure that the service is recovered. If no other trails are available on the access side, go to the next step.
Step 5 Replace the board that provides the add or drop channel at the source or sink node and reconfigure an end-to-end trail for the service. After ensuring that the new trail is fault-free, switch the service to the new trail.
----End
2.5 Appendix: ASON Node Troubleshooting Process
2.5.1 General Principles for Handling ASON Node Faults
If a node-class fault occurs in an NE, the general troubleshooting principle is as follows:
Recover the interrupted service first if possible, and then rectify the node fault.
If the interrupted service cannot be recovered temporarily, rectify the node fault first and then attempt to recover the interrupted service.
In short, if the faulty node is the source node or sink node of the interrupted service, or is an essential intermediate node, rectify the node fault first and then attempt to recover the interrupted service. A node-class fault may occur for the following reasons:
1. The database configurations are lost or corrupted, so the NE is reset repeatedly and cannot start. The NE then enters the BIOS state, DCN mode, installation mode, or database protection mode. In this scenario, the management plane, control plane, and service plane all fail.
2. The NE software is corrupted or lost, so the SCC board undergoes abnormal warm resets and cannot start. In this scenario, the management plane, control plane, and service plane all fail.
2.5.2 Fault Description
For an ASON node fault caused by database corruption or loss, the NE usually becomes unreachable by the NMS if no service is interrupted. In this case, perform preliminary troubleshooting on the NMS side and determine whether the NE is unreachable because of a DCN fault. The preliminary judgment principles are as follows:
1. Check whether a single NE, multiple NEs, or all NEs in the same subnet are unreachable by the NMS.
2. Check whether the NEs are unreachable by the NMS persistently or intermittently.
When many NEs become unreachable by the NMS at the same time, the usual cause is a DCN fault. If a single NE or a few NEs are persistently unreachable, the probable cause is a node fault. In this case, locate and handle the fault on site. On the faulty site, you can query the operating status of the NE according to the LED indicators of the SCC board or by connecting to the device using the LCT or Navigator.
NOTE
Run the cfg-get-nestate command to check whether the operating status of the NE is Running.
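The preliminary judgment principles above can be sketched in Python as follows. This is only an illustration: the document does not define how many NEs count as "a few", so the few_threshold value is an assumption, and the function name and returned strings are hypothetical.

```python
def probable_cause(unreachable_count: int, persistent: bool,
                   few_threshold: int = 2) -> str:
    """Classify an NE-unreachable situation by scope and persistence.

    few_threshold is an assumed cut-off for "a single NE or a few NEs".
    """
    if unreachable_count <= 0:
        return "no unreachable NE"
    if unreachable_count > few_threshold:
        return "probable DCN fault: troubleshoot the DCN from the NMS side"
    if persistent:
        return "probable node fault: locate and handle the fault on site"
    return "undetermined: keep observing and check for repeated NE resets"


if __name__ == "__main__":
    print(probable_cause(unreachable_count=1, persistent=True))
```

The intermittent case with only a few NEs is left undetermined here because the principles above do not assign it to either cause.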
2.5.3 Procedure for Restoring Databases Using the Real-Time Database Backup Function
Step 1 Determine the criteria for selecting a database restoration solution.
A database restoration solution requires that the database be backed up in real time or periodically. During the operation of an ASON service, actions that the control plane triggers automatically can also change the database. For an ASON service node, therefore, check whether the backup database is up to date. If the ASON service is not interrupted, the database restoration process may itself cause service interruption. For these reasons, assess the risks of the database restoration solution when a node fault occurs. The assessment items are as follows:
1. Current service status (no service is interrupted, few services are interrupted, or many services are interrupted)
2. Extent to which the customer can tolerate the impacts of service interruption, and whether the customer can provide a service maintenance window when services are interrupted
If no service is interrupted and the customer cannot tolerate the impacts of service interruption, use a more complex recovery solution. For details, see section 2.5.4 "Restoring the Configurations Manually." This section only describes the scenario in which a backup database is available. The scenario in which no backup database is available corresponds to Scenario 3. For details, see section 2.5.5 "Restoration Process If No Backup Database Is Available."
NOTE
As configured in the SCC boards, the following database restoration scenarios are available:
Scenario 1: Use the backup database in the CF card or on the NMS side
Scenario 2: New user configurations are available after the database is backed up
Scenario 3: No backup database is available, and the service is configured manually
The prerequisite for the database restoration solution is that a backup database is available. Normally, the latest backup contains the data of the previous day. Therefore, scheduled database backup and manual database backup should be an integral part of routine maintenance of ASON services. Back up the NE data every day, and keep the backups of at least the most recent month. After any operation on an ASON service, back up the database manually.
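The backup routine described in the preceding NOTE (daily backups, with at least the most recent month retained) can be verified with a check like the following Python sketch. It assumes that the backup dates have already been collected from the NMS or the CF card by some other means, and it approximates "one month" as 30 days. Both are assumptions, as are the function and parameter names.

```python
from datetime import date, timedelta
from typing import Iterable, List


def missing_backup_days(backup_dates: Iterable[date], today: date,
                        retention_days: int = 30) -> List[date]:
    """Return the days in the retention window that have no backup.

    The window covers the retention_days days before `today`, because the
    latest backup normally contains the data of the previous day.
    Results are ordered from the most recent day to the oldest.
    """
    available = set(backup_dates)
    window = [today - timedelta(days=n) for n in range(1, retention_days + 1)]
    return [d for d in window if d not in available]


if __name__ == "__main__":
    today = date(2014, 8, 26)
    # Illustrative data: one day (six days ago) has no backup.
    backups = [today - timedelta(days=n) for n in range(1, 31) if n != 6]
    print(missing_backup_days(backups, today))
```

An empty list means that a backup exists for every day in the window. Any date returned marks a day whose configuration changes could not be restored from a same-day backup.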
Step 2 Make preparations for restoring the NE database.
After an ASON node fault occurs, or if you need to verify on site that a node fault has occurred, make the following preparations before going to the site:
1. Prepare a PC where the LCT, DC, or Navigator tool is installed, and copy the backup database of the unreachable node to it. If multiple databases backed up at different times exist on the NMS side, copy all of them to the portable computer and take care not to corrupt the latest backup data.
2. Make preparations for replacing the damaged SCC board as instructed in the section "Replacing the SCC Board" in the Parts Replacement.
NOTE
Each ASON node has both an active and a standby SCC board. If an ASON node is faulty, a possible cause is that both the active and standby SCC boards are damaged. Preferably, prepare two spare SCC boards.
3. Obtain the DCN configuration data (including the NEID, NEIP, subnet mask, NODEID, and OSPF IP address) from the NMS side, and use this data to set the basic parameters of the system after the damaged SCC hardware is replaced.
4. Save the alarm information of the faulty node on the NMS side and use it for alarm comparison after the node fault is rectified. Check whether any new alarms are reported after the node fault is rectified.
5. Save the ASON service information on the NMS side if the faulty node is the source node of the ASON service; the ASON service information can be exported as a report.
After arriving at the site, perform the following operations to replace hardware:
1. Replace the faulty SCC board or clear the database (for details, see section "Replacing the SCC Board" in the Parts Replacement).
2. After reaching the site, check whether the system works normally according to the LED indicator status of the SCC board (for details, refer to the Hardware Description about indicators) or the access device.
3. If the active SCC board cannot work normally and the indicator status of the standby SCC board is normal, remove the active SCC board and use the standby SCC board as the active board to rectify the fault.
4. If the standby SCC board works normally, check through the NMS whether the interrupted service is recovered. If the standby SCC board also cannot work normally, replace the SCC board (return the faulty SCC board to the R&D department for fault location) as instructed in the section "Upgrading the SCC Board" in the Parts Replacement.
5. If no standby SCC board is available, clear the database of the SCC board through the DIP as instructed in section "Upgrading the SCC Board" in the Parts Replacement (if the SCC board still cannot be started normally, replace the SCC board).
On the site, set basic parameters for the system: If the SCC board hardware has been replaced, log in to the device through the LCT or the Navigator tool of the NMS and modify the basic parameter settings of the device, including the NEID, NEIP, subnet mask, NODEID, and OSPF IP address (Note: These parameters must be obtained beforehand from the NMS side). If the SCC board hardware does not need to be replaced, skip this step.
1. Connect the portable computer to the Ethernet management port of the NE (NM_ETH of the WDM product), and log in to the device by using the WEBLCT or Navigator.
2. Set basic parameters for the NE, including the NEID, NEIP, subnet mask, NODEID, and OSPF IP address.
Run the cm-set-ip command to set the IP address of the NE.
Run the cm-set-sndip command to set the OSPF IP address.
Run the cm-set-submask command to set the subnet mask of the NE.
Run the cfg-set-gcpnodeid command to set the NODEID parameter.
Run the cm-set-neid command to set the NEID parameter (Note: For certain products, the SCC board is automatically reset once after the NEID parameter is set).
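Before the parameters are entered on site, the DCN data copied from the NMS can be sanity-checked offline. The following is a minimal sketch, not a Huawei tool: the dictionary keys and the function name are hypothetical, and it assumes that the NEIP, NODEID, and OSPF IP address are written in dotted-decimal IPv4 format while the NEID is treated as an opaque string.

```python
import ipaddress

REQUIRED = ("NEID", "NEIP", "subnet mask", "NODEID", "OSPF IP address")


def check_dcn_parameters(params):
    """Return a list of problems found in the DCN parameters; empty means OK."""
    problems = []
    values = {key: str(params.get(key, "")).strip() for key in REQUIRED}
    for key, value in values.items():
        if not value:
            problems.append(f"missing: {key}")
    # Assumption: these three parameters are dotted-decimal IPv4 values.
    for key in ("NEIP", "NODEID", "OSPF IP address"):
        if values[key]:
            try:
                ipaddress.IPv4Address(values[key])
            except ValueError:
                problems.append(f"not an IPv4 address: {key}={values[key]}")
    if values["subnet mask"]:
        try:
            # IPv4Network rejects masks whose one-bits are not contiguous.
            ipaddress.IPv4Network("0.0.0.0/" + values["subnet mask"])
        except ValueError:
            problems.append(f"invalid subnet mask: {values['subnet mask']}")
    return problems
```

An empty result only means that the values are well formed; it does not confirm that they match the values planned for the node.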
Step 3 Restore the database.
1. Restore the backup database in the CF card
If the NE is started after the database is cleared, run the sftm-show-dir command to query the backup database in the dbbackup directory in the CF card.
Run the :pe-reover-data-ext:17, db, "db100718.gz" command to restore the configurations. If the configurations are recovered successfully, the SCC board is automatically reset (17 indicates the bit ID of the SCC board, and db100718.gz indicates the database backup filename that is queried by running the sftm-show-dir command).
If the NE can be started normally, proceed to the next step to check whether the interrupted service is recovered. If the NE cannot be started normally, clear the database and try another backup database. If none of the backup databases is usable, proceed to the next step.
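When several backup databases are listed, it helps to decide beforehand in which order to try them. The sketch below sorts filenames newest first. It assumes, based only on the example name db100718.gz, that the six digits encode the backup date as YYMMDD; this naming rule is an assumption and should be verified against the actual sftm-show-dir output. The function names are hypothetical.

```python
import re
from datetime import date

# Assumed pattern: "db" + YYMMDD + ".gz", inferred from db100718.gz.
_PATTERN = re.compile(r"^db(\d{2})(\d{2})(\d{2})\.gz$")


def backup_date(filename):
    """Return the date encoded in the filename, or None if it does not match."""
    match = _PATTERN.match(filename)
    if not match:
        return None
    yy, mm, dd = (int(group) for group in match.groups())
    try:
        return date(2000 + yy, mm, dd)
    except ValueError:
        return None  # digits do not form a valid calendar date


def restore_order(filenames):
    """Return the recognized backup files, most recent first."""
    dated = [(backup_date(name), name) for name in filenames]
    usable = [pair for pair in dated if pair[0] is not None]
    return [name for _, name in sorted(usable, reverse=True)]
```

Files whose names do not match the assumed pattern are left out of the result and need to be assessed manually.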
2. Download and restore the database
Log in to the faulty NE through the NMS. Because the SCC board is replaced or the database is cleared, all NE user information is lost; therefore, you must reconfigure the NE user information. Add an NE account first by running the sm-add-user command.
After logging in to the NE, choose System > NE software management > NE data backup or restoration from the main menu of the NMS client. Then, the NE view is displayed.
Select the faulty NE and click Restore. The Restore view is displayed.
In the Filename drop-down list, select the file to be restored. Select the most recently backed-up file for restoration. You can click Browse to view the file.
Set Activation type to Restart.
After you select Deliver the configurations to the board, the [icon] button becomes [icon].
Click Start to restore the database.
After the database is restored successfully, proceed to the next step to check whether the interrupted service is recovered. If the NE cannot be started normally, clear the database and try another backup database. If none of the backup databases is usable, restore the configurations manually.
Step 4 Verify service restoration.
After the system is started normally, an interrupted static service can be recovered. After a node fault is rectified, verify the service recovery by performing the following operations:
1. In the ASON Trail Management window of the NMS, synchronize the ASON service, check whether the ASON service alarm is cleared, check whether the service attributes are correct, and check whether all interrupted static services are recovered. If some ASON services are still interrupted, attempt to restore them as instructed in section 2.3 "Quick Recovery Process for ASON Services". If the service trails do not meet the original planning requirements, adjust the service trails through optimization.
2. Perform a health check on the entire network, and ensure that the node fault is rectified without causing other faults.
NOTE
During the node fault, the database on the NMS does not store the updated data if a dynamic service is changed (for example, newly created, optimized, or rerouted). After the database is restored from a backup database, it is possible that the service is lost or the service trail does not conform to the planning. In this case, manual intervention is required.
----End
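The alarm comparison mentioned in the preparation step (save the alarms before the repair, then check for new alarms afterwards) amounts to a set difference. A minimal sketch follows, assuming each alarm is represented as a hypothetical (location, alarm name) pair of strings; the real NMS export has more fields.

```python
def compare_alarms(saved_before, current_after):
    """Compare alarms saved before the repair with alarms present afterwards."""
    before, after = set(saved_before), set(current_after)
    return {
        "new": sorted(after - before),      # raised since the repair
        "cleared": sorted(before - after),  # gone after the repair
    }


before = [("NE13 slot 2", "R_LOS"), ("NE13 slot 17", "HARD_BAD")]
after = [("NE13 slot 2", "R_LOS"), ("NE13 slot 4", "MUT_LOS")]
print(compare_alarms(before, after))
```

Alarms listed under "new" are the ones to investigate before the fault is considered rectified.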
2.5.4 Restoring the Configurations Manually
This section describes how to manually restore configurations in the scenario where a node fault occurs but the service is not interrupted. The operation prerequisites are as follows:
Step 1 The service is not interrupted.
Step 2 Backup databases are available for the faulty node, but it is impossible to determine whether the databases are backed up in real time.
Note that the restoration process is complex and time-consuming, and must be performed by R&D personnel or under their guidance. In addition, the restoration mode varies from case to case. Therefore, the following describes only the operation principles rather than the detailed operation procedure.
----End
Determining Whether the Database Is Backed Up in Real Time
Step 1 After obtaining the backup database of the faulty node, the R&D personnel download the database to a mirroring environment for restoring the configurations.
Step 2 Check the static cross-connection configurations of the faulty node according to the information saved on the NMS.
Step 3 According to the upstream or downstream node information, check the cross-connection configurations of the ASON service that traverses the node.
As shown in the following figure, the R&D personnel download the backup database of the NE13 node to a mirroring environment for restoring the configurations after the NE13 node is faulty. After the configurations are restored, a NE11-NE13-NE12 ASON service should have an ASON cross-connection between port 1 on slot 4 and port 1 on slot 2 on the NE13 node. The detailed channel or wavelength information can be obtained from the upstream NE11 node or from the downstream NE12 node. In this way, the R&D personnel can verify the cross-connections of all the ASON services that traverse the NE13 node.
----End
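The check described above is a comparison between the cross-connections expected on the node (derived from the NMS and from the upstream and downstream nodes) and those found in the restored database. A minimal sketch, assuming a cross-connection is identified only by its two (slot, port) endpoints; the helper names are hypothetical, and real records also carry channel or wavelength information.

```python
def xc(slot_a, port_a, slot_b, port_b):
    """Represent a cross-connection as an unordered pair of (slot, port) endpoints."""
    return frozenset({(slot_a, port_a), (slot_b, port_b)})


def compare_cross_connections(expected, restored):
    """Return the cross-connections that are missing from, or extra in, the restored database."""
    expected, restored = set(expected), set(restored)
    return {"missing": expected - restored, "unexpected": restored - expected}


# NE13 example: the NE11-NE13-NE12 service needs slot 4 port 1 <-> slot 2 port 1.
expected = [xc(4, 1, 2, 1)]
restored = [xc(2, 1, 4, 1)]  # same connection, endpoints listed in reverse order
print(compare_cross_connections(expected, restored))
```

An empty "missing" set suggests that the backup reflects the current cross-connections for the services that were checked.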
Restoring the Configurations When the Database Is Not Backed Up in Real Time
Step 1 If the static cross-connections are not consistent with the information saved on the NMS, add the missing static cross-connections.
Step 2 If the ASON cross-connections are not consistent, degrade the nodes in a mirroring environment and add the missing static cross-connections.
----End
Restoring the Database
Step 1 On the NMS, degrade all the ASON services that traverse the node. If the node is an intermediate node, degrade the ASON services at their source or sink node. If the node is the source node, degrade the ASON service at its sink node. If the node is the sink node, degrade the ASON service at its source node.
Step 2 Use the backup database that has been processed to match the real-time configurations to restore the database of the faulty node.
Step 3 Upgrade the previously degraded services to ASON services again.
----End
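The rule in Step 1 for where to degrade a service can be written as a lookup on the role that the faulty node plays in the service. This is a sketch for illustration only; the role names and the function are hypothetical.

```python
# Role of the faulty node in the service -> end(s) at which the service may be degraded.
DEGRADE_AT = {
    "intermediate": ("source", "sink"),
    "source": ("sink",),
    "sink": ("source",),
}


def degrade_location(faulty_node_role):
    """Return the service end(s) from which the ASON service can be degraded."""
    try:
        return DEGRADE_AT[faulty_node_role]
    except KeyError:
        raise ValueError(f"unknown role: {faulty_node_role}") from None


print(degrade_location("source"))
```

The point of the rule is that the degrade operation is always issued from an end of the service that is not the faulty node.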
Verification of Service Restoration and Follow-Up Operations
After the node fault is rectified, verify the service restoration by performing the following operations:
Step 1 In the ASON Trail Management window, synchronize the ASON services, and ensure that no ASON services are lost or redundant and that the ASON services are activated normally. Check whether the ASON service attributes are correct. If not, correct the service attributes. Check whether the configuration data (for example, preset restoration trails) is lost and whether the original trail information is lost. If some information is lost, reconfigure it according to the information saved on the NMS side.
Step 2 Check whether each ASON service generates any alarm. If yes, locate and handle the fault according to the alarm information. Perform a health check on the entire network, and ensure that the recovery does not cause any other fault.
----End
2.5.5 Restoration Process If No Backup Database Is Available
When a node is faulty but no backup database is available for the node, or when the backup database of the faulty node is corrupted, the maintenance personnel must restore the interrupted service according to the configuration information on the NMS side. The process of restoring the configurations manually is similar to the process of rectifying an ASON node fault that does not cause any service interruption.
Add the missing static cross-connections according to the static cross-connection configurations on the NMS side and the upstream or downstream node information of the ASON services that traverse the node.
Use the downloaded database to restore the configurations.
3 Fault Diagnosis for ASON Service Interruptions on NG WDM Devices
3.1 General Fault Diagnosing Procedure
3.1.1 Overview
In case of a service interruption, first confirm the basic service information (for details, see "Obtaining Information About the Faulty Service" in section 3.1.3 "Troubleshooting Procedure"), restore the service, and collect the basic fault information such as historical alarms and events for fault diagnosis. This section describes only the procedure for diagnosing interruptions of optical-layer ASON services. It does not apply to interruptions of optical-layer static services.
3.1.2 Flowchart
3.1.3 Troubleshooting Procedure
Detecting a Service Interruption
In case of an ASON service interruption on NG WDM equipment, the fault symptoms are as follows:
The client equipment reports the service interruption.
The network management system (NMS) reports a CPW_OCH_SER_INT alarm.
Obtaining Information About the Faulty Service
Maintenance personnel must confirm the following information as soon as possible to perform quick service restoration:
Start time of the service interruption.
Specific service that is interrupted, protection level and route of the service, and whether the service traverses regeneration boards.
Whether the service is an ASON service or a static service.
Alarms that are generated during the service interruption.
Whether the service is rerouted if the service is an ASON service and whether alarms indicating that the service is not on the original path are reported.
Performing Quick Service Recovery
After confirming the basic information, restore the service quickly by referring to chapter 2 "Quick Recovery Guide for NG WDM ASON Services."
Obtaining Historical Alarms
Perform the following operations on the NMS to collect historical alarms that are reported during a service interruption.
Step 1 Choose Fault > Browse History Alarm on the main menu. In the Filter window that is displayed, click the Basic Setting tab, select all check boxes in the Severity and Type areas, and retain default values for other parameters. Then click OK.
Step 2 Right-click in the Browse History Alarm window that is displayed, and choose Select All from the shortcut menu. Then choose Save > Save All Records to save all historical alarms into an Excel file.
----End
Identifying OTU Boards Where Abnormal Alarms Are Reported During the Service Interruption
Check the abnormal alarms in the queried historical alarms to identify the OTU boards where service-affecting alarms are reported during the service interruption and the start time and end time of these alarms. Check for the following alarms on OTU boards:
WDM-side port-level alarms: R_LOS, OTUk_LOF, OTUk_LOM, OTUk_EXC, OTUk_DEG, OTUk_SSF, IN_PWR_HIGH, and IN_PWR_LOW
Board-level alarms: BD_STATUS, BUS_ERR, HARD_BAD, HARD_ERR, and COMMUN_FAIL
NOTE
If the listed alarms are not in the historical alarm list, the WDM side functions properly. When this occurs, check for an electrical-layer service fault or a client-side fault. The detailed fault diagnosis is not provided in this document.
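Once the historical alarms are exported, the search for service-affecting alarms can be scripted. The sketch below is illustrative only. It assumes each alarm has already been loaded into a dictionary with hypothetical keys name, raised, and cleared (cleared is None while the alarm is still active), and it assumes that a rate-specific name such as OTU2_LOF corresponds to the generic OTUk_LOF entry in the lists above.

```python
import re
from datetime import datetime

WDM_PORT_ALARMS = {"R_LOS", "OTUk_LOF", "OTUk_LOM", "OTUk_EXC", "OTUk_DEG",
                   "OTUk_SSF", "IN_PWR_HIGH", "IN_PWR_LOW"}
BOARD_ALARMS = {"BD_STATUS", "BUS_ERR", "HARD_BAD", "HARD_ERR", "COMMUN_FAIL"}


def generic_name(name):
    # Assumption: OTU1_/OTU2_/... prefixes map to the generic OTUk_ form.
    return re.sub(r"^OTU\d+_", "OTUk_", name)


def service_affecting(records, start, end):
    """Return listed alarms that were active at some point between start and end."""
    wanted = WDM_PORT_ALARMS | BOARD_ALARMS
    hits = []
    for rec in records:
        cleared = rec["cleared"] or datetime.max  # still active: never cleared
        overlaps = rec["raised"] <= end and cleared >= start
        if overlaps and generic_name(rec["name"]) in wanted:
            hits.append(rec)
    return hits
```

If the function returns nothing for the interruption window, the NOTE above applies: check for an electrical-layer or client-side fault instead.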
Locating the Specific Faulty ASON Service
Obtain the slot ID, port number, and wavelength information about the OTU boards based on the reported alarms to locate the specific faulty ASON service. Then collect information such as the service ID, service adding and dropping sites, boards, ports, and wavelengths that the service traverses for further fault diagnosis.
NOTE
If the service is not in the ASON service list, the service may be an end-to-end static service. When this occurs, check for a static service fault. The detailed fault diagnosis is not provided in this document.
Obtaining Historical Events
On the NMS, choose Fault > Browse Events Log from the main menu. Right-click the window that displays the queried historical events, and choose Save > Save All Records to save all historical events into an Excel file.
Checking for Rerouting Events
After locating the specific faulty ASON service, check for service rerouting events in the historical event list and handle the fault accordingly.
1. If the ASON service was successfully rerouted, the service became unavailable after being rerouted to the current path from another path. When this occurs, see section 3.2 "Diagnosing the Fault that a Service Is Unavailable After Being Successfully Rerouted" to check for a system fault rather than a fault in the ASON protocol.
2. If rerouting of the ASON service was attempted but failed, see section 3.3 "Diagnosing the Fault that Service Rerouting Fails" to find the cause of the rerouting failure.
3. If the ASON service was not rerouted (in other words, no rerouting event is reported), see section 3.4 "Diagnosing the Fault that a Service Is Not Rerouted" to diagnose the fault accordingly.
If the historical event list contains no event for the period of the service interruption, it is probable that the communication between the NE and the NMS is abnormal or there are too many network events. When this occurs, contact Huawei R&D engineers to collect the ASON log at the first node of the faulty service to obtain the service rerouting information.
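The branching above can be summarized as a small function. This is a sketch under stated assumptions: the inputs (a count of events in the interruption window and a time-ordered list of rerouting outcomes for the faulty service) are hypothetical, and using the most recent outcome to decide between sections 3.2 and 3.3 is an interpretation, not something this document specifies.

```python
def next_diagnosis_step(events_in_window, rerouting_outcomes):
    """Pick the follow-up action from the historical event list.

    events_in_window: number of events of any kind during the interruption.
    rerouting_outcomes: "success" / "failure" entries for the faulty service,
    oldest first.
    """
    if events_in_window == 0:
        # NE-NMS communication may be abnormal, or events were dropped.
        return "Contact Huawei R&D engineers to collect the ASON log at the first node"
    if not rerouting_outcomes:
        return "3.4 Diagnosing the Fault that a Service Is Not Rerouted"
    if rerouting_outcomes[-1] == "success":
        return "3.2 Diagnosing the Fault that a Service Is Unavailable After Being Successfully Rerouted"
    return "3.3 Diagnosing the Fault that Service Rerouting Fails"


print(next_diagnosis_step(12, ["failure"]))
```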
3.1.4 Exception Handling Process
If there is no abnormal service alarm on Huawei WDM equipment when a service is interrupted, the possible causes are as follows:
The client equipment is malfunctioning.
The OTU board for adding or dropping services is malfunctioning.
3.2 Diagnosing the Fault that a Service Is Unavailable After Being Successfully Rerouted
3.2.1 Overview
When a service is unavailable after being rerouted, check the alarm status on the OTU boards where the service is added and dropped and on the regeneration board that the service traverses. Locate the faulty point by analyzing the alarm status on the OTU, regeneration, and other boards on the service path. In this scenario, the ASON protocol is generally running properly and the probable faulty point is the WDM side.
3.2.2 Flowchart
3.2.3 Troubleshooting Procedure
Step 1 Check whether IN_PWR_HIGH or IN_PWR_LOW alarms are reported at the WDM-side port on the OTU or regeneration board and whether OTUk-level SF alarms such as OTUk_LOF or OTUk_LOM are reported. If these alarms are reported, the service interruption is caused by abnormal optical power on the line. Check and rectify the system optical power fault. If the fault cannot be diagnosed, proceed to the next step.
Step 2 Confirm the path of the faulty ASON service. After the path has been confirmed (for example, an NMS screenshot has been taken), contact Huawei R&D engineers to collect logs.
Step 3 Check whether MUT_LOS alarms are reported on the FIU board and R_LOS alarms are reported on the OTU boards on the new path after rerouting. If these alarms are reported, see section 3.4 "Diagnosing the Fault that a Service Is Not Rerouted" to diagnose the fault accordingly. If the alarms are not reported, proceed to the next step.
Step 4 Check whether optical power adjust (OPA) failure alarms are reported on the path. If the alarms are reported, the possible causes are as follows:
The path insertion loss is too high.
The path insertion loss is too low.
When this occurs, see section 3.5.1 "Diagnosing OPA Adjust Failures" to diagnose the fault.
Step 5 Check whether MUT_LOS alarms are reported on some optical-layer boards and R_LOS alarms are reported on OTU boards on the path. If these alarms are reported, the possible causes are as follows:
A fiber at a site is interrupted.
Physical fibers are incorrectly connected. For example, fibers between the OSC and FIU boards are incorrectly connected, fibers between optical and electrical subracks are incorrectly connected in optical-electrical subrack separation scenarios, and fibers at a site are incorrectly connected.
The OPA function incorrectly delivers attenuation. See section 3.5.2 "Diagnosing Incorrect Attenuation Delivery of OPA" to diagnose and rectify the fault accordingly.
An optical component is malfunctioning.
Step 6 Check for the following faults if OTUk_LOF alarms but no R_LOS alarms are reported on the OTU boards where the service is added and dropped or on the regeneration board:
The line optical power is abnormal.
The FEC types or ODUk rates on the OTU boards where the service is added and dropped mismatch.
The wavelength configuration at the receive end mismatches that at the transmit end. Do as follows to check the wavelength configuration of 40G boards at the receive end: Choose Configuration > WDM Interface in the NE Explorer of the NMS, click By Board/Port (Channel), and click the Advanced Attributes tab. If the returned wavelength value is not 0xff (this means that the wavelength configuration at the receive end has been manually modified) and the returned wavelength value is different from the wavelength value configured at the transmit end, you can confirm that the fault cause is wavelength configuration inconsistency.
There are wavelength conflicts. When this occurs, check whether physical fibers on the TN11RMU9 board are correctly connected.
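The receive-end wavelength check for 40G boards described in this list reduces to a two-part condition. A minimal sketch, assuming the values read from the Advanced Attributes tab are available as integers; the function name is hypothetical.

```python
UNMODIFIED = 0xFF  # value returned when the receive-end wavelength was not manually changed


def wavelength_mismatch(rx_value, tx_value):
    """Return True if the receive end was manually set to a wavelength other than the transmit one."""
    manually_modified = rx_value != UNMODIFIED
    return manually_modified and rx_value != tx_value


print(wavelength_mismatch(0xFF, 12))  # receive end untouched
print(wavelength_mismatch(15, 12))    # receive end changed to a different wavelength
```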
Step 7 Check for the following faults if OTUk_EXC and OTUk_DEG alarms but no R_LOS alarms are reported on the OTU boards where the service is added and dropped or on the regeneration board:
There are excessive bit errors on the path.
The line optical power is abnormal.
Step 8 Check for the following faults if COMMUN_FAIL, MOD_COM_FAIL, HARD_ERR, and MODULE_ADJUST_FAIL alarms are reported on some boards on the path:
The boards are not properly inserted.
Optical components on the boards are malfunctioning.
If the fault is still not diagnosed, contact Huawei R&D engineers.
----End
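Steps 5 to 8 of this procedure map alarm combinations to probable causes, which can be sketched as a lookup. The code below is a simplification for illustration: it works on a flat set of alarm names (spelled as in section 3.1.3) and ignores which board reported each alarm, and it treats any one of the alarms named in steps 7 and 8 as sufficient, which is an interpretation of the wording.

```python
def probable_causes(alarms):
    """Map alarm names seen on the service path to the causes listed in steps 5 to 8."""
    alarms = set(alarms)
    findings = []
    if {"MUT_LOS", "R_LOS"} <= alarms:
        findings.append(("Step 5", [
            "fiber interrupted at a site",
            "physical fibers incorrectly connected",
            "OPA delivered incorrect attenuation (see 3.5.2)",
            "optical component malfunctioning",
        ]))
    if "R_LOS" not in alarms and "OTUk_LOF" in alarms:
        findings.append(("Step 6", [
            "abnormal line optical power",
            "FEC type or ODUk rate mismatch",
            "receive/transmit wavelength configuration mismatch",
            "wavelength conflict",
        ]))
    if "R_LOS" not in alarms and alarms & {"OTUk_EXC", "OTUk_DEG"}:
        findings.append(("Step 7", [
            "excessive bit errors on the path",
            "abnormal line optical power",
        ]))
    if alarms & {"COMMUN_FAIL", "MOD_COM_FAIL", "HARD_ERR", "MODULE_ADJUST_FAIL"}:
        findings.append(("Step 8", [
            "board not properly inserted",
            "optical component on the board malfunctioning",
        ]))
    return findings


for step, causes in probable_causes({"OTUk_LOF", "HARD_ERR"}):
    print(step, causes)
```

The output is a starting list for on-site checks, not a diagnosis; steps 1 to 4 and the logs collected with Huawei R&D engineers still apply.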
3.2.4 Exception Handling Process
A board on the service path may be malfunctioning. When this occurs, scan each wavelength using the MCA board to locate the malfunctioning board. The detailed fault diagnosis is not provided in this document.
3.3 Diagnosing the Fault that Service Rerouting Fails
3.3.1 Overview
A service rerouting failure is generally caused by a lack of available paths on the network, either because wavelength resources are insufficient or because abnormal alarms are generated on links.
3.3.2 Flowchart
3.3.3 Troubleshooting Procedure
Step 1 A service rerouting failure is generally caused by a lack of available resources on the network or by a service creation failure. Therefore, contact Huawei R&D engineers to obtain the ASON logs for locating the path of the service.
Step 2 Check the rerouting logs on each site along the path for service creation failure records.
Step 3 If no service creation failure record is found, verify the following resource information on ASON NEs:
Whether the feature of ASON route computation using optical parameters is enabled. If the feature is enabled, disable the feature.
Wavelength resources (optical cross-connection configurations).
Available paths (fiber connection information).
Available TE links (links where no abnormal alarm is generated).
Basic configuration information such as FEC type, ODUk service rate, and optical module information on OTU boards (including regeneration boards).
----End
3.3.4 Exception Handling Process
If the fault cannot be diagnosed, it may be caused by an NE or ASON software defect. When this occurs, contact Huawei R&D engineers for fault diagnosis.
3.4 Diagnosing the Fault that a Service Is Not Rerouted
3.4.1 Overview
When a service is not rerouted after it is interrupted, the possible causes are as follows:
The service is locked. (Confirm this possibility immediately after the service is interrupted.)
The service is not a silver service. (Confirm this possibility immediately after the service is interrupted.)
A fiber at a site is interrupted.
To diagnose the fault, collect the ASON logs on all NEs on the path.
3.4.2 Flowchart
3.4.3 Troubleshooting Procedure
If a service is interrupted but is not rerouted, it is probable that no fiber cut occurs on the line side. In this case, you can diagnose the fault using the following procedure:
Step 1 Analyze the ASON logs to locate the service path in detail.
Step 2 Check for MUT_LOS or R_LOS alarms on the FIU or OSC boards on the path against the historical alarm list. If the alarms are not reported, there is no fiber cut on the line; proceed to the next step.
Step 3 Check whether IN_PWR_HIGH or IN_PWR_LOW alarms are reported at the WDM-side ports on the OTU boards where the service is added and dropped and whether OTUk-level SF alarms such as OTUk_LOF or OTUk_LOM are reported. If these alarms are reported, the service interruption is caused by abnormal optical power on the line. Check and rectify the system optical power fault.
Step 4 Check whether MUT_LOS alarms are reported on some optical-layer boards and R_LOS alarms are reported on OTU boards on the path. If yes, see step 5 in section 3.2.3 "Troubleshooting Procedure" to diagnose the fault.
Step 5 Check whether OTUk_LOF and OTUk_LOM alarms but no R_LOS alarms are reported on the OTU boards where the service is added and dropped or on the regeneration board. If yes, see step 6 in section 3.2.3 "Troubleshooting Procedure" to diagnose the fault.
Step 6 Check whether OTUk_EXC and OTUk_DEG alarms but no R_LOS alarms are reported on the OTU boards where the service is added and dropped or on the regeneration board. If yes, see step 7 in section 3.2.3 "Troubleshooting Procedure" to diagnose the fault.
Step 7 Check whether OPA adjust failure alarms are reported on the path. If yes, see section 3.5.1 "Diagnosing OPA Adjust Failures" to diagnose the fault and obtain operation logs on the NE where the alarms are reported.
Step 8 Check whether COMMUN_FAIL, MOD_COM_FAIL, HARD_ERR, and MODULE_ADJUST_FAIL alarms are reported on some boards on the path. If yes, the boards reporting the alarms are malfunctioning.
----End
3.4.4 Exception Handling Process
If the fault still cannot be diagnosed, it is probable that communication between subracks on an NE is abnormal, communication between the SCC board and other boards is abnormal, or the NE, board, or ASON software has a defect. When this occurs, contact Huawei R&D engineers.
3.5 Common Operations Involved in Fault Diagnosis
3.5.1 Diagnosing OPA Adjust Failures
Step 1 Obtain optical power at the reference point, insertion loss of the service path, and attenuation on the VAx board. The information includes:
Logical and physical board configuration information on the NE.
Permitted attenuation adjustment range and the specific adjustment values of the VOAs and EVOAs on the path. The navigation path on the NMS is Configuration > WDM Interface.
Insertion loss of the boards on the path. For the obtaining method, contact Huawei R&D engineers.
Nominal input and output optical power of the OA boards on the path. The navigation path on the NMS is Configuration > WDM Interface.
Input and output optical power of the malfunctioning OTU board. The navigation path on the NMS is Configuration > Optical Power Management.
OPA preset insertion loss (available for OptiX OSN 8800 V100R005 and later versions) on the path. For the obtaining method, contact Huawei R&D engineers.
Fiber connections on the NE. Do as follows to obtain this information:
− In the NE Explorer, click the NE icon.
− Choose Configuration > Fiber/Cable Synchronization in the navigation tree. In the Synchronized Fiber/Cable area, select all fiber connections and copy all the fiber connections into an Excel file.
DCN configurations for optical and electrical NEs in separated optical and electrical NE scenarios. For the obtaining method, contact Huawei R&D engineers.
Step 2 Assess whether the path information satisfies OPA adjust requirements. For the OPA working principle, see OPA in the Feature Description manual for OptiX OSN 8800.
----End
3.5.2 Diagnosing Incorrect Attenuation Delivery of OPA
Collect required logs and diagnose the fault using the following procedure:
Step 1 Query optical cross-connections on all NEs on the path. In the NE Explorer of an NE on the NMS, click the NE icon and choose Configuration > Optical Cross-Connection Management in the navigation tree. Click the NE-Level Optical Cross-Connection tab to view the NE-level cross-connection configurations. Check whether the optical cross-connections are configured in manual mode. If yes, the OPA on the path is malfunctioning. If OPA is not configured on the path, go to step 3.
Step 2 Check whether the probable fault point is an optical NE with separated optical and electrical subracks. If yes, contact Huawei R&D engineers to check the inter-NE OPA configuration. Check whether the optical NE contains information such as optical power and attenuation about the electrical NE. If the optical NE contains such information, proceed to the next step. If no such information is reported at the port that the faulty service traverses, contact Huawei R&D engineers to obtain the logs in the ofs1 area on the SCC board of the optical NE and send the logs back to Huawei for analysis.
Step 3 Query the attenuation configuration on all NEs on the path. Check for abnormal attenuation values on the path. For the operations involved in this step, contact Huawei R&D engineers.
Step 4 Query the insertion loss configuration on all NEs on the path. Check for excessively high insertion loss or preset insertion loss values on the boards on the path. For the operations involved in this step, contact Huawei R&D engineers.
Step 5 If the attenuation and insertion loss are normal, check whether the intra-board route information about the WSS board that the service traverses is normal. You need to obtain the following logs:
The bb1.log, OPLOG, and ERRLOG.log files in the OFS1 area on the active SCC board.
The bb10.log file in the OFS1 area on the WSS board. If the WSS board is TN11 series, obtain the bb10.log file of the board; if the WSS board is TN12 or TN13 series, obtain the bb10.log file of the SCC board in the subrack housing the WSS board. For example, to obtain the bb10.log file of the TN11WSM9 board, run the :log-query:bid,"bb0.log" command. To obtain the bb10.log file of the TN12WSM9 board housed in slot 1 in subrack 2, run the :log-query:2-18,"bb0.log" command in which 2-18 indicates the ID of the slot housing the SCC board in subrack 2.
Check for communication exceptions, abnormal board resets, abnormal intra-board routes, and abnormal attenuation delivery records in the logs. Faults that occur at the time when the service is interrupted are probable causes of the service interruption. Generally, a communication exception occurs because a board is not properly inserted or cables between subracks are incorrectly connected. Abnormal board resets may be caused by inappropriate manual operations. If there is no route information on the WSS board, the cause may be a communication failure.

Step 6 If no exception is found in the logs, check for the following faults:
1. The logical fiber connections are inconsistent with the physical fiber connections.
2. A board is malfunctioning.
3. The attenuation of the pigtail is excessively high.
----End
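The rule in Step 5 for where the bb10.log file is kept can be sketched as a small lookup. This is only an illustration: the function name and the returned strings are hypothetical, and the rule is taken solely from the TN11, TN12, and TN13 cases described above.

```python
def bb10_log_holder(board_name: str) -> str:
    """Return which board holds bb10.log for a WSS board (hypothetical helper).

    The series is assumed to be the first four characters of the board
    name, for example "TN11" in "TN11WSM9".
    """
    series = board_name[:4].upper()
    if series == "TN11":
        # TN11 series: the log is on the WSS board itself.
        return "WSS board"
    if series in ("TN12", "TN13"):
        # TN12/TN13 series: the log is on the SCC board of the subrack
        # that houses the WSS board.
        return "SCC board in the same subrack"
    raise ValueError(f"series {series!r} is not covered by the procedure")


if __name__ == "__main__":
    for name in ("TN11WSM9", "TN12WSM9"):
        print(name, "->", bb10_log_holder(name))
```

Board series other than the three named in the procedure raise an error instead of guessing a location.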
3.5.3 Confirming Information About a Faulty Service

In case of a service interruption, locate the path of the service and determine whether the service is an ASON service as soon as possible using the following procedure:

Step 1 On the NMS, choose Configuration > WDM ASON > ASON Trail Management from the main menu.
Step 2 In the Filter window that is displayed, select only the OCh check box in the Level area and click Filter All. The ASON Trail Management window is displayed.
The following describes each area in the ASON Trail Management window:
Area 1 displays the sink and source NEs of each service, the OTU boards where services are added and dropped, the wavelengths that carry services, and the current alarm status of each service.
Area 2 displays the current path, associated path, preset restoration path, and original path of a specific service after you select the service in area 1.
Area 3 displays the path of a specific service on the network topology after you select the service in area 1.

Step 3 Locate the source and sink NEs, service adding and dropping OTU boards, and wavelength of a service whose Alarm Status is Critical Alarm.
Step 4 After locating the service path, check whether Activation Status is Active, Class is Silver, and Rerouting Lockout is Unlocked for the service. The service can be rerouted only when these attributes have the specified values.

Step 5 In area 2, locate the FIU boards that the service traverses in two directions. Obtain the IDs of slots housing the FIU boards and check whether MUT_LOS and BD_STATUS alarms are reported on the FIU boards in the historical alarm list.
CAUTION
Take a complete screenshot of the NMS window, as shown in the figures above.
Take the screenshot immediately after the service interruption occurs.
----End
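The rerouting precondition in Step 4 can be expressed as a simple check. The following is a minimal sketch, not an NMS interface: the class and field names are hypothetical and only mirror the three attributes named in Step 4.

```python
from dataclasses import dataclass


@dataclass
class AsonService:
    # Hypothetical record; the fields mirror the NMS attributes in Step 4.
    name: str
    activation_status: str
    service_class: str
    rerouting_lockout: str


# Attribute values that Step 4 requires before a service can be rerouted.
REQUIRED = {
    "activation_status": "Active",
    "service_class": "Silver",
    "rerouting_lockout": "Unlocked",
}


def rerouting_blockers(svc: AsonService) -> list[str]:
    """List the attributes that prevent rerouting (empty list: none)."""
    return [
        f"{field} is {getattr(svc, field)!r}, expected {want!r}"
        for field, want in REQUIRED.items()
        if getattr(svc, field) != want
    ]


if __name__ == "__main__":
    svc = AsonService("OCh-1", "Active", "Silver", "Locked")
    print(rerouting_blockers(svc))
```

An empty result means that none of the three attributes blocks rerouting; it does not prove that a restoration route exists.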
3.5.4 Obtaining Performance Data

Step 1 On the NMS, choose Performance > Query WDM Performance from the main menu. In the left navigation tree, select the ASON domain where the fault occurs from Physical Root and click [button icon].
Step 2 Click the 15-Minute option button.
Step 3 Click the Current Performance Data tab to obtain the current performance data.
Step 4 Click the History Performance Data tab to obtain the historical performance data. Select all check boxes in the Performance Event Type and Display Options area and click Query.
Step 5 Click Save As to save the query result in an Excel file.
Step 6 Repeat the preceding steps to obtain the current and historical 24-hour performance data.
----End
3.6 Information to Be Collected

Collect the following information for fault diagnosis:
Historical alarms
Current alarms
Events
Logs in the mfs/log, mfs/ion, ofs1/log, ofs2/log, and ofs2/gcp directories on the SCC board
OTU board configurations
NE optical-layer configurations such as boards, optical cross-connections, attenuation, and insertion loss
Basic resource information in the ASON protocol
bb10.log files on some optical-layer boards
CAUTION
The space for saving log files is limited, and logs are overwritten after they have been saved for a specified time. Therefore, if a service interruption occurs, collect the required logs as soon as possible so that the logs recorded during the service interruption are not overwritten.
3.7 References

The following lists the reference documents and manuals:
Feature Description for OptiX OSN 8800
ASON User Guide
Alarms and Performance Events Reference
4 Typical ASON Service Troubleshooting Cases
4.1 ASON-Specific Operations and Configurations

4.1.1 Case 1: Creating an ASON Service Fails Because the Wavelength for the Service Is Reserved

Fault Description

A network uses Huawei's OptiX OSN 6800 devices, for which optical-layer ASON is enabled. A 10G optical service is planned for the link from SITE_C to SITE_B. According to the network plan, the 10G service is a silver service configured with two associated trails. For the service, SNC/N protection is configured on the client side. The working trail is planned as SITE_C -> SITE_2 -> SITE_B and wavelength 10 is planned for the working trail; the protection trail is planned as SITE_C -> SITE_1 -> SITE_A -> SITE_B and wavelength 50 is planned for the protection trail. When configuring the SNC/N protection, the customer successfully creates the working trail but fails to create the protection trail.

Network Topology

The following figure shows the network topology.
Cause Analysis

A fault diagnosis shows that Resource Reservation is enabled for wavelength 50 at SITE_A. After the reservation of wavelength 50 at SITE_A is released, the protection trail is created successfully.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check for abnormal alarms at the sites on the protection trail. If there are no abnormal alarms, go to the next step.
Step 2 Check TE links and ensure that they are up.
Step 3 Optimize the service that is successfully created (the service over wavelength 10) to the protection trail. If this operation is successful, go to the next step.
Step 4 Check NE-level optical cross-connections of the sites on the protection trail. If no optical cross-connections are configured for wavelength 50, go to the next step.
Step 5 Check whether the resources (wavelength 50 in this example) of a site on the protection trail, for example, SITE_A, are reserved. If they are reserved, release them.
----End
Conclusion and Suggestion

In the initial network design, wavelength 50 is planned for carrying a traditional service. To prevent this wavelength from being occupied by an ASON service during rerouting, the network designer specifies wavelength 50 at SITE_A as a reserved wavelength. The service plan is changed later so that wavelength 50 carries an ASON service, but the reservation of wavelength 50 is not released. As a result, creating the protection trail fails.
For a wavelength that is planned to carry a traditional service, it is recommended that users reserve this wavelength at a site that the service traverses to prevent this wavelength from being occupied by an ASON service during rerouting. If the service plan is changed, users must release this wavelength for the FIU board at the site.
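The check in Step 5 of this case amounts to looking for the planned wavelength in the reservation data of each site on the trail. The sketch below assumes that the reservation data has already been exported into a dictionary; the function name and data layout are hypothetical and do not represent an NMS interface.

```python
def sites_reserving(trail: list[str], wavelength: int,
                    reserved: dict[str, set[int]]) -> list[str]:
    """Return the sites on the trail where the wavelength is reserved.

    reserved maps a site name to the set of wavelength numbers for which
    Resource Reservation is enabled at that site (assumed input format).
    """
    return [site for site in trail if wavelength in reserved.get(site, set())]


if __name__ == "__main__":
    protection_trail = ["SITE_C", "SITE_1", "SITE_A", "SITE_B"]
    reservations = {"SITE_A": {50}}
    # Wavelength 50 is planned for the protection trail in this case.
    print(sites_reserving(protection_trail, 50, reservations))
```

Any site returned by the function is a candidate for releasing the reservation before the trail is created again.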
4.1.2 Case 2: ASON NEs Reset Because of a Duplicated OSPF IP Address

Fault Description

A network is relatively large and uses Huawei's OptiX OSN 6800 devices, for which optical-layer ASON is enabled. During a specific time period, some NEs reset, and each time the NEs are different.

Network Topology

N/A

Cause Analysis

A fault diagnosis shows that two NEs on the network use the same OSPF IP address. In this situation, the OSPF protocol runs abnormally. As a result, NEs on the network reset.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Review the reset logs to check whether the OSPF protocol runs normally when separate optical and electrical NEs are configured. If the OSPF protocol runs abnormally, go to the next step.
Step 2 Check the OSPF IP addresses of network-wide NEs. If two NEs use the same OSPF IP address, correct the OSPF IP addresses for the two NEs.
----End

Conclusion and Suggestion

This issue is caused by incorrect configuration and can be prevented through correct configuration. Plan the OSPF IP addresses, IP addresses, NE IDs (NEID), and node IDs (NODEID) for NEs and check them carefully to ensure that they are unique. Then configure NEs in strict compliance with the network plan.
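The uniqueness check recommended above can be run against a planning table before NEs are configured. The following is a minimal sketch that assumes the plan is available as a list of dictionaries; the key names (ospf_ip, ip, ne_id, node_id) and the sample values are hypothetical.

```python
from collections import defaultdict


def find_duplicates(ne_table: list[dict],
                    fields=("ospf_ip", "ip", "ne_id", "node_id")) -> dict:
    """Map each field to {value: [NE names]} for values shared by several NEs."""
    result = {}
    for field in fields:
        seen = defaultdict(list)
        for ne in ne_table:
            seen[ne[field]].append(ne["name"])
        dups = {value: names for value, names in seen.items() if len(names) > 1}
        if dups:
            result[field] = dups
    return result


if __name__ == "__main__":
    # Sample plan with illustrative values only.
    plan = [
        {"name": "NE1", "ospf_ip": "10.0.0.1", "ip": "129.9.0.1",
         "ne_id": "9-1", "node_id": "1.1.1.1"},
        {"name": "NE2", "ospf_ip": "10.0.0.1", "ip": "129.9.0.2",
         "ne_id": "9-2", "node_id": "1.1.1.2"},
    ]
    print(find_duplicates(plan))
```

An empty dictionary means that every planned value is unique within the table.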
4.1.3 Case 3: An Attempt to Start the TE Link Management Window Times Out

Fault Description

A network uses Huawei's OptiX OSN 8800 devices and consists of several sub-networks, for which WDM ASON, OTN ASON, and SDH ASON are enabled. The three sub-networks are managed using the same NMS. When users attempt to start the WDM ASON TE link management window using the NMS, timeout often occurs.
Network Topology

N/A

Cause Analysis

A fault diagnosis shows that duplicated node IDs are used for site D. As a result, the OSPF protocol repeatedly creates and deletes links and too many link change events are reported to the NMS. Eventually, timeout occurs because a large amount of data has been generated on the NMS.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check the NMS logs to verify that an NE reports a large number of events (for example, 7000) within one second.
Step 2 Query the performance events. If there are link bandwidth change events and link basic information events, check the basic attributes of the links by checking the parameters in the performance events. If the basic attributes remain unchanged, go to the next step.
Step 3 Check link-related operations. If there are records showing that some links are deleted or added for the same NE, it is probable that duplicated node IDs have been used for the links.
Step 4 Query the node IDs of all NEs using the NMS. If two NEs use the same NODEID, reconfigure the node IDs for the two NEs according to the network plan.
----End
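The check in Step 1 can be approximated by counting events per NE per second. The sketch below assumes that the NMS log has already been parsed into (second, NE name) pairs; this input format is hypothetical, and the default threshold of 7000 is simply the example figure from Step 1.

```python
from collections import Counter


def find_event_bursts(events, threshold: int = 7000) -> dict:
    """Return {(second, ne): count} for NEs that reach the threshold.

    events is an iterable of (second, ne_name) pairs, where second is the
    event timestamp truncated to whole seconds (assumed input format).
    """
    counts = Counter((second, ne) for second, ne in events)
    return {key: n for key, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    sample = [(0, "SITE_D")] * 7000 + [(1, "SITE_A")] * 3
    print(find_event_bursts(sample))
```

An NE that appears in the result is a candidate for the node ID check in Step 4.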
Conclusion and Suggestion

An operation timeout generally occurs under the following conditions:
1. The NMS receives too many messages.
2. The operation-targeted NE is so busy that it fails to respond to a message sent by the NMS.

When the NMS receives too many messages, the cause is generally one of the following configurations:
1. More than one NE in one ASON domain is configured as the communication NE for communicating with the NMS. (In general, only one NE should be configured as the communication NE in one ASON domain. The following figure shows an example of correct NE configuration in one ASON domain.)
2. ECC messages are transmitted between different ASON domains. To verify this, check whether the source and sink NEs are in the same domain and whether the OSPF protocol is enabled, as shown in the figure below.
3. Duplicated node IDs are used for the NEs in the same ASON domain.
4.1.4 Case 4: Service Protection Level Fails to Change After a Line Board Participating in the Protection Is Moved to Another Slot

Fault Description

A network uses Huawei's OptiX OSN 8800 devices, for which electrical-layer ASON is enabled. A 10G service is planned for the link from SITE_C to SITE_A. According to the network plan, the 10G service is a diamond electrical ODU2 service. For the service, the working trail is planned as SITE_C -> SITE_B -> SITE_A and the protection trail is planned as SITE_C -> SITE_E -> SITE_D -> SITE_A. During network deployment, the service is created successfully. Later, the ND2 board in slot 12 at SITE_C must be moved to slot 2. Before moving the board, the customer changes the protection level of the service from diamond to silver. After moving the board from slot 12 to slot 2, the customer establishes the fiber connections properly and verifies the TE links on the board. Then the customer attempts to change the protection level of the service back to diamond; however, the NMS reports an error message.
Network Topology

The following figure shows the network topology.

Cause Analysis

A fault diagnosis shows that the logical board for the ND2 board in slot 12 at SITE_C is retained after the ND2 board is moved from slot 12 to slot 2. As a result, the signaling interface cannot be updated accordingly, leading to a failure to change the service protection level back to diamond. After the logical board is deleted, the signaling interface is updated accordingly. At this point, the service protection level can be changed back to diamond successfully.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check the site name for which the NMS reports an error message. (In this example, the site name is SITE_C.)
Step 2 Check information about the TE links between SITE_C and SITE_B to ensure that the TE links are in normal state. In addition, verify that the ND2 board in slot 2 at SITE_C and the ND2 board in slot 7 at SITE_B are mutual remote ends.
Step 3 Check the signaling interface information. If the remote end of the TE link on the ND2 board in slot 2 at SITE_C is empty but the remote end of the TE link on the ND2 board in slot 12 at SITE_C is the ND2 board in slot 7 at SITE_B, go to the next step.
Step 4 Check whether the logical board for the ND2 board that is originally installed in slot 12 at SITE_C is retained. If yes, delete it, then change the service protection level back to diamond.
----End
Conclusion and Suggestion

After moving a line board from one slot to another on a network where electrical-layer ASON is enabled (for example, the ND2 board at SITE_C on the working trail), delete the logical board that is retained for the line board in the original slot, and check whether the TE link information for the signaling interface has been updated accordingly. If there are sufficient network resources, you can configure another trail as the working trail for the service before moving the line board. After moving the line board, reconfigure the trail that traverses the moved line board as the working trail.
4.1.5 Case 5: Deleting Inter-NE Fiber Connections Fails When Separate Optical and Electrical NEs Are Configured

Fault Description

A network uses Huawei's OptiX OSN 6800 devices, for which optical-layer ASON is enabled. When separate optical and electrical NEs are configured for the network, the customer attempts to use the NMS to delete a fiber connection between an OTU board on an electrical NE and an M40 board on an optical NE; however, the NMS reports a failure message.

Network Topology

N/A

Cause Analysis

A fault diagnosis shows that a 10G optical-layer ASON service is configured for the OTU board. This ASON service leads to a failure to delete the fiber connection between the OTU board and the M40 board.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Identify the source and sink nodes, slots, and ports of the fiber connection to be deleted.
Step 2 Check the configurations of the source and sink nodes, boards, and ports. If an optical-layer ASON service is configured for either the source or sink port, downgrade the ASON service to a traditional service. Then delete the fiber connection.
----End
Conclusion and Suggestion

When separate optical and electrical NEs are configured, a fiber connection between the optical and electrical NEs cannot be deleted if an ASON service is configured or resource reservation is enabled for the source or sink port of the fiber connection. To delete such a fiber connection, use the following methods:
1. If an ASON service is configured for the fiber connection, first downgrade the ASON service to a traditional service. Then delete the fiber connection.
2. If no ASON service is configured but TE links at either end of the fiber connection are reserved for ASON services, change the value of Revertive Mode to Non-Revertive for all the ASON services on the NE, as shown in the following figure. Then you can delete the fiber connection.
4.1.6 Case 6: Route Computing Fails During Creation, Optimization, or Rerouting of an ASON Service or During an Upgrade of a Static Service to an ASON Service

Fault Description

The following dialog box is often displayed when an attempt is made to create, optimize, or reroute an ASON service, or to upgrade a static service to an ASON service.

Network Topology

N/A

Cause Analysis

The possible causes of a route computation failure are as follows:
1. The control link is unreachable. In other words, there are no physical paths from the source to the sink.
2. A link fault has occurred. For example, if a link break or downgrade fault occurs when the ASON software searches for routes between the source and sink nodes, the ASON software will be unable to find the sink node.
3. No idle channels are available. For optical-layer ASON, no end-to-end uniform idle timeslots are available.
4. The add wavelengths are duplicated with the drop wavelengths at the source or sink node.
5. Fiber connections are configured incorrectly for the source and sink nodes or for the intermediate nodes.
6. If regeneration boards are configured at the optical layer, the possible causes are:
a) Logical fiber connections are incorrectly configured for the regeneration boards.
b) The optical module types of the regeneration boards do not match the optical module types of the add/drop boards.
c) The service rates of the regeneration boards do not match the service rates of the add/drop boards.
d) The FEC settings of the regeneration boards are inconsistent with the FEC settings of the add/drop boards.
Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check whether the control link is reachable. In the NE Explorer, choose Configuration > WDM ASON > WDM Control Link Management to check information about the control link. If a node that the ASON service has to traverse is isolated, handle the control link fault to ensure that the control link is reachable.
Step 2 In the NE Explorer, choose Configuration > WDM ASON > WDM Control Link Management. Check all of the TE links that the ASON service may traverse and ensure that Alarm Status is No Alarm and Link Status is Up for the TE links. If there are any TE link faults, handle them before performing the next step.
Step 3 Check whether idle channels are available. First, determine by visual inspection the trails that the ASON service may traverse. Then check the channel status (either on FIU or OTU boards) for the trails one by one. In addition, ensure that channels are not reserved. The following figure shows an example for navigating to the channel information on FIU boards.
Step 4 Check whether add wavelengths are duplicated with drop wavelengths at the source or sink node. Ensure that each wavelength is used for carrying only one service in the same direction.
Step 5 Ensure that all fiber connections are configured correctly. Focus on checking the fiber connections for the boards that are newly inserted after the deployment commissioning.
Step 6 If regeneration boards are used, check the configurations of the regeneration boards. Ensure that the fiber connections of the regeneration boards are configured correctly. Then check the optical module types, service rates, and FEC settings of the regeneration boards to ensure that they match those of the add/drop boards.
----End
Conclusion and Suggestion

Routes can be computed successfully for ASON services only when the logical fiber connections are correctly configured and there are idle link and channel resources. Route computation generally fails because a link is faulty or the required channels are unavailable. Therefore, to pinpoint the root cause of a route computation failure, users must check the status of the required link and channel resources. It is recommended that preset restoration trails be configured for ASON services. By configuring preset restoration trails, users can determine the trail to which an ASON service is rerouted and at the same time ensure the quality of that trail.
4.1.7 Case 7: A Newly Created ASON Service Fails to Traverse a Node or an Existing ASON Service Fails to Traverse a Node During Trail Optimization or Rerouting

Fault Description

A newly created ASON service fails to traverse a node or an existing ASON service fails to traverse a node during trail optimization or rerouting. The failure message shows that the outbound interface on the specified trail is down.
Network Topology

N/A

Cause Analysis

An end-to-end trail can be computed successfully as long as the information about network-wide links is complete, even if information about a link interface of a node is missing. However, when a trail for a service (either a newly created service or an optimized or rerouted service) traverses the node, the control plane checks the correctness of the interface information for the node. The check fails because the link interface information is missing, and therefore the end-to-end service fails to be created.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check the error code using the NMS or by running commands and verify that the error code indicates that the outbound interface of the service is down.
Step 2 Identify the node where the interface is missing. Then use the NMS or run commands to retrieve information about the node.
Step 3 Check whether the node information contains the interface information and whether the interface information is complete.
Step 4 If the interface information is incomplete, perform a warm reset on the SCC board on the NE.
----End

Conclusion and Suggestion

This case provides a typical fault diagnosis procedure based on error information. To apply it, users must understand how an ASON trail is established and what the error codes mean. With this understanding, users can quickly locate the fault and handle it correctly.
4.1.8 Case 8: LMP Protocol Check Fails Due to DCN Errors and Consequently Service Deployment Fails

Fault Description

A network uses Huawei's OptiX OSN 6800 devices, for which electrical-layer ASON is enabled. NEs on the network are connected using optical amplifier boards. SITE-A is the first site. When an attempt is made to create an ODU2 ASON silver service between SITE-A and SITE-B, the ASON software responds with a route computation failure message, and the NMS displays an error code of 40497.

Network Topology

The following figure shows the network topology.
The following figure shows the fiber connections inside SITE-A and SITE-B.

Cause Analysis

A fault diagnosis shows that at SITE-A the line board connecting to SITE-B has DCN errors. As a result, the LMP protocol check fails for the line board and the TE links on the line board are in abnormal state.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Specify SITE-A's line board that connects to SITE-B as the explicit board and create an ASON service. If the system responds with a route computation failure message and the NMS displays an error code of 40497, go to the next step.
Step 2 Check the links generated on the line board. If ODU2 links on the line board are not displayed in the TE link management window, go to the next step.
Step 3 Check information about the line board. If many DCN errors are generated, replace the board.
----End

Conclusion and Suggestion

This case describes a typical issue in which creating ASON services fails because TE links are unreachable due to hardware faults.
Therefore, before deploying an ASON service, ensure that the boards that the planned service trail traverses are working properly.
4.1.9 Case 9: ASON Services Fail to Be Deployed Because Line Attenuation Is Excessively High

Fault Description

When users attempt to create optical-layer ASON services from NE1 to NE3, error code 40497 is displayed, indicating that route computation fails.

Network Topology

The following figure shows the networking diagram.
[Figure: networking diagram of NE1, NE2, and NE3]
Cause Analysis

A fault diagnosis shows that line attenuation between NE1 and NE2 is excessively high and therefore the input optical power of FIU boards on NE2 is lower than the minimum value. As a result, TE links between NE1 and NE2 are interrupted and the ASON services fail to be deployed.

Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Check TE links along the service flow. If a TE link is faulty, go to the next step.
Step 2 Create traditional services from NE1 to NE3. If creating traditional services fails, go to the next step.
Step 3 Check optical power along the planned path. If the input optical power of a board (for example, the FIU board on NE2) is lower than the minimum value, go to the next step.
Step 4 Check the upstream NE of the board (NE1 in this example) and ensure that the optical power at the transmit end is within the permitted range. If no cross-connection is configured on the upstream NE, the optical power at the transmit end may be insufficient. In this case, create two NE-level static cross-connections on the upstream NE (NE1 in this example) as planned so that the input optical power of the board (the FIU board on NE2 in this example) is within the permitted range.
Step 5 After TE links are working properly, configure ASON services.
----End
Conclusion and Suggestion

If line attenuation is excessively high, the input optical power of an FIU board will be lower than the minimum value. As a result, TE links are interrupted and therefore ASON services fail to be deployed. If ASON services fail to be deployed, create end-to-end static services on the planned path. If creating static services fails, check optical power along the path.
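The optical power check in Step 3 can be sketched as a scan along the planned path for the first board whose input power is below its minimum. The input format and all power values below are hypothetical; real minimum input power values must be taken from the board specifications.

```python
def first_low_power_hop(hops):
    """Return (ne, board) of the first hop whose input power is too low.

    hops is a list of (ne, board, input_power_dbm, min_power_dbm) tuples
    ordered in the service direction (assumed input format). Returns None
    if every hop meets its minimum.
    """
    for ne, board, power, minimum in hops:
        if power < minimum:
            return ne, board
    return None


if __name__ == "__main__":
    # Illustrative readings only; not measured or specified values.
    path = [
        ("NE1", "FIU", -5.0, -20.0),
        ("NE2", "FIU", -28.0, -20.0),
        ("NE3", "FIU", -8.0, -20.0),
    ]
    print(first_low_power_hop(path))
```

The NE upstream of the returned hop is the one to inspect in Step 4.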
4.1.10 Case 10: An Error Message Is Displayed When Users Attempt to Create Virtual TE Links

Fault Description

In a scenario where optical NEs and electrical NEs are separated, the following error message is displayed when users attempt to create virtual TE links between optical NEs and electrical NEs.

Network Topology

N/A
Cause Analysis

The possible causes are as follows:
1. No OSPF IP address is configured for the out-band channel that is used for the communication between optical NEs and electrical NEs.
2. ETH control ports on optical NEs and electrical NEs are disabled.
3. The Link Management Protocol (LMP) is enabled for the link where boards involved in the required virtual TE links are located.
Troubleshooting Procedure

Use the following steps to diagnose the fault:
Step 1 Verify that an OSPF IP address is configured for the out-band channel that is used for the communication between optical NEs and electrical NEs.
Step 2 Verify that the OSPF protocol is enabled on the ETH control ports on optical NEs and electrical NEs.
Step 3 Verify that the LMP is disabled for the boards that function as edge points between optical NEs and electrical NEs.
----End
Conclusion and Suggestion

This fault results from incorrect configurations and can be avoided by strict compliance with configuration requirements. Note that the OSPF IP address needs to be planned and the following operations must be completed before a virtual TE link is created:
Step 1 Set the OSPF IP address for the out-band channel that is used for the communication between optical NEs and electrical NEs.
Step 2 Set OSPF Protocol Status of the ETH control ports on optical NEs and electrical NEs to Enabled.
Step 3 Set LMP Protocol Status of the boards that function as edge points between optical NEs and electrical NEs to Disabled.
----End
4.1.11 Case 11: Route Computation Fails After Explicit Resources Are Specified to Create or Optimize an ASON Service Fault Description The following error message is displayed indicating route computation fails after explicit resources are specified to create or optimize an ASON service.
Network Topology The following figure shows the network topology.
Cause Analysis Explicit resources must be specified along the direction from the source to the sink of a service. If explicit resources are specified in the reverse direction, route computation fails. If explicit resources are specified in no particular order, the ASON software has to compute many possible routes, which extends computation time and degrades performance.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Verify that explicit resources are specified in the direction from the source to the sink of services. For example, for an ASON service from SITE_1 to SITE_5, route computation fails if SITE_3 is specified as the first explicit node and SITE_2 as the second explicit node. This is because explicit resources are not specified in the direction from the source to the sink.
----End
Conclusion and Suggestion When specifying explicit resources for a service, specify them along the direction from the source to the sink of the service.
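The ordering rule above can be illustrated with a minimal sketch. It assumes a hypothetical linear route from SITE_1 to SITE_5 in which SITE_2 precedes SITE_3 (the actual topology is given only in the figure), and the function name is illustrative, not part of the ASON software or the NMS.

```python
def explicit_nodes_follow_direction(route, explicit_nodes):
    """Return True if explicit_nodes appear on route in source-to-sink order."""
    position = {node: index for index, node in enumerate(route)}
    indices = []
    for node in explicit_nodes:
        if node not in position:
            return False  # the explicit node is not on this route at all
        indices.append(position[node])
    # Each explicit node must lie strictly after the previous one.
    return all(a < b for a, b in zip(indices, indices[1:]))


# Hypothetical route, listed from the source to the sink.
route = ["SITE_1", "SITE_2", "SITE_3", "SITE_4", "SITE_5"]
print(explicit_nodes_follow_direction(route, ["SITE_2", "SITE_3"]))  # True
print(explicit_nodes_follow_direction(route, ["SITE_3", "SITE_2"]))  # False
```

A check of this kind before submitting the explicit resources catches the reverse-direction case described in Step 1.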
4.2 ASON Service Restoration 4.2.1 Case 1: An ASON Service Is Interrupted Because OPA Fails Fault Description A network uses Huawei's OptiX OSN 6800 devices, for which ASON is enabled. Wavelength 2 carries a silver service. As shown in the network topology, the red line represents the original trail for the silver service and the green line represents the preset restoration trail (SITE_A -> SITE_E -> SITE_B). After a fiber cut occurs between SITE_A and SITE_D, the service is rerouted to the preset restoration trail. The rerouting succeeds, but the service is unavailable.
Network Topology The following figure shows the network topology.
Cause Analysis A fault diagnosis shows that the attenuation between the OTU board and the FIU board at SITE_A is larger than the maximum value. As a result, optical power adjust (OPA) fails and therefore the service is interrupted after being rerouted to the preset restoration trail.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Check for alarms on the trail to which the service is rerouted. If an OPA_FAIL_INDI alarm is generated at a site (SITE_A in this example), go to the next step. Step 2 On the NMS, obtain the optical power, attenuation, and insertion loss of each board, and the permitted adjustment range of each EVOA over the trail (TN12OBU2-TN11RDU9-TN13WSM9-TN12OAU1 in this example). Step 3 Check whether the attenuation of the trail satisfies OPA requirements based on OPA rules. If the attenuation of the trail is beyond the OPA range and only the service is transmitted over the trail, change the attenuation of the VOA on the trail (the VOA of the TN12OAU1 in this example) so that the attenuation of the trail is within the OPA range. Then activate the service.
----End
Conclusion and Suggestion After an ASON service is rerouted to a new trail, OPA automatically adjusts the attenuation of the new trail. If the total insertion loss of all boards at a site is larger than the maximum value, or the attenuation of VAx boards is larger than the maximum value, OPA fails and the service is interrupted. To avoid this fault, follow these suggestions: 1. During deployment, verify the insertion loss of each board over the original trail and preset restoration trail of an ASON service. 2. In the early stage of deployment, set the attenuation of VAx boards over an ASON trail to the minimum value.
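The two failure conditions named above can be checked offline from values exported from the NMS. The following is a rough sketch only: the board labels, loss figures, and limits are hypothetical placeholders, and the real OPA rules and permitted ranges must be taken from the product documentation.

```python
def opa_precheck(insertion_loss_db, max_total_loss_db,
                 voa_attenuation_db, max_voa_attenuation_db):
    """Return a list of reasons why OPA might fail at a site (empty if none)."""
    problems = []
    total = sum(insertion_loss_db.values())
    if total > max_total_loss_db:
        problems.append(
            f"total insertion loss {total:.1f} dB exceeds {max_total_loss_db:.1f} dB")
    if voa_attenuation_db > max_voa_attenuation_db:
        problems.append(
            f"VOA attenuation {voa_attenuation_db:.1f} dB exceeds "
            f"{max_voa_attenuation_db:.1f} dB")
    return problems


# Hypothetical per-board insertion losses (dB) at one site.
losses = {"board_a": 5.0, "board_b": 8.0}
for problem in opa_precheck(losses, 12.0, 11.0, 10.0):
    print(problem)
```

An empty result does not guarantee that OPA succeeds; it only rules out the two causes listed in this case.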
4.2.2 Case 2: An ASON Service Is Interrupted Because Protection Switching Fails After a Second Fiber Cut Fault Description A network uses Huawei's OptiX OSN 6800 devices, for which optical-layer ASON is enabled. A 10G optical service is planned for the link from SITE_A to SITE_C. According to the network plan, the 10G service is a silver service configured with two associated trails. For the service, SNC/I protection (in revertive mode) is configured on the client side. As shown in the network topology, the red line represents the working trail, the blue line represents the protection trail, and the green line represents the trail to which the service is rerouted after a fiber cut. After a second fiber cut, protection switching fails and the service is interrupted.
Network Topology The following figure shows the network topology.
Cause Analysis A fault diagnosis shows that the SNCP protection type is incorrect (SNC/N should be configured instead of SNC/I). As a result, protection switching fails after a second fiber cut.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Analyze the fault and check the two fiber cuts. In this example, the first fiber cut occurs on the link between SITE_1 and SITE_B, triggering SNCP protection switching. The service is successfully rerouted to a new trail (the green line) and this trail becomes the working trail. The second fiber cut occurs on the link between SITE_A and SITE_C. No other trails are available and therefore rerouting fails, further leading to a protection switching failure on the client side. Consequently, the service is interrupted. Step 2 Pinpoint the cause for the protection switching failure. In this example, after the first fiber cut occurs, the service is rerouted and traverses SITE_C which is an electrical regeneration site. SNC/I protection is configured for the service. Note that the regeneration site regenerates SM overheads and therefore SNC/I protection fails to detect SM overheads. Consequently, protection switching fails to occur at SITE_B. ----End
Conclusion and Suggestion This fault results from an incorrect SNCP protection type. SNC/I protection is for inherent monitoring and the protection switching depends on the SM overhead status; SNC/N protection is for non-intrusive monitoring and the protection switching depends on the SM, TCM, or PM overhead status. On a network where sites are enabled with optical-layer ASON, the working and protection trails of a service will change after the service is rerouted. If an electrical regeneration site is configured on the trail to which the service may be rerouted, configure SNC/N protection for the service so that the service is under protection after it is rerouted to the trail.
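As a planning aid, the recommendation above can be expressed as a simple check over the trails a service may be rerouted to. This is a sketch under assumptions: the trail lists and the set of electrical regeneration sites are hypothetical inputs that a planner would supply, and the function is not an NMS feature.

```python
def needs_snc_n(possible_trails, regeneration_sites):
    """Return True if any trail the service may use traverses a regeneration site."""
    regeneration_sites = set(regeneration_sites)
    return any(regeneration_sites.intersection(trail) for trail in possible_trails)


# Hypothetical candidate trails for a service from SITE_A to SITE_B.
trails = [["SITE_A", "SITE_B"], ["SITE_A", "SITE_C", "SITE_B"]]
print(needs_snc_n(trails, {"SITE_C"}))  # True: configure SNC/N
print(needs_snc_n(trails, set()))       # False
```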
4.2.3 Case 3: An ASON Service Is Interrupted After Being Rerouted Because of Incorrect Fiber Connections Fault Description A network uses Huawei's OptiX OSN 6800 devices, for which optical-layer ASON is enabled. On the network, wavelength 2 carries an optical ASON service on the link SITE_A -> SITE_D -> SITE_B (the red line in the network topology). After a fiber cut occurs on the link between SITE_A and SITE_D, the service is rerouted to the link SITE_A -> SITE_E -> SITE_B (the green line) but the service is interrupted after the rerouting.
Network Topology The following figure shows the network topology.
Cause Analysis A fault diagnosis shows that this fault results from an incorrect fiber connection on the network. In this example, a logical fiber connection is configured between port 1 on the OSC board in slot 1 and the FIU board on the green line, but port 1 on the OSC board is physically connected to the FIU board on the blue line. As a result, a cross-connection is created from SITE_A to the FIU board on the blue line instead of the FIU board on the green line. Consequently, the source and sink of the service have different cross-connection information and the service is interrupted after being rerouted.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Check for abnormal alarms on the trail to which the service is rerouted. (In this example, the OTU board on the green line reports an R_LOS alarm, but the FIU board does not.) If an abnormal alarm is generated, go to the next step. Step 2 Check cross-connections at each site on the trail. In this example, optical cross-connections configured at SITE_A do not match the physical fiber connections. If a cross-connection is incorrectly configured, go to the next step. Step 3 Check the ASON control link topology. If the topology is correct, go to the next step. Step 4 Compare the physical fiber connections with the logical fiber connections for each site. In this example, the physical and logical fiber connections of FIU boards are inconsistent. After the fiber connections are corrected, the fault is rectified. ----End
Conclusion and Suggestion The ASON software selects a trail for an ASON service based on logical fiber connections. If physical and logical fiber connections are inconsistent, the service may be interrupted. This fault is difficult to locate. To avoid this fault, follow these suggestions: 1. During network construction or maintenance, verify that logical and physical fiber connections on the network are consistent. 2. Before configuring a preset restoration trail for an ASON service, ensure that the configurations for the trail are correct.
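The comparison of logical and physical fiber connections can be sketched as a dictionary comparison. The port and board labels below are hypothetical and only mirror the example in this case; in practice the logical connections would come from the NMS and the physical connections from a site survey.

```python
def fiber_mismatches(logical, physical):
    """Return (port, logical_peer, physical_peer) for every inconsistent connection."""
    mismatches = []
    for port, logical_peer in sorted(logical.items()):
        physical_peer = physical.get(port)  # None if the port is not cabled
        if physical_peer != logical_peer:
            mismatches.append((port, logical_peer, physical_peer))
    return mismatches


# Hypothetical records for SITE_A.
logical = {"slot1-OSC-port1": "FIU-green", "slot1-OSC-port2": "FIU-blue"}
physical = {"slot1-OSC-port1": "FIU-blue", "slot1-OSC-port2": "FIU-blue"}
print(fiber_mismatches(logical, physical))
```

Only the first port is reported, which corresponds to the miswired OSC port described in the cause analysis.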
4.2.4 Case 4: An ASON OCh Trail in a Slave Subrack Is Interrupted but Not Rerouted After the Slave Subrack Is Powered Off Fault Description As shown in the following figure, a CPW_OCH_SER_INT alarm is generated on an ASON OCh trail, indicating service interruption occurs.
Network Topology N/A
Cause Analysis Rerouting of an ASON OCh trail is triggered upon the following conditions: an FIU board on the trail is offline or reports a MUT_LOS alarm. After a slave subrack is powered off, the FIU board in the slave subrack cannot report alarms to the main control board in the master subrack, failing to trigger service rerouting.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 In NE Panel on the NMS, check whether all boards in a slave subrack of an NE on the OCh trail are offline. If all boards in a slave subrack are offline, go to the next step. Step 2 Check the power cable connections and network cable connections of the NE. If there is a cable connection error, correct it. ----End
Conclusion and Suggestion If an ASON service is interrupted because of a rerouting failure, first check for abnormal alarms on the network. Then check whether all boards in a slave subrack that the service traverses are offline in NE Panel. Lastly, pinpoint the root cause based on the abnormal alarms.
4.2.5 Case 5: An ASON Service Is Frequently Rerouted Among Multiple Trails Fault Description An ASON service is frequently rerouted among multiple trails although no fiber cut occurs.
Network Topology The following figure shows four points for monitoring channel alarms:
Point A monitors client-side alarms along the direction of the signal flow.
Point D monitors client-side alarms in the reverse direction.
Point B monitors WDM-side alarms along the direction of the signal flow.
Point C monitors WDM-side alarms in the reverse direction.
Cause Analysis When multiple service trails have channel-level faults, an ASON service is rerouted from its original trail where channel alarms are generated to a new trail. Then the alarms on the original trail are cleared. Channel alarms, however, are generated on the new trail, triggering service rerouting the second time. The service may be rerouted to its original trail (because the original trail is optimal). When this occurs, service rerouting is triggered again. Then the service is frequently rerouted among multiple trails.
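The oscillation described above can be reproduced with a toy model. It is a deliberately simplified sketch, not the ASON rerouting algorithm: it assumes that a channel fault is visible only on the trail that currently carries the service and that the software always picks the most preferred other trail. The trail names are hypothetical.

```python
def simulate_rerouting(trails_by_preference, faulty_trails, rounds):
    """Return the sequence of trails the service occupies over the given rounds."""
    current = trails_by_preference[0]
    history = [current]
    for _ in range(rounds):
        if current not in faulty_trails:
            break  # no channel alarm on the current trail, so the service stays
        # The alarm triggers rerouting to the best trail other than the current one.
        current = next(t for t in trails_by_preference if t != current)
        history.append(current)
    return history


# Both trails have channel-level faults: the service bounces between them.
print(simulate_rerouting(["original", "restoration"],
                         {"original", "restoration"}, 4))
# Only the original trail is faulty: the service settles on the restoration trail.
print(simulate_rerouting(["original", "restoration"], {"original"}, 4))
```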
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Lock the rerouting function of the ASON service. Step 2 Check for traditional alarms on the original trail of the ASON service and pinpoint the root cause. Step 3 Check for traditional alarms on the preset restoration trail (for automatic routing) of the ASON service and pinpoint the root cause. ----End
Conclusion and Suggestion This case is a typical channel fault troubleshooting case. If no fiber cut occurs and an ASON service is frequently rerouted among multiple trails, handle the fault by referring to the channel alarm troubleshooting guide.
4.2.6 Case 6: An Optical ASON Service Fails to Be Automatically Restored in Case of a Wavelength-Level Fault Fault Description After a pigtail or board (an optical, OTU, or regeneration board) fault occurs at a site, an optical ASON service over a specific wavelength is interrupted and cannot be automatically recovered. In addition, the alarm indicating service interruption persists.
Network Topology N/A
Cause Analysis In case of a wavelength-level fault, such as a pigtail or board (an optical, OTU, or regeneration board) fault, the ASON software monitors the optical layer from two aspects: lines (OMS TE links) and OTU boards. See the following figure.
The ASON software cannot detect wavelength-level faults and therefore does not initiate the automatic service recovery process.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Verify that a wavelength-level fault occurs and an ASON service is not automatically recovered. Step 2 Locate the fault by checking performance and alarms on boards on the trail that the service traverses. Step 3 Clear traditional alarms. ----End
Conclusion and Suggestion WDM products cannot detect wavelength-level faults and therefore the ASON software does not initiate the automatic service recovery process upon a wavelength-level fault.
4.2.7 Case 7: An ASON Service Enabled with Scheduled Reversion Fails to Be Reverted to Its Original Trail After the Scheduled Reversion Time Elapses Fault Description After the scheduled reversion time is specified for a rerouted ASON service that is enabled with scheduled reversion, the service is not reverted to the original trail after the original trail is restored.
Network Topology N/A
Cause Analysis The ASON software attempts to revert a rerouted ASON service enabled with scheduled reversion to the original trail at the scheduled reversion time. If the original trail fails to be restored within the time, the ASON software no longer attempts to revert the service to the original trail.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Verify that the original trail is restored. Step 2 Specify the scheduled reversion time again. After the specified time elapses, the service is reverted to the original trail. ----End
Conclusion and Suggestion If the ASON software fails to revert a service to the original trail at the scheduled reversion time, it does not revert the service after the time elapses. At this point, users need to specify the scheduled reversion time again.
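The one-shot behavior of scheduled reversion can be summarized in a few lines. This sketch only models the rule stated above; the function name and return strings are illustrative, and the timestamps are hypothetical.

```python
from datetime import datetime


def reversion_outcome(scheduled_time, original_restored_at):
    """Model a single reversion attempt made at scheduled_time."""
    restored_in_time = (original_restored_at is not None
                        and original_restored_at <= scheduled_time)
    # The attempt is not repeated if the original trail is restored later.
    return "reverted" if restored_in_time else "reschedule required"


scheduled = datetime(2014, 8, 26, 2, 0)
print(reversion_outcome(scheduled, datetime(2014, 8, 26, 1, 30)))  # reverted
print(reversion_outcome(scheduled, datetime(2014, 8, 26, 3, 0)))   # reschedule required
```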
4.2.8 Case 8: Route Computation Fails When an ASON OCh Service Traverses a Regeneration Board Fault Description The following error message is displayed when an ASON OCh service traverses a regeneration board during service creation, optimization, or rerouting.
Network Topology The following figure shows the network topology.
Cause Analysis ASON OCh services have the following requirements on regeneration boards:
1. Regeneration boards must be configured only on optical NEs when optical NEs and electrical NEs are separated.
2. Regeneration boards are classified into two types: unidirectional regeneration boards (such as LSXR and LSXLR) and bidirectional regeneration boards (such as ND2). Unidirectional regeneration boards in two directions of a service trail must be configured in paired slots; otherwise, the ASON software fails to identify the two regeneration boards as a pair and therefore creates incorrect cross-connections or fails to create cross-connections.
3. The rate of a regeneration board and the type of the optical module on the regeneration board must match those on boards adding or dropping services; otherwise, services are unavailable. This is because the ASON software selects only regeneration boards whose rate and optical module type match the rates and optical module types of boards that add or drop services.
4. The FEC type of regeneration boards must be the same as the FEC type of boards adding or dropping services. If the two FEC types are different, services may be unavailable.
5. The ODUk rate of a regeneration board must be the same as the ODUk rate of boards adding or dropping services. If the two rates are different, for example, a regeneration board has a lower rate (10.7G) while a board adding or dropping services has a higher rate (11.1G), services are unavailable.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Verify that the regeneration board for the ASON OCh service is configured on an optical NE. Step 2 If unidirectional regeneration boards are configured on the ASON OCh trail, verify that the unidirectional regeneration boards in the two directions of the trail are configured in paired slots. Step 3 Check whether the rate of the regeneration board and the type of the optical module on the board match those of boards adding or dropping services. If the rate and the optical module type of the regeneration board do not match those of boards adding or dropping services, replace the regeneration board. Step 4 Check whether the FEC type of the regeneration board is the same as that of boards adding or dropping services. If they are different, specify the same FEC type. Step 5 Check whether the ODUk rate of the regeneration board is the same as that of boards adding or dropping services. If they are different, specify the same ODUk rate. ----End
Conclusion and Suggestion Regeneration boards are planned in the cabinet diagram and fiber connection diagram that are output in the network plan and design phases. Strict compliance with the diagrams during operations can avoid the fault.
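The five requirements checked in the troubleshooting procedure can be collected into one planning-time check. This is a minimal sketch with hypothetical field names and attribute values; it is not an NMS interface, and the attribute values would have to be read from the board configuration.

```python
from dataclasses import dataclass


@dataclass
class Board:
    line_rate: str       # hypothetical label, for example "10G"
    optical_module: str  # hypothetical label for the optical module type
    fec_type: str        # hypothetical label for the FEC type
    oduk_rate: str       # for example "10.7G" or "11.1G"


def regeneration_problems(regen, add_drop, on_optical_ne=True,
                          unidirectional=False, in_paired_slots=True):
    """Return the requirements that the regeneration board violates."""
    problems = []
    if not on_optical_ne:
        problems.append("regeneration board is not on an optical NE")
    if unidirectional and not in_paired_slots:
        problems.append("unidirectional boards are not in paired slots")
    if regen.line_rate != add_drop.line_rate:
        problems.append("rate mismatch")
    if regen.optical_module != add_drop.optical_module:
        problems.append("optical module type mismatch")
    if regen.fec_type != add_drop.fec_type:
        problems.append("FEC type mismatch")
    if regen.oduk_rate != add_drop.oduk_rate:
        problems.append("ODUk rate mismatch")
    return problems


regen = Board("10G", "module_x", "fec_a", "10.7G")
add_drop = Board("10G", "module_x", "fec_a", "11.1G")
print(regeneration_problems(regen, add_drop))  # ['ODUk rate mismatch']
```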
4.2.9 Case 9: An ASON OCh Service Is Interrupted but Not Rerouted Fault Description An OCH_SER_INT alarm is generated on an ASON OCh trail, indicating a service interruption.
Network Topology The following figure shows the network topology.
Cause Analysis An ASON OCh service is rerouted when an FIU board is offline or reports a MUT_LOS alarm. When an ASON OCh service is interrupted but not rerouted, and the cause cannot be found using common methods, check optical power on the trail. The service interruption may result from abnormal optical power on the trail.
Troubleshooting Procedure Use the following steps to diagnose the fault: Step 1 Choose Configuration > WDM ASON > ASON Trail Management from the main menu. In the WDM ASON Trail Management window that is displayed, right-click the interrupted ASON OCh service and choose Query Relevant Optical Power from the shortcut menu. Then export the optical port information of the trail from the NMS.
Step 2 Among the optical power information, check whether there is a board whose Input Power or Output Power is –60. If such data is found, no light is input to or output by the board, which is abnormal.
Step 3 Locate the abnormal board and node, and check the attenuation of boards on the node along the signal flow. Check for the following issues that may result in the fault: 1. The attenuation of VOAs on OA boards is excessively high. 2. The attenuation of the port on a multiplexer board (RMU9 or WSM9) is excessively high. 3. The attenuation of EVOA boards is excessively high. ----End
Conclusion and Suggestion This fault occurs on a single wavelength. When such a fault occurs, locate the fault along the signal flow by checking optical power.
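Step 2 of the procedure can be applied to the exported optical power data with a short script. The sketch below assumes the export has already been parsed into records with hypothetical keys (board, input_power, output_power); the actual column names and file format of the NMS export may differ.

```python
NO_LIGHT = -60.0  # value that the procedure above treats as "no light"


def ports_without_light(rows):
    """Return (board, field) pairs whose power reading indicates no light."""
    findings = []
    for row in rows:
        for field in ("input_power", "output_power"):
            if float(row[field]) <= NO_LIGHT:
                findings.append((row["board"], field))
    return findings


# Hypothetical records listed along the signal flow.
rows = [
    {"board": "board_1", "input_power": -3.2, "output_power": 1.0},
    {"board": "board_2", "input_power": -5.0, "output_power": -60.0},
    {"board": "board_3", "input_power": -60.0, "output_power": -60.0},
]
print(ports_without_light(rows))
```

With the records ordered along the signal flow, the first finding marks the point where light is lost and the attenuation checks in Step 3 should start.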
5 FAQs
5.1 Operations on the NMS 5.1.1 How to Distinguish Between ASON Services and Traditional Services on the NMS? Two methods are available:
1. Choose Configuration > WDM ASON > ASON Trail Management from the main menu. The services that are displayed are ASON services.
2. Choose Service > WDM Trail > Manage WDM Trail from the main menu. Among the trails that are displayed, trails whose WDM ASON Trail is Yes are ASON services; trails whose WDM ASON Trail is No are traditional services.
5.1.2 How to Identify the First and Last Nodes of an ASON Service? Perform the following procedure: Step 1 Choose Configuration > WDM ASON > ASON Trail Management from the main menu. A WDM ASON Trail Management window is displayed.
Step 2 In the displayed service list, NEs in the Source column are the first nodes of services and NEs in the Sink column are the last nodes. Alternatively, you can obtain the first and last nodes of a service from the Route View field. In this field, the NE whose arrow points upward is the first node; the NE whose arrow points downward is the last node.
----End
5.1.3 How to Identify the Original Trail and Preset Restoration Trail of an ASON Service? Perform the following procedure: Step 1 Choose Configuration > WDM ASON > ASON Trail Management from the main menu. A WDM ASON Trail Management window is displayed.
Step 2 In the red box shown in the following figure, the Actual Route, Original Route, Associated Route, Preset Restoration Trail 1, and Preset Restoration Trail 2 tabs are displayed.
----End
5.1.4 How to Manually Optimize an ASON Service? Navigate to the Optimize Route window. In the right pane of the window, right-click the NE that needs to be set as an explicit or excluded node and choose the related item from the shortcut menu. You can set the NE as an explicit node and the link where the NE is located as an explicit link; you can also set the NE as the excluded node and the link where the NE is located as an excluded link. Verify route information and click Apply.
5.1.5 How to Manually Optimize ASON Services in Batches? Navigate to the WDM ASON Trail Management window. Select the target ASON services and click Maintenance. Then choose Optimize Route from the drop-down list.
5.1.6 How to Change a Wavelength to Optimize Trails? Navigate to the Optimize Route window. In the right pane of the window, right-click an explicit node of the service and choose Set Explicit Link > Browse from the shortcut menu. In the dialog box that is displayed, select a required wavelength. In the Optimize Route window, click Pre-Calculate. If the pre-calculation results meet requirements, click Apply.
5.1.7 How to Obtain Fiber Connections of Boards that an ASON Service Traverses? Navigate to the WDM ASON Trail Management window. Select the target service and refresh the related signal flow diagram. Boards and fiber connections are displayed in the diagram.
5.1.8 How to Quickly Locate the Board Where an OCH_SER_INT Alarm Is Generated? The current alarm mechanism cannot help you quickly locate the board where an OCH_SER_INT alarm is generated. You can perform the following operations to locate the board: In the WDM ASON Trail Management window, search for the service where an OCH_SER_INT alarm is generated. Right-click the service, and choose Query Relevant Client Layer Trails from the shortcut menu. In the Manage WDM Trail window that is displayed, locate the malfunctioning board according to the signal flow diagram.
5.1.9 How to Quickly Query the Current Trail or Preset Restoration Trail of an ASON Service that Traverses a Specific NE or Board? On the NMS, you can perform the following operations to query the current trails or original trails of all ASON services that traverse a specific NE or board: Choose Configuration > WDM ASON > WDM ASON Trail Management from the main menu. In the WDM ASON Trail Management window, click Filter. In the Route Type area in the window that is displayed, set the related parameters.
On the NMS, however, you cannot query the preset restoration trails of ASON services that traverse a specific NE or board.
5.1.10 How to Check Whether a Preset Restoration Trail Is Available? Two methods are available:
Directly viewing the TE links that the preset restoration trail traverses: In the WDM ASON Trail Management window, select a service to view the preset restoration trail, and check whether all the TE links that the preset restoration trail traverses are available. The preset restoration trail is unavailable when a TE link is interrupted. If no TE link is interrupted, the preset restoration trail is available.
Using the service optimization function: In the WDM ASON Trail Management window, right-click a service and choose Optimize from the shortcut menu. Specify the preset restoration trail as the explicit trail, and pre-compute service optimization. If the pre-computation succeeds, the preset restoration trail is available; otherwise, the preset restoration trail is unavailable.
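The first method reduces to a simple rule: the preset restoration trail is available only if none of its TE links is interrupted. A minimal sketch, with hypothetical site names and TE links represented as pairs of sites:

```python
def preset_trail_available(trail_links, interrupted_links):
    """Return True if no TE link on the preset restoration trail is interrupted."""
    interrupted = set(interrupted_links)
    return not any(link in interrupted for link in trail_links)


# Hypothetical preset restoration trail SITE_A -> SITE_E -> SITE_B.
trail = [("SITE_A", "SITE_E"), ("SITE_E", "SITE_B")]
print(preset_trail_available(trail, []))                      # True
print(preset_trail_available(trail, [("SITE_E", "SITE_B")]))  # False
```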
5.1.11 How to Check Whether an Optical Cross-Connection Is Successfully Created for an ASON Service? Two methods are available:
Check whether the trail of the ASON service is correct in the WDM ASON Trail Management window
Choose Configuration > Optical Cross-Connection Management in the navigation tree in the NE Explorer of the NMS
5.1.12 How to Quickly Restore an Interrupted Service? If rerouting succeeds but a service is still unavailable, an OTUk_LOF or R_LOS alarm is generated on an OTU board. The possible cause is that the system performance of the new trail is poor or the new trail is not commissioned. In this scenario, groom the interrupted service to another available trail by manually optimizing the service or reverting the service to the original trail. If the service is carried over two associated silver LSPs, you can switch the service to the other LSP. If you cannot restore the service on the NMS, you can disable the main optical path and the laser at the transmit end of the SC2 board. You can also remove the pigtail from the line side of the FIU board so that a MUT_LOS alarm is reported on the FIU board. The MUT_LOS alarm triggers ASON service rerouting and then you can select a new service trail.
5.1.13 How to Quickly Create Fiber Connections Between Sites? In an ASON domain, the control plane implements automatic link discovery and resource management for fibers between sites. After automatic discovery is completed, you can create fiber connections as follows: Choose Configuration > WDM ASON > WDM TE Link Management from the main menu. In the WDM TE Link Management window, choose Maintenance > Create Fiber.
5.2 Configuration Rules 5.2.1 What Are Common Attributes and Recommended Configurations for ASON Services? The common attributes and recommended configurations for ASON services are listed as follows:
Service revertive attribute: scheduled reversion
Service rerouting policy attribute: best-effort reuse or best-effort separation
Rerouting policy attribute of diamond services: permanent diamond service
5.2.2 What Are the Risks if ODU0, ODU1, and ODU2 ASON Services Are Concurrently Configured? In automatic mode, network resource distribution policies become complex when multi-granularity services (that is, ODUk, where k can be 0, 1, or 2) are configured. Small-granularity services may discontinuously occupy bandwidth for large-granularity services, which affects the survivability of large-granularity services.
5.2.3 What Are the Basic Rules for Configuring Preset Restoration Trails? The basic rules for configuring preset restoration trails are as follows: 1. Preset restoration trails must be planned to ensure that end-to-end service performance such as OSNR satisfies requirements. 2. If possible, you are advised to configure two preset restoration trails for each service.
5.2.4 What Are the Recommended Configurations for the Preset Restoration Trails and Revertive Attributes of SDH/OTN/WDM ASON Services? You must configure preset restoration trails and enable the automatic reversion function for SDH/OTN ASON services, and configure preset restoration trails and enable the scheduled reversion function for WDM ASON services. For an ASON revertive service, its original trail must be reserved after the service is rerouted, so that the service can be successfully reverted to the original trail.
1. For electrical cross-connections, configure SNCP with the dual feeding and selective receiving function to ensure that ASON services are automatically reverted and the service performance is reliable.
2. For optical cross-connections on optical paths, the dual feeding and selective receiving function is restricted. To create cross-connections at service adding or dropping sites or regeneration sites, you must delete the original cross-connections. In addition, optical-layer services occupy large bandwidth and service trails cannot be frequently switched until the network is stable. Therefore, manual reversion is recommended for optical-layer services.
5.2.5 What Are the Restrictions on Regeneration Boards When Optical NEs and Electrical NEs Are Separated and Why?
When optical NEs and electrical NEs are separated, regeneration boards must be configured only on optical NEs. This is because services traversing regeneration boards are under end-to-end management. The control route on an end-to-end service trail cannot traverse the same NE twice, because doing so forms a loop. For example, the route Optical NE1 --> Electrical NE2 --> Optical NE1 has a loop. See the figure below.
5.2.6 How to Ensure the Rerouting Function When No OA Board Is Configured Between an FIU Board and a WSS Board? OPA does not apply to this scenario. You need to manually adjust optical power. If no OA board is configured between an FIU and a WSS board, the OPA function must be disabled. Therefore, you must manually adjust the VOA of the WSS board in advance to ensure that services are not interrupted after rerouting.
5.2.7 Why Are TN52SCC Boards Instead of TN51SCC or TN11SCC Boards Recommended for ASON NEs? The ASON software provides the dynamic switching function. With this function, resources and services are changed in real time and are updated dynamically, and operation of the ASON software occupies system resources. Therefore, SCC boards with higher performance, such as TN52SCC or TNK2SCC boards, are required to ensure proper running of the ASON software.
5.2.8 What Are the Rules for Configuring Node IDs for ASON NEs? Before the ASON feature is enabled on an NE, you must set the node ID for the NE because the node ID is the unique identifier of an NE on the control plane. Comply with the following rules when configuring the node IDs of ASON NEs:
Ensure that the node ID is unique in one ASON domain.
Ensure that the node ID of an ASON NE is in the format of x.x.x.x (x ranges from 1 to 254), and cannot be in the same network segment as the IP address and OSPF IP address of the NE.
Configure the node ID for an NE before enabling the ASON feature on the NE.
Do not change node IDs after a network is in use. If you need to change node IDs, ensure that there are no ASON services.
The recommended network segments for node IDs are as follows:
172.16.0.0 to 172.31.255.255 (172.16/12 prefix)
10.0.0.0 to 10.255.255.255 (10/8 prefix)
192.168.0.0 to 192.168.255.255 (192.168/16 prefix)
Node IDs and IP addresses must be in the same format, but a node ID cannot be 0.0.0.0, 1.2.3.4, or 255.255.255.255.
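The node ID rules above can be sketched as a small pre-check, for example to screen a planning spreadsheet before the IDs are entered on the NMS. This is an illustrative sketch only, not a Huawei tool: the function name, its parameters, and the messages are hypothetical, and the default `prefix_len=16` is an assumption because the text does not state which mask defines a "network segment". The per-octet range of 1 to 254 is applied literally as stated in the rules.

```python
import ipaddress
import re

# Values the text explicitly forbids as node IDs.
FORBIDDEN_NODE_IDS = {"0.0.0.0", "1.2.3.4", "255.255.255.255"}


def check_node_id(node_id, ne_addresses, used_node_ids, prefix_len=16):
    """Return a list of rule violations for a candidate node ID.

    ne_addresses: the NE IP address and OSPF IP address, as strings.
    used_node_ids: node IDs already assigned in the same ASON domain.
    prefix_len: ASSUMPTION, the mask length that defines a network segment.
    """
    if not re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", node_id, flags=re.ASCII):
        return ["node ID is not in x.x.x.x format"]
    octets = [int(part) for part in node_id.split(".")]
    if not all(1 <= octet <= 254 for octet in octets):
        # This also rejects 0.0.0.0 and 255.255.255.255.
        return ["each x must be in the range 1 to 254"]

    normalized = ".".join(str(octet) for octet in octets)
    problems = []
    if normalized in FORBIDDEN_NODE_IDS:
        problems.append("node ID is a forbidden value")
    if normalized in used_node_ids:
        problems.append("node ID is already used in this ASON domain")

    # The node ID must not share a segment with the NE or OSPF IP address.
    segment = ipaddress.ip_network(f"{normalized}/{prefix_len}", strict=False)
    for address in ne_addresses:
        if ipaddress.ip_address(address) in segment:
            problems.append(f"node ID shares a network segment with {address}")
    return problems
```

An empty list means that the candidate passed every check the sketch covers. It does not replace the rule that node IDs must be set before ASON is enabled and must not be changed while ASON services exist.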
5.2.9 Why Must the Node ID and IP Address of an NE Be in Different Network Segments? A node ID of an NE is the IP address configured for control plane protocols such as the OSPF. Similar to the IP address of the NE, the node ID must also be unique on the NE. The node ID and the IP address of the NE cannot be configured in a same network segment; otherwise, the control plane protocols may run abnormally.
5.2.10 Why Does the LMP or OSPF Protocol Need to Be Disabled on Electrical Links on OTU Boards Adding or Dropping WDM ASON Services? For ASON services, logical port information is generated by default on the links of the FIU or OTU board, and the link verification function is enabled by default. Therefore, on a network for which only optical-layer ASON is enabled, the LMP or OSPF protocol needs to be disabled on electrical links on boards adding or dropping services. This helps minimize the impact on the network caused by ODUk link verification and flooding. You can disable the LMP protocol as follows: Choose ASON > Advanced Maintenance in the navigation tree, click the LMP Protocol Status tab, and set LMP Protocol Status to Disabled. You can disable the OSPF protocol in a similar way. For details, see the figures below.
You are advised to disable the electrical-layer ASON feature on the network for which only optical-layer ASON is enabled. The procedure is as follows: In the WDM ASON Topology Management window, click the Enable Electrical-Layer ASON Feature tab. On the tab, select No from the Enable Electrical-Layer ASON Feature drop-down list for the required NE.
5.2.11 What Are the Application Scenarios and Configuration Method for Resource Reservation?
Configuration Method
You can configure resource reservation as follows: In the NE Explorer, select the required board, and choose ASON > Resource Reservation Management in the navigation tree. In the window that is displayed, set Resource Reservation to Disabled.
Application Scenarios
Scenario 1: Some wavelengths on a link are configured for carrying static services, but no static services are added currently. The wavelengths must be reserved to prevent them from being used by ASON services.
Scenario 2: When ASON is enabled in multiple domains such as SDH, OTN, and WDM ASON domains, some resources need to be reserved to ensure that resources used by ASON services at multiple layers are independent of each other and therefore avoid association between ASON services at multiple layers.
5.2.12 Why Does the LMP Protocol Need to Be Disabled for Optical Ports on Tributary Boards? The LMP protocol applies to adjacent nodes to manage fiber connections between the nodes and achieve automatic discovery and management of link resources. Tributary boards are used on the client side, and do not require the automatic discovery function. To avoid unnecessary link verification, the LMP protocol at optical ports on tributary boards needs to be disabled.
5.2.13 How to Disable the LMP Protocol for the Optical Ports that Are Not Used by ASON Services? You can disable the LMP protocol to avoid unnecessary link verification on an ASON network. Perform the following operations to disable the LMP protocol: In the NE Explorer, choose ASON > Advanced Maintenance in the navigation tree, click the LMP Protocol Status tab, and select Disabled from the LMP Protocol Status drop-down list.
NOTE
On an optical-layer ASON network, you are advised to enable the LMP protocol for optical ports on the FIU board and disable the LMP protocol for optical ports on other boards. On an electrical-layer ASON network, you are advised to enable the LMP protocol for optical ports on the OTU boards adding or dropping ASON services and disable the LMP protocol for optical ports on other boards.
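The recommendation in this NOTE can be expressed as a simple lookup, which may help when auditing port settings exported from the NMS. This is a minimal sketch: the function name, the string values for the layer and board type, and the `carries_ason_service` flag are hypothetical and do not correspond to any NMS interface.

```python
def lmp_should_be_enabled(ason_layer, board_type, carries_ason_service=False):
    """Return True if the NOTE recommends enabling LMP on the board's optical ports.

    ason_layer: "optical" or "electrical" (illustrative labels).
    board_type: a board family label such as "FIU", "OTU", or "TRIBUTARY".
    carries_ason_service: whether an OTU board adds or drops ASON services.
    """
    if ason_layer == "optical":
        # Optical-layer ASON: enable LMP only on FIU optical ports.
        return board_type == "FIU"
    if ason_layer == "electrical":
        # Electrical-layer ASON: enable LMP only on OTU boards that add or drop ASON services.
        return board_type == "OTU" and carries_ason_service
    raise ValueError(f"unknown ASON layer: {ason_layer!r}")
```

Ports for which the function returns False are candidates for setting LMP Protocol Status to Disabled, following the procedure in 5.2.13.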
5.2.14 How to Disable the OSPF Protocol for the Optical Ports that Are Not Used by ASON Services? On a network where only optical-layer ASON is applied, you can disable the OSPF protocol for optical ports of boards at the electrical layer to lighten NE load. In other words, you can disable the OSPF protocol for the optical ports on OTU boards. Perform the following operations to disable the OSPF protocol: In the NE Explorer, choose ASON > Advanced Maintenance, click the OSPF Protocol Status tab, and select Disabled from the OSPF Protocol Status drop-down list.
5.2.15 How to Split a Large DCN Subnet into Smaller DCN Subnets?
There is a limit on the number of NEs (physical NEs, not logical NEs) on a DCN subnet. The limits in different situations are as follows:
If all the NEs on a DCN subnet are OptiX OSN 8800 V100R002, OptiX OSN 6800 V100R004C04, or later versions, a maximum of 200 NEs is allowed when the ASON protocol is disabled and a maximum of 100 NEs is allowed when the ASON protocol is enabled for all the NEs on the DCN subnet.
When some NEs on a DCN subnet are OptiX OSN 8800 V100R001, OptiX OSN 6800 V100R004C02, or earlier versions, a maximum of 100 NEs is allowed, regardless of whether the ASON protocol is enabled.
Multiple GNEs can be configured on a DCN subnet. The GNEs share the traffic between non-GNEs and the NMS. The non-GNEs under each GNE are specified manually by configuring that GNE as their primary GNE. A GNE can connect to at most 60 non-GNEs (50 non-GNEs recommended). If there are more than 60 non-GNEs, another GNE must be configured. The non-GNEs refer to equivalent NEs.
When the number of NEs on a DCN subnet exceeds the upper limit, the DCN subnet must be split into smaller DCN subnets. Two methods are available for splitting a DCN subnet: horizontal split and vertical split.
In the horizontal split method, a DCN subnet is split based on the service domains to which NEs belong. NEs in a WDM domain can be grouped into a subnet, and NEs in an SDH domain can be grouped into a subnet.
In the vertical split method, a DCN subnet is split based on the physical locations of NEs or based on the network topology, regardless of whether NEs belong to the same service domain.
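The NE and GNE limits in this section can be sketched as a quick planning calculation. The function and parameter names are hypothetical. The sketch also assumes that a subnet with ASON enabled on any NE is held to the 100-NE limit, which is a conservative reading: the text states that limit only for the case in which ASON is enabled on all NEs.

```python
import math

MAX_NON_GNES_PER_GNE = 60          # hard limit stated in the text
RECOMMENDED_NON_GNES_PER_GNE = 50  # recommended value stated in the text


def max_nes_per_dcn_subnet(all_nes_new_version, ason_enabled):
    """Return the NE limit for one DCN subnet.

    all_nes_new_version: True if every NE runs OptiX OSN 8800 V100R002,
        OptiX OSN 6800 V100R004C04, or a later version.
    ason_enabled: True if the ASON protocol is enabled (ASSUMPTION: treated
        conservatively, so partial enablement also yields the lower limit).
    """
    if all_nes_new_version and not ason_enabled:
        return 200
    return 100


def needs_split(ne_count, all_nes_new_version, ason_enabled):
    """Return True if the subnet exceeds its NE limit and must be split."""
    return ne_count > max_nes_per_dcn_subnet(all_nes_new_version, ason_enabled)


def gnes_needed(non_gne_count, per_gne=RECOMMENDED_NON_GNES_PER_GNE):
    """Return the minimum number of GNEs for the given number of non-GNEs."""
    if not 1 <= per_gne <= MAX_NON_GNES_PER_GNE:
        raise ValueError("per_gne must be between 1 and 60")
    # A subnet needs at least one GNE even when it has few non-GNEs.
    return max(1, math.ceil(non_gne_count / per_gne))
```

For example, 120 non-GNEs need three GNEs at the recommended ratio of 50 per GNE, or two GNEs at the hard limit of 60 per GNE.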
5.2.16 Do Diamond, Gold, Silver, and Copper ASON Services Support Hitless Conversion? Optical-layer ASON services do not support hitless conversion. ODUk electrical-layer ASON services support hitless conversion between diamond services, silver services, and copper services.
5.3 ASON Principles
5.3.1 What Overheads Are Used by Control Channels on the Control Plane?
For WDM optical-layer ASON services, the control channels on the control plane use D4–D12 bytes of the OSC board. For WDM electrical-layer ASON services, the control channels on the control plane use highest-order ODUk RES overheads of the OTU board.
5.3.2 What Is the Difference Between the Menu Items "Revert To Port" and "Revert To Channel"?
To revert an electrical-layer ASON service to its original trail, you can choose either Revert To Port or Revert To Channel. To revert an optical-layer ASON service to its original trail, you can choose only Revert To Wavelength.
Revert To Port: The ports of the new trail are the same as the ports of the original trail, but the channels may be different.
Revert To Channel: The ports and channels of the new trail are the same as those of the original trail.
Revert To Wavelength: The ports and wavelengths of the new trail are the same as those of the original trail.
You are advised to choose Revert To Channel to revert an electrical-layer ASON service to its original trail and choose Revert To Wavelength to revert an optical-layer ASON service to its original trail.
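The difference between Revert To Port and Revert To Channel for electrical-layer services can be illustrated by comparing the hops of the original trail with those of the trail after reversion. This is a sketch only: modeling a trail as a list of `(port, channel)` tuples and the function name are assumptions made for illustration, not how the NMS represents trails.

```python
def revert_modes_satisfied(original_hops, new_hops):
    """Report which revert options a trail after reversion is consistent with.

    Each hop is a (port, channel) tuple; this representation is illustrative.
    """
    original_ports = [port for port, _ in original_hops]
    new_ports = [port for port, _ in new_hops]
    same_ports = original_ports == new_ports
    # Revert To Channel requires both the ports and the channels to match.
    same_ports_and_channels = list(original_hops) == list(new_hops)
    return {
        "Revert To Port": same_ports,
        "Revert To Channel": same_ports_and_channels,
    }
```

A trail that matches the original ports but uses a different channel on one hop satisfies only Revert To Port, which is why Revert To Channel is the stricter and recommended option.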
5.3.3 What Is the Difference Between Trail Overlap and Trail Sharing?
Trail overlap can be classified into node overlap, link overlap, and SRLG overlap. They are described as follows:
Node overlap: Two ASON trails traverse the same node on a network.
Link overlap: Two ASON trails traverse the same link on a network.
SRLG overlap: Two ASON trails traverse different links but the links belong to the same SRLG.
Trail sharing indicates that the working and protection routes of associated services and diamond services traverse the same node, link, and channel. In other words, during service rerouting, the working route uses the resources of the protection route, or the protection route uses the resources of the working route.
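The three overlap types can be sketched as set operations on two trails. The data model here is an assumption made for illustration: each trail is a list of link IDs, `link_ends` maps a link ID to its two end nodes, and `link_srlgs` maps a link ID to the set of SRLG IDs configured on it. None of these names come from the NMS.

```python
def trail_overlap(links_a, links_b, link_ends, link_srlgs):
    """Return the nodes, links, and SRLGs on which two trails overlap."""
    nodes_a = {node for link in links_a for node in link_ends[link]}
    nodes_b = {node for link in links_b for node in link_ends[link]}
    shared_links = set(links_a) & set(links_b)

    # SRLG overlap: different links that belong to the same SRLG.
    shared_srlgs = set()
    for link_a in links_a:
        for link_b in links_b:
            if link_a != link_b:
                srlgs_a = set(link_srlgs.get(link_a, ()))
                srlgs_b = set(link_srlgs.get(link_b, ()))
                shared_srlgs |= srlgs_a & srlgs_b

    return {
        "node": nodes_a & nodes_b,
        "link": shared_links,
        "srlg": shared_srlgs,
    }
```

In this sketch the source and sink nodes of a working and protection pair always appear in the node overlap, because both routes start and end on them. Callers interested only in intermediate nodes would need to exclude the end nodes.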
5.3.4 What Is Associated Sharing? When resources are limited, associated services share the same timeslots so that end-to-end services can be successfully created to ensure service survivability.
5.3.5 What Is the Relationship Between SRLGs and Associated Services? SRLGs are link attributes and they need to be configured on links. You are advised not to choose the links in the same SRLG as the working and protection routes for a diamond service or two associated services. Associated services are ASON services. They can be configured with 1+1 protection, and the working route should be as separate as possible from the protection route.
5.3.6 Why Is a Revertive Service Reverted to the Original Trail 5 Minutes After Rerouting and How to Revert the Service to the Original Trail Within 5 Minutes?
A revertive ASON service is rerouted because of a fiber cut on the original trail. The service is reverted to the original trail 5 minutes after the original trail is restored. This 5-minute period is the anti-jitter time, which is configured to avoid incorrect service reversion. If you want to revert the service within 5 minutes after faults on the original trail are rectified, you can reconnect the fiber of the original trail and remove the fiber of the current trail. The service is then reverted to the original trail.
5.3.7 Does an OPA Adjust Failure Affect Rerouting of ASON Services? The OPA result does not affect rerouting of ASON services. Instead, rerouting of ASON services is determined by the control plane. After a service trail is successfully created, the
OPA function is triggered to automatically adjust optical power to ensure successful service trail computation. Therefore, even though the rerouting of ASON services is successful, OPA adjustment may fail, and OCh optical paths of ASON services may be unavailable.
5.3.8 Why Cannot Revertive Services Be Downgraded After Rerouting? When a revertive service is rerouted, the resources for the original trail need to be reserved after a new trail is created. If an optical-layer ASON service is downgraded after rerouting, multiple discrete cross-connections will be reserved on the original trail because there are no cross-connections on the first and last nodes. If an electrical-layer ASON service is downgraded after rerouting, SNCP configurations and resources for the original trail will be reserved because SNCP is configured on both the new and original trails.
5.3.9 What Is the Difference Between the Function of Downgrading ASON Services in an NE Explorer and the Function of Downgrading ASON Services in the Trail Management Window? ASON services are downgraded in the trail management window only when end-to-end trails are created for the services. ASON services are downgraded in an NE Explorer when there are discrete services on the NE or the control plane communication of the NE is abnormal. The two functions have the same impact on ASON services. When the NE communication is normal, both functions can be used to downgrade end-to-end ASON services.
5.3.10 Must Service Optimization Be Performed at the First Node?
Services must be optimized at the first node. For ASON services, all operations except downgrading services must be performed at the first node.
5.3.11 Why Do Services Fail to Be Reverted to the Original Trail?
The possible causes are as follows:
For non-revertive services, if the fiber of the original trail is not restored, the resources of the original trail will be occupied by other services or discrete cross-connections will exist on the original trail. As a result, reverting services to the original trail fails.
For revertive services, if the original trail is restored, services can be automatically or manually reverted to the original trail. For optical-layer ASON services, however, the ASON OCh trail may be unavailable after services are automatically reverted to the original trail if the optical power of the original trail has changed. Therefore, you are advised to manually revert optical-layer ASON services. In other words, after the original trail is restored, manually check whether the optical power and attenuation of the original trail satisfy the requirements for reverting services to the original trail. If the requirements are not satisfied, rectify the fault and then revert the services to the original trail.
5.3.12 Why Does Synchronization Between the NE and NMS Need to Be Performed During Each Query of ASON Service Information?
The ASON control plane manages ASON services in a distributed manner, and ASON services are restored dynamically. The NMS data and NE data need to be synchronized during each query of ASON service information so that accurate ASON service information can be obtained.
5.3.13 Does a CPW_XXX_SER_INT Alarm on the Control Plane Mean Service Interruption?
A CPW_XXX_SER_INT alarm on the control plane is triggered by alarms generated on the transport plane. When an alarm such as MUT_LOS, R_LOS, R_LOF, or BD_STATUS is generated on the transport plane, a CPW_XXX_SER_INT alarm is reported on the control plane. When a service is interrupted, a CPW_XXX_SER_INT alarm is generated, but a CPW_XXX_SER_INT alarm does not necessarily mean a service interruption.
5.3.14 What Is the Difference in Database Backup and Restoration Between an ASON Network and a Non-ASON Network? On an ASON network, resources and services are changed dynamically. Therefore, NE databases need to be backed up periodically using the NMS. Before database restoration, ensure that the current backup database on the NMS is the latest database. Then download the database from the NMS for restoration. In addition, ASON services are configured end to end. To ensure that end-to-end service configurations are proper for database restoration, the databases of all NEs in an ASON domain must be backed up at the same time.
5.3.15 Why Do I Need to Periodically Check Whether Rerouted ASON Services Are Reverted to the Original Trails? The original trails of ASON services are planned and designed, and the configurations and performance of the original trails are optimal. If the original trails are available, you are advised to revert services to the original trails.
5.3.16 What Are Residual Cross-Connections and How to Delete Residual Cross-Connections? Residual cross-connections are the cross-connections that are not managed end to end and not used by actual services. Residual cross-connections on a network occupy network resources, which directly affects the provisioning and restoration of ASON services or results in a trail creation failure for ASON services. The control plane provides the function of automatically deleting residual cross-connections; however, the residual cross-connections that are manually configured or generated due to
other reasons need to be deleted manually. Therefore, you must delete residual cross-connections in a timely manner.
5.3.17 What Do the CPW_OCH_SER_INT and CPW_ODUk_SER_INT Alarms Mean? The two alarms are generated on the control plane due to fiber cuts or inappropriate optical power. Both alarms indicate that ASON services are interrupted. The CPW_OCH_SER_INT alarm indicates that an optical-layer OCh service is interrupted. The CPW_ODUk_SER_INT alarm indicates that an electrical-layer ODUk service is interrupted.