WCDMA RAN, Rel. RU40, Operating Documentation, Issue 05 Replacing Multicontroller RNC Hardware Units DN09109953 Issue 02A Approval Date 2014-02-10
Replacing Multicontroller RNC Hardware Units
The information in this document is subject to change without notice and describes only the product defined in the introduction of this documentation. This documentation is intended for the use of Nokia Solutions and Networks customers only for the purposes of the agreement under which the document is submitted, and no part of it may be used, reproduced, modified or transmitted in any form or means without the prior written permission of Nokia Solutions and Networks. The documentation has been prepared to be used by professional and properly trained personnel, and the customer assumes full responsibility when using it. Nokia Solutions and Networks welcomes customer comments as part of the process of continuous development and improvement of the documentation. The information or statements given in this documentation concerning the suitability, capacity, or performance of the mentioned hardware or software products are given "as is" and all liability arising in connection with such hardware or software products shall be defined conclusively and finally in a separate agreement between Nokia Solutions and Networks and the customer. However, Nokia Solutions and Networks has made all reasonable efforts to ensure that the instructions contained in the document are adequate and free of material errors and omissions. Nokia Solutions and Networks will, if deemed necessary by Nokia Solutions and Networks, explain issues which may not be covered by the document. Nokia Solutions and Networks will correct errors in this documentation as soon as possible. IN NO EVENT WILL Nokia Solutions and Networks BE LIABLE FOR ERRORS IN THIS DOCUMENTATION OR FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, DIRECT, INDIRECT, INCIDENTAL OR CONSEQUENTIAL OR ANY LOSSES, SUCH AS BUT NOT LIMITED TO LOSS OF PROFIT, REVENUE, BUSINESS INTERRUPTION, BUSINESS OPPORTUNITY OR DATA,THAT MAY ARISE FROM THE USE OF THIS DOCUMENT OR THE INFORMATION IN IT. 
This documentation and the product it describes are considered protected by copyrights and other intellectual property rights according to the applicable laws. NSN is a trademark of Nokia Solutions and Networks. Nokia is a registered trademark of Nokia Corporation. Other product names mentioned in this document may be trademarks of their respective owners, and they are mentioned for identification purposes only. Copyright © Nokia Solutions and Networks 2014. All rights reserved
Important Notice on Product Safety This product may present safety risks due to laser, electricity, heat, and other sources of danger. Only trained and qualified personnel may install, operate, maintain or otherwise handle this product and only after having carefully read the safety information applicable to this product. The safety information is provided in the Safety Information section in the “Legal, Safety and Environmental Information” part of this document or documentation set.
Nokia Solutions and Networks is continually striving to reduce the adverse environmental effects of its products and services. We would like to encourage you as our customers and users to join us in working towards a cleaner, safer environment. Please recycle product packaging and follow the recommendations for power use and proper disposal of our products and their components. If you should have questions regarding our Environmental Policy or any of the environmental services we offer, please contact us at Nokia Solutions and Networks for any additional information.
Table of Contents
This document has 55 pages.
Summary of changes
1 Replacing the faulty chassis in a running system
1.1 Replacing a faulty chassis
1.1.1 Removing the faulty chassis from the running system
1.1.1.1 Steps
1.1.2 Installing the new chassis
1.1.2.1 Steps
2 Replacing the hard disk drive on hard disk drive carrier AMC
2.1 Removing the faulty hard disk drive
2.1.1 Steps
2.2 Installing the new hard disk drive
2.2.1 Steps
3 Replacing the failed hard disk drives on both CFPU nodes
3.1 Removing the faulty hard disk drives
3.2 Installing the new hard disk drives
4 Replacing an AMC
4.1 Removing an AMC
4.2 Installing an AMC
5 Replacing a fan module
5.1 Removing a fan module
5.1.1 Steps
5.2 Installing a fan module
5.2.1 Steps
6 Replacing an add-in card
6.1 Removing an add-in card
6.1.1 Steps
6.2 Installing an add-in card
6.2.1 Steps
7 Replacing a power distribution unit
7.1 Removing a power distribution unit (PDU)
7.1.1 Steps
7.2 Installing a power distribution unit (PDU)
7.2.1 Steps
8 Replacing a power supply unit
8.1 Removing a power supply unit
8.1.1 Steps
8.2 Installing a power supply unit
8.2.1 Steps
9 Replacing the air filter
10 Dealing with sensor alarms
11 Communication between active and standby units in a BCN cluster fails
11.1 Description
11.2 Symptoms
11.3 Recovery procedures
List of Figures
Figure 1 The SAS/SATA switch in the HDSAM-A
Figure 2 The hard disk drive on the hard disk drive carrier AMC
Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 4 The SAS/SATA switch in the HDSAM-A
Figure 5 The hard disk drive on the hard disk drive carrier AMC
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 7 Pulling the hot swap handle of an AMC
Figure 8 Removing an AMC from the BCN module
Figure 9 Inserting an AMC into the BCN module
Figure 10 Pressing the hot swap handle
Figure 11 BCN top cover screws
Figure 12 Removing the BCN top cover
Figure 13 BCN add-in card screws
Figure 14 Pulling an add-in card out from the BCN module
Figure 15 Inserting an add-in card into BCN module
Figure 16 BCN add-in card screws
Figure 17 Installing BCN top cover
Figure 18 Power distribution units in the cabinet
Figure 19 Replacing a PDU
Figure 20 Installing PDU to the cabinet
Figure 21 PDU grounding cable
Figure 22 Unscrewing the two thumbscrews
Figure 23 Opening the air filter cover and pulling out the air filter
Figure 24 Positions of the PSUs and fan trays
Summary of changes
Changes between document issues are cumulative. Therefore, the latest document issue contains all changes made to previous issues. See Guide to WCDMA RAN and I-HSPA Documentation.

Changes made between issues 02 (RU40) and 02A (RU40)
Instructions for a graceful shutdown have been added when replacing an add-in card.

Changes made between issues 01B (RU30) and 02 (RU40)
Instructions apply to both BCN-A and BCN-B hardware. The example display outputs have been updated and may vary slightly as a result.

Changes made between issues 01A (RU30) and 01B (RU30)
Replacing the hard disk drive on hard disk drive carrier AMC has been updated to include verification steps. has been added.
1 Replacing the faulty chassis in a running system

1.1 Replacing a faulty chassis

Purpose
In a multi-chassis environment (two or more chassis), if an existing chassis is faulty, you need to remove it and install a new chassis in its place.

Before you start
Ensure that:
• The embedded software in the replacement chassis is upgraded to the required version.
• The initial LMP settings, such as the switch configuration and the backplane resiliency configuration, are properly configured for the replacement chassis. For more information, see Commissioning Multicontroller RNC.

1.1.1 Removing the faulty chassis from the running system

1.1.1.1 Steps
1 Identify the chassis to be replaced in the running system.
In this section, the chassis to be replaced is chassis-2.
2 Check all the nodes that are running in the chassis to be replaced.
To check all the running nodes present in the chassis to be replaced, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89]
> show hardware state list
cabinet-1 : unit /cabinet-1
chassis-1 : unit /cabinet-1/chassis-1
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-1-1 : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU/CPU-1/core-0,1,10,2,3,4,5,6,7,8,9
CSPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU/CPU-1/core-0,1,2,3,4,5
USPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU/CPU-1/core-0,1,2,3,4
EIPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU/CPU-1/core-0,1,2,3,4,5
CSPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU/CPU-1/core-0,1,2,3,4,5
USPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU/CPU-1/core-0,1,2,3,4
USPU-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU/CPU-1/core-0,1,2,3,4
EIPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU/CPU-1/core-0,1,2,3,4,5
USSR-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU/CPU-1/core-11
CSUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU/CPU-1/core-10,11,6,7,8,9
USUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU/CPU-1/core-10,11,5,6,7,8,9
EITP-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU/CPU-1/core-10,11,6,7,8,9
CSUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU/CPU-1/core-10,11,6,7,8,9
USUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU/CPU-1/core-10,11,5,6,7,8,9
USUP-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU/CPU-1/core-10,11,5,6,7,8,9
EITP-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU/CPU-1/core-10,11,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU/CPU-1/core
CSPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU/CPU-1/core
USPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU/CPU-1/core
EIPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU/CPU-1/core
CSPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU/CPU-1/core
USPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU/CPU-1/core
USPU-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU/CPU-1/core
EIPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU/CPU-1/core
CSUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU/CPU-1/core
USUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU/CPU-1/core
EITP-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU/CPU-1/core
CSUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU/CPU-1/core
USUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU/CPU-1/core
USUP-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU/CPU-1/core
EITP-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU/CPU-1/core
cluster : cluster available
The output provides information that nodes CFPU-1, CSPU-1, USPU-1, EIPU-1, CSPU-3, USPU-3, USPU-5, EIPU-3, USSR-1, CSUP-1, USUP-1, EITP-1, CSUP-3, USUP-3, USUP-5, EITP-3 are present in the chassis to be replaced.
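When the command output has been saved to a file, the nodes located in a given chassis can also be picked out with a small script. The following Python sketch is illustrative only and is not part of the product: the function name nodes_in_chassis is hypothetical, and it assumes that each node entry occupies a single line of the form NAME : node STATE PATH (wrapped lines rejoined).

```python
import re

# Matches one node entry, for example:
#   CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU/CPU-1/core-0,1
# Entries of type "unit" or "cluster" do not match and are skipped.
ENTRY = re.compile(r"^\s*(\S+)\s*:\s*node\s+(\S+)\s+(/\S+)")


def nodes_in_chassis(output, chassis):
    """Return the names of node entries whose path lies under `chassis`."""
    # The trailing slash keeps "chassis-2" from matching "chassis-20".
    marker = "/" + chassis + "/"
    names = []
    for line in output.splitlines():
        match = ENTRY.match(line)
        if match and marker in match.group(3) + "/":
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = (
        "chassis-2 : unit /cabinet-1/chassis-2\n"
        "LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1\n"
        "CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU/CPU-1/core-0,1\n"
        "CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU/CPU-1/core-0,1\n"
    )
    print(nodes_in_chassis(sample, "chassis-2"))
```

The result includes the LMP node of the chassis as well as the add-in card nodes, so it should be compared against the node list given in this step rather than used blindly.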
3 Check if the current SCLI session is running on a node located in the chassis to be replaced.
If the SCLI session is running on a node located in the chassis to be replaced (the node name is shown in the prompt), perform a switchover for the /SSH recovery group to keep connectivity to the cluster during the chassis replacement.
To perform a switchover for the /SSH recovery group, enter the following command:
set has switchover force managed-object /SSH
Note: The SSH connection breaks when the switchover command is executed, and the SSH session must be started again.
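The check in this step amounts to reading the node name from the prompt and testing whether it belongs to the chassis being replaced. The Python sketch below shows one way to express that. It is only an illustration: the function names are hypothetical, and the prompt is assumed to have the form user@NODE [NAME], as in root@CFPU-0 [RNC-89].

```python
import re

# Assumed prompt layout: "<user>@<node> [<network element>]"
PROMPT = re.compile(r"^\s*\S+@(\S+)\s+\[")


def session_node(prompt):
    """Extract the node name from an SCLI prompt string."""
    match = PROMPT.match(prompt)
    if match is None:
        raise ValueError("unrecognized prompt: %r" % prompt)
    return match.group(1)


def needs_ssh_switchover(prompt, chassis_nodes):
    """True if the session runs on a node in the chassis to be replaced."""
    return session_node(prompt) in set(chassis_nodes)


if __name__ == "__main__":
    chassis_2 = ["CFPU-1", "CSPU-1", "USPU-1", "EIPU-1"]
    print(needs_ssh_switchover("root@CFPU-0 [RNC-89]", chassis_2))
    print(needs_ssh_switchover("root@CFPU-1 [RNC-89]", chassis_2))
```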
4 Disable cluster manager nodes located in the chassis to be replaced.
To identify the nodes configured as a cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
  specialConstraints=(serviceInterruptionDenied)
  RecoveryUnit /CFPU-0/FSClusterHAServer
    recoveryUnitType=(ClusterManagerRecoveryUnit)
    Process /CFPU-0/FSClusterHAServer/HASClusterManager
      command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
      status=(nonHA)
      startMethod=(always)
      severity=(important)
  RecoveryUnit /CFPU-1/FSClusterHAServer
    recoveryUnitType=(ClusterManagerRecoveryUnit)
    Process /CFPU-1/FSClusterHAServer/HASClusterManager
      command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
      status=(nonHA)
      startMethod=(always)
      severity=(important)
The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
The output shows that one cluster manager node (CFPU-1) is located in the chassis to be replaced. Hence, the following steps must be executed:
a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU-1
b) Check that the CFPU-1 node, where CMF was disabled, has the CMF-DISABLED status and that the other cluster manager node (in this case, the CFPU-0 node) has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5
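The status check in sub-step b) can be expressed as a simple test on the captured command output. The following Python sketch is a hedged illustration: the function names are hypothetical, and each status line is assumed to look like NODE: CMF-STATUS priority: N.

```python
import re

# Assumed line layout: "<node>: CMF-<status> priority: <n>"
STATUS_LINE = re.compile(r"^\s*(\S+):\s+(CMF-\S+)")


def parse_cmf_status(output):
    """Map each node name to its CMF status string."""
    statuses = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            statuses[match.group(1)] = match.group(2)
    return statuses


def cmf_ready_for_replacement(output, disabled_node, serving_node):
    """True if one node is CMF-DISABLED and the other is CMF-SERVING."""
    statuses = parse_cmf_status(output)
    return (
        statuses.get(disabled_node) == "CMF-DISABLED"
        and statuses.get(serving_node) == "CMF-SERVING"
    )


if __name__ == "__main__":
    sample = "CFPU-1: CMF-DISABLED priority: 6\nCFPU-0: CMF-SERVING priority: 5\n"
    print(cmf_ready_for_replacement(sample, "CFPU-1", "CFPU-0"))
```

A missing node is treated the same as a wrong status, so the check fails safely if the output is incomplete.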
5 Lock all the managed objects in the chassis to be replaced.
To lock all the managed objects, enter the following command:
set has lock managed-object ...
Note: The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following command:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully locked, the following output is displayed:
root@CFPU-0 [RNC-1002]
> set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)
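The command line in the example can be assembled from a list of node names. The Python sketch below is an assumption-laden illustration: the helper names are hypothetical, and it assumes, as the example suggests but the text does not state, that the managed nodes are those whose names begin with CFPU, CSPU, USPU, or EIPU. Verify the node list manually before running the resulting command.

```python
# Assumption drawn from the example only: nodes with these name prefixes
# are managed; all other nodes are treated as unmanaged and left out.
MANAGED_PREFIXES = ("CFPU", "CSPU", "USPU", "EIPU")


def managed_nodes(names):
    """Keep only the node names with an assumed managed prefix."""
    return [name for name in names if name.split("-")[0] in MANAGED_PREFIXES]


def has_command(action, names):
    """Build a 'set has <action> managed-object ...' command string."""
    if action not in ("lock", "power off"):
        raise ValueError("unsupported action: %r" % action)
    targets = managed_nodes(names)
    if not targets:
        raise ValueError("no managed nodes in the given list")
    return "set has %s managed-object %s" % (
        action,
        " ".join("/" + name for name in targets),
    )


if __name__ == "__main__":
    nodes = ["CFPU-1", "CSPU-1", "USSR-1", "CSUP-1"]
    print(has_command("lock", nodes))
```

The same helper also produces the power-off form of the command, which takes the same node list.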
6 Power off all the managed objects in the chassis to be replaced.
To power off the managed objects, enter the following command:
set has power off managed-object ...
Example
To power off all the managed nodes in chassis-2, enter the following command:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully

7 Verify that all the managed objects in the chassis to be replaced are powered off.
To verify the availability of a managed node, enter the following command:
show has state availability managed-object
Example
To verify that the availability status of all managed nodes in chassis-2 is POWEROFF, enter the following command:
show has state availability managed-object /CFPU-1 /CSPU-1 \
/CSPU-3 /EIPU-1 /EIPU-3 /USPU-1 /USPU-3 /USPU-5
The following output must be displayed:
OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
/CSPU-3 POWEROFF
/EIPU-1 POWEROFF
/EIPU-3 POWEROFF
/USPU-1 POWEROFF
/USPU-3 POWEROFF
/USPU-5 POWEROFF

8 Disconnect all the cables connected to the chassis to be replaced.
To disconnect the cables connected to the chassis to be replaced, follow these steps:
a) If a PDU is used with the BCN module, switch off the circuit breaker on the PDU for the BCN module in question.
b) Disconnect the power feed cables.
c) Disconnect all the cables from the transceivers on the front side of the chassis.
d) Disconnect the BCN grounding cable.
e) Keep the network cables attached to the front cable tray of the chassis.
f) Uninstall the cable tray with the attached network cables from the BCN module. The cable tray is uninstalled by unscrewing the two thumbscrews that fix the cable tray to the BCN module. If the screws are too tight to be opened by hand, loosening the screws that fix the BCN module mounting flanges to the cabinet might help. For more information about detaching the cable tray, refer to the document Installing BCN Modules to the IR206 Cabinet.
g) Move the cable tray with the attached network cables under the module, so that the module can be easily pulled out from the rack.
9 Remove the HDD AMC from the AMC bay.
If the chassis has an HDD AMC installed in its AMC slot, remove the HDD AMC. Follow the instructions in the section Replacing an AMC.

10 Remove the chassis to be replaced from the rack.
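Before any cables are disconnected, the availability check in step 7 must show POWEROFF for every managed node in the chassis. The Python sketch below illustrates that check on captured output. It is a minimal example under stated assumptions: the function names are hypothetical, and the output is assumed to be a two-column table of object name and availability.

```python
def parse_availability(output):
    """Map each object (for example '/CFPU-1') to its availability value."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have two columns and start with a slash; the
        # "OBJECT AVAILABILITY" header row is skipped by this test.
        if len(fields) == 2 and fields[0].startswith("/"):
            states[fields[0]] = fields[1]
    return states


def not_powered_off(output, expected):
    """Return the expected objects that are missing or not POWEROFF."""
    states = parse_availability(output)
    return [obj for obj in expected if states.get(obj) != "POWEROFF"]


if __name__ == "__main__":
    sample = "OBJECT AVAILABILITY\n/CFPU-1 POWEROFF\n/CSPU-1 POWEROFF\n"
    remaining = not_powered_off(sample, ["/CFPU-1", "/CSPU-1", "/EIPU-1"])
    print(remaining)
```

An empty result means that every expected object reported POWEROFF. Any object listed in the result should be investigated before the chassis is removed.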
1.1.2 Installing the new chassis

1.1.2.1 Steps

1 Insert the new chassis in the rack.

2 Insert the AMC back in the AMC bay.
If the chassis had an HDD AMC equipped in the AMC bay, insert the removed HDD AMC into the same slot of the replacement chassis. Follow the instructions in the section Replacing an AMC.
3 Connect all the cables to the new chassis.
a) Install the cable tray with the attached network cables back to the BCN module. For more information about the cable tray installation, see the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e) If a PDU is used with the BCN module, switch on the circuit breaker on the PDU for the BCN module in question.
4
Check that the LMP and all nodes of the new chassis are available.
To check that the LMP and all nodes of the new chassis are available, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list
cabinet-1 : unit /cabinet-1
chassis-1 : unit /cabinet-1/chassis-1
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-1-1 : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-0,1,10,2,3,4,5,6,7,8,9
CSPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-0,1,2,3,4,5
USPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-0,1,2,3,4
EIPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-0,1,2,3,4,5
CSPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-0,1,2,3,4,5
USPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-0,1,2,3,4
USPU-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-0,1,2,3,4
EIPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-0,1,2,3,4,5
USSR-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-11
CSUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core-10,11,6,7,8,9
USUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core-10,11,5,6,7,8,9
EITP-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core-10,11,6,7,8,9
CSUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core-10,11,6,7,8,9
USUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core-10,11,5,6,7,8,9
USUP-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core-10,11,5,6,7,8,9
EITP-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core-10,11,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster : cluster available
The output must display that all the nodes of the new chassis are now available.
Note: The new chassis and its nodes take some time to boot up.
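As an illustration only, the availability check can be automated by saving the output of show hardware state list to a text file and scanning it. The following Python sketch is not part of the product. The function name unavailable_nodes, the one-row-per-line "name : node state path" layout, and the state value "unavailable" in the sample are assumptions made for this example.

```python
import re

# Matches node rows such as:
#   CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
# Unit rows, the cluster row, and prompt lines do not match and are skipped.
ROW = re.compile(r"^(?P<name>\S+)\s*:\s*node\s+(?P<state>\S+)\s+(?P<path>/\S+)")


def unavailable_nodes(listing: str, chassis: str) -> list[str]:
    """Return the names of nodes in `chassis` whose state is not 'available'."""
    missing = []
    for line in listing.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue
        # Compare whole path components so "chassis-2" never matches "chassis-20".
        if f"/{chassis}/" not in match.group("path") + "/":
            continue
        if match.group("state") != "available":
            missing.append(match.group("name"))
    return missing


# Hypothetical sample; the "unavailable" state is invented for illustration.
sample = """\
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core-0,1
CFPU-1 : node unavailable /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
cluster : cluster available
"""

print(unavailable_nodes(sample, "chassis-2"))
```

An empty list for the replaced chassis corresponds to the condition stated above, that all nodes of the new chassis are available.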
5
Set up the post configuration for the LMP of the new chassis.
To set up the post configuration for the LMP of the new chassis, enter the following commands:
cd /opt/nokiasiemens/SS_FSetup/bin
./configBCNLmp.py
The following output is displayed:
INFO Copy ssh keys to LMPs.
INFO Using credential file : /mnt/state/_global/etc/credentials/BCNLMP/root.cred
INFO Copying /tftpboot/lmp/hosts file to all LMPs.
INFO Changing the syslog.conf on all LMPs.
INFO Changing the ntp.conf on all LMPs.
INFO Configuring port monitor for all lmps.
INFO Changing the mch.conf on all LMPs.
INFO Removing bcn_sfp module loading from all LMPs.
INFO Patching fastpath reset script on all LMPs.
INFO Adding node reset init script to all LMPs.
INFO Removing PET/SNMP trap configuration on all LMPs
INFO Creating LMP configuration backup for automated configuration restore, this might take up to 5 minutes.
Then exit with Ctrl+C.
6
Unlock all the nodes in the new chassis.
To unlock all the nodes in the new chassis, enter the following command:
set has unlock managed-object ...
Example
To unlock all the nodes in chassis-2, enter the following command:
set has unlock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 \
/CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:
/CFPU-1 unlocked successfully.
/CSPU-1 unlocked successfully.
/USPU-1 unlocked successfully.
/EIPU-1 unlocked successfully.
/CSPU-3 unlocked successfully.
/USPU-3 unlocked successfully.
/USPU-5 unlocked successfully.
/EIPU-3 unlocked successfully.
7
Check that all the nodes in the new chassis are operational.
Wait for the nodes to restart. After the nodes have restarted, wait for the operational state to become OPERATIONAL (ENABLED). Enter the following command to view the operational state of the node:
show has state managed-object ...
Example
To check that all the nodes in chassis-2 have OPERATIONAL (ENABLED) status, enter the following command:
show has state operational managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:
OBJECT   OPERATIONAL
/CFPU-1  ENABLED
/CSPU-1  ENABLED
/CSPU-3  ENABLED
/EIPU-1  ENABLED
/EIPU-3  ENABLED
/USPU-1  ENABLED
/USPU-3  ENABLED
/USPU-5  ENABLED
8
Enable CMF on the node configured as cluster manager.
To identify the nodes configured as cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
  specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
  recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
  command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
  status=(nonHA)
  startMethod=(always)
  severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
  recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
  command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
  status=(nonHA)
  startMethod=(always)
  severity=(important)
The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes. From the output of step 8, it is observed that one cluster manager node (CFPU-1) is located in the new chassis. Hence, the following steps must be executed:
a) Enable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf enable node-name /CFPU-1
The following output is displayed: Cluster management functionality enabled on host CFPU-1
b) Check that the CFPU-1 node, where the CMF was enabled, has CMF-BACKUP status and that the CFPU-0 node has CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output is displayed:
CFPU-1: CMF-BACKUP priority: 6
CFPU-0: CMF-SERVING priority: 5
2 Replacing the hard disk drive on hard disk drive carrier AMC
Purpose
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years. You may also need to replace the hard disk drive if it is faulty or if it needs to be upgraded or serviced.
Before you start
Caution: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 1 The SAS/SATA switch in the HDSAM-A (DN0945027). Labeled parts: HDSAM-A handle, IPMB-L, MMC, SAS/SATA switch, 2.5” SAS or SATA drive, mechanical adapter, 5 V and 12 V power, AMC connector, 2 x SAS, LEDs.
2.1 Removing the faulty hard disk drive
2.1.1 Steps
1
Log in to the CFPU node where the hard disk is not faulty.
2
Lock the node where the faulty hard disk drive is located.
To lock the node where the faulty hard disk drive is located, enter the following command:
set has lock managed-object
Example set has lock managed-object /CFPU-1
The following output is displayed: /CFPU-1 locked successfully.
3
Power off the node where the faulty hard disk drive is located.
To power off the node where the faulty hard disk drive is located, enter the following command:
set has power off managed-object
Example
set has power off managed-object /CFPU-1
The following output is displayed: /CFPU-1 is powered OFF successfully.
4
Remove the AMC from the AMC bay.
Follow the instructions in section Replacing an AMC.
5
Place the AMC so that the faulty hard disk drive side is facing down.
Unscrew the four screws on the metal bracket of the AMC module, then turn the module over carefully while holding the hard disk drive.
6
Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).
Figure 2 The hard disk drive on the hard disk drive carrier AMC (DN0945257). Labeled part: hard disk drive.
2.2 Installing the new hard disk drive
2.2.1 Steps
1
Connect the new hard disk drive to the SAS connector of HDSAM-A.
Connect the new hard disk drive to the SAS connector in the HDSAM-A by pushing it gently (from left to right in the following figure).
Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC (DN0945245). Labeled parts: hard disk drive, SAS connector, inserted screw.
2
Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.
3
Install the AMC module back into the AMC bay.
Follow the instructions in section Replacing an AMC.
4
Enable network boot for the node with the new hard disk drive.
To enable network boot for the node with the new hard disk drive, enter the following commands:
a) Log in as root.
set user username root
b) Power on the node.
hwcli -np on
Wait a few seconds before proceeding to the next step.
c) Reset the node.
hwcli -nr -B 3
d) Exit root.
exit
Example:
set user username root
hwcli -np on CFPU-1
The following output is displayed:
Powering on CFPU-1 [ok]
hwcli -nr -B 3 CFPU-1
The following output is displayed:
Resetting CFPU-1 [ok]
Note: The CFPU node takes some time to reboot. Its availability can be checked by logging in through SSH.
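The wait for the rebooting node can be sketched as a simple poll of the SSH port. This Python example is an illustration, not a documented tool: the function name wait_for_port, the port number 22, and the timeout and interval values are all assumptions, and an accepted TCP connection only suggests that the SSH service is listening, not that a login will succeed.

```python
import socket
import time


def wait_for_port(host: str, port: int = 22,
                  timeout_s: float = 600.0, interval_s: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while the node is still down.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Hypothetical usage from a host that can resolve the node name:
#   if wait_for_port("CFPU-1"):
#       print("SSH port is reachable")
```

The loop always makes at least one connection attempt, so a very short timeout still reports a node that is already up.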
5
Disable the watchdog on the node with the new hard disk drive.
To disable the watchdog on the node with the new hard disk drive through SSH, enter the following command:
ssh \ wdctl -d
Example: ssh CFPU-1 \ wdctl -d exit
6
Initialize the new disk from the other node where the hard disk is not faulty.
To initialize the new disk, enter the following command:
initialise hw
The following output is displayed: Hardware successfully initialized
Note: To run the initialization script and display the console output, press the space bar several times after entering the command.
7
Reboot the node with the new hard disk drive from the local disk.
Enter the following commands:
set user username root
hwcli -nr -B 2
exit
Example:
set user username root
hwcli -nr -B 2 CFPU-1
exit
The following output is displayed:
Resetting CFPU-1 [ok]
The node restarts and synchronizes the Distributed Replicated Block Devices (DRBD). Enter the watch -n 10 cat /proc/drbd command to see how the synchronization is progressing. If the watch -n 10 cat /proc/drbd command fails, execute the cat /proc/drbd command instead.
Note: The set user username root command must be executed before watch -n 10 cat /proc/drbd.
Do not restart the node during the DRBD synchronization. The initialization process of the new disk is not ready until the synchronization is successfully completed.
Example:
# watch -n 10 cat /proc/drbd
Every 10.0s: cat /proc/drbd Wed Apr 24 10:54:39 2013
version: 8.3.7 (api:88/proto:86-92)
srcversion: 35B9BF7C501212268498452
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:512081 nr:0 dw:635 dr:512860 al:4 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:104849 nr:0 dw:31303 dr:106531 al:6 bm:7 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:204756 nr:0 dw:36 dr:205264 al:3 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---
ns:2890764 nr:0 dw:46448 dr:2882497 al:32 bm:190 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12469416
[==>.................] sync'ed: 18.9% (12176/14996)M
finish: 0:05:37 speed: 36,872 (30,744) K/sec
4: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:16024 nr:0 dw:78448 dr:174701 al:489 bm:200 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:3062668
5: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:533 nr:0 dw:4921 dr:4830 al:13 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:102360
6: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:0 nr:0 dw:349 dr:1461 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
7: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:0 nr:0 dw:12 dr:675 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
8: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:133 nr:0 dw:1894 dr:759 al:5 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:511948
9: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:0 nr:0 dw:116 dr:4461 al:7 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2916224
10: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
ns:0 nr:0 dw:1500 dr:956 al:3 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:102360
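Reading the oos values by eye is error-prone when many devices are listed. The following Python sketch, offered as an assumption-laden illustration rather than a supported tool, extracts the oos value of each device from text in the /proc/drbd layout shown in the example. The function names out_of_sync and still_syncing are hypothetical, and the sample is shortened from the example output.

```python
import re


def out_of_sync(proc_drbd: str) -> dict[int, int]:
    """Map each DRBD device number to the oos value reported for it."""
    result: dict[int, int] = {}
    device = None
    for line in proc_drbd.splitlines():
        head = re.match(r"\s*(\d+):\s+cs:", line)
        if head:
            device = int(head.group(1))  # start of a new device entry
        oos = re.search(r"\boos:(\d+)", line)
        if oos and device is not None:
            result[device] = int(oos.group(1))
    return result


def still_syncing(proc_drbd: str) -> list[int]:
    """Return the device numbers whose oos value is still above zero."""
    return sorted(d for d, v in out_of_sync(proc_drbd).items() if v > 0)


# Shortened sample in the layout of the example output.
sample = """\
version: 8.3.7 (api:88/proto:86-92)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:512081 nr:0 dw:635 dr:512860 al:4 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---
    ns:2890764 nr:0 dw:46448 dr:2882497 al:32 bm:190 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12469416
    [==>.................] sync'ed: 18.9% (12176/14996)M
 4: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap-
    ns:16024 nr:0 dw:78448 dr:174701 al:489 bm:200 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:3062668
"""

print(still_syncing(sample))
```

An empty list from still_syncing corresponds to the state in which every device reports oos:0.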
The example shows that the first blocks (0 to 2) are now synchronized: their oos (out of sync) value is zero. For block 3, synchronization is in progress and the progress is displayed. Once synchronization is complete, the oos value for all blocks will be 0.
8
Check that serving and backup CMF (Cluster Management Functionality) are working normally.
Enter:
show cmf status recovery-unit node-name
Example:
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-1
CFPU-0@RNC-37 [2013-04-24 13:18:51 +0200]
Recovery units with DRBD resources for managed object /CFPU-1:
/CFPU-1/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNOMUServer-1: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_SECONDARY 1/0 (peer/wait secondary)
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-0
CFPU-0@RNC-37 [2013-04-24 13:18:55 +0200]
Recovery units with DRBD resources for managed object /CFPU-0:
/CFPU-0/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNOMUServer-0: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_PRIMARY 1/0 (peer/wait secondary)
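Because the two listings order their recovery units differently, comparing them by eye is tedious. The following Python sketch shows one way to compare the DRBD devices of two such outputs. It is an illustration only: the function names drbd_devices and unmatched_devices are hypothetical, the samples are shortened, and treating "matching" as "the same set of /dev/drbdN devices appears in both outputs" is an assumption about what the comparison requires.

```python
import re


def drbd_devices(output: str) -> dict[str, str]:
    """Map each /dev/drbdN device in the output to its reported role."""
    pairs = re.findall(r"(/dev/drbd\d+):\s+DRBD_(PRIMARY|SECONDARY)", output)
    return dict(pairs)


def unmatched_devices(output_a: str, output_b: str) -> set[str]:
    """Return the devices that appear in only one of the two outputs."""
    return set(drbd_devices(output_a)) ^ set(drbd_devices(output_b))


# Shortened, hypothetical samples: drbd5 is missing from the second output.
cfpu1 = """\
/CFPU-1/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_SECONDARY 1/0 (peer/wait secondary)
"""
cfpu0 = """\
/CFPU-0/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_PRIMARY 1/0 (peer/wait secondary)
"""

print(unmatched_devices(cfpu1, cfpu0))
```

An empty set means that both managed objects list the same DRBD devices.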
Compare the blocks listed for both managed objects; they should match.
9
Unlock the node with the new hard disk drive.
Enter:
set has unlock managed-object
Example: set has unlock managed-object /CFPU-1
The following output is displayed: /CFPU-1 unlocked successfully.
3 Replacing the failed hard disk drives on both CFPU nodes
Summary
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years. You may also need to replace the hard disk drives if they are faulty or need to be upgraded or serviced.
Purpose
To replace the failed hard disk drives on both CFPU nodes.
Before you start
Caution: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 4 The SAS/SATA switch in the HDSAM-A (DN0945027). Labeled parts: HDSAM-A handle, IPMB-L, MMC, SAS/SATA switch, 2.5” SAS or SATA drive, mechanical adapter, 5 V and 12 V power, AMC connector, 2 x SAS, LEDs.
3.1 Removing the faulty hard disk drives 1
Remove the AMC from the AMC bay.
Follow the instructions in section Replacing an AMC.
2
Place the AMC so that the faulty hard disk drive side is facing down.
Unscrew the four screws on the metal bracket of the AMC module, then turn the module over.
3
Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).
Figure 5 The hard disk drive on the hard disk drive carrier AMC (DN0945257). Labeled part: hard disk drive.
4
Repeat steps 1 to 3 for removing the other hard disk drive.
3.2 Installing the new hard disk drives 1
Connect the new hard disk drive to the SAS connector of HDSAM-A.
Connect the new hard disk drive to the SAS connector in the HDSAM-A by pushing it gently (from left to right in the following figure).
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC (DN0945245). Labeled parts: hard disk drive, SAS connector, inserted screw.
2
Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.
3
Install the AMC module back into the AMC bay.
Follow the instructions in section Replacing an AMC.
4
Check the embedded software version on the new hard disk (HDSAM-A).
Use the following command:
show sw-manage embedded-sw version all
5
Upgrade the embedded software version.
If a newer embedded software version is available, upgrade to it. For instructions, see Upgrading Embedded Software.
6
Repeat steps 1 to 5 for installing the hard disk drive on the other CFPU node.
7
Perform the full restoration for the system.
For instructions, see Commissioning mcRNC.
4 Replacing an AMC
Purpose
You may need to replace an AMC if it is faulty or if it needs to be replaced due to configuration changes, extensions or servicing.
Note: When sending a faulty hard disk drive AMC to be replaced, remember to remove the hard disk drive.
Before you start
Caution: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
4.1 Removing an AMC 1
Gently pull the hot swap handle on the front panel of the AMC.
Do not pull the handle out all the way yet. Pulling the handle notifies the hardware management system that you are going to remove the AMC and tells it to finish all processes. The hot swap LED starts flashing.
Figure 7 Pulling the hot swap handle of an AMC (DN0977767)
2
Wait until the hot swap LED turns solid blue.
This may take a few seconds.
3
Pull the hot swap handle again more firmly and slide the AMC out of the bay.
Figure 8 Removing an AMC from the BCN module (DN0973762)
4
If you are not installing another AMC immediately, install an AMC filler into the empty AMC bay.
This is to ensure adequate cooling and a proper EMC shield in the module.
4.2 Installing an AMC 1
Check that the EMC gasket is correctly in place and that its contacts are clean.
2
Insert the AMC into the bay, sliding it along the guide rails as shown in the figure below.
Make sure that the AMC is firmly seated in the module’s connectors.
Figure 9 Inserting an AMC into the BCN module (DN0977588)
3
Press the hot swap handle firmly.
Wait until the blue hot swap LED turns off and the power LED turns solid green.
Figure 10 Pressing the hot swap handle (DN0977782)
Note: If hard disk cross connecting is used, the hard disk AMC can only be placed in AMC bay 1.
5 Replacing a fan module
Summary
The fan modules are located at the rear of the BCN module.
Figure: BCN fan modules (DN0973747)
Before you start
Caution: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The fan module can be replaced while the BCN is powered on. Only one fan module can be replaced at a time. Prepare the spare fan unit for replacement beforehand. After a fan is removed from the BCN module, the system starts to heat up very quickly. Proceed immediately with the new fan installation.
The following procedure applies to all three fan modules of the BCN module.
5.1 Removing a fan module
5.1.1 Steps
1
Unscrew the two thumbscrews attaching the fan module to the BCN.
The Phillips screws are built into the fan module and can be loosened either by hand or with a screwdriver.
2
Pull the fan module out from the BCN module.
5.2 Installing a fan module
5.2.1 Steps
1
Insert the fan module into its slot at the rear side of the BCN module.
2
Tighten the fan module’s thumbscrews.
The Phillips screws are built into the fan module and can be tightened either by hand or with a screwdriver.
6 Replacing an add-in card
Before you start
Power off the BCN module before removing or installing an add-in card.
Caution: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
6.1 Removing an add-in card
6.1.1 Steps
1
Opti Option on Desc Descri ript ptio ion n If
the BCN module is installed in the cabinet,
Then
a) Gracefully shut down the BCN module. 1. Identify the chassis chassis where the the plug-in unit is located that is to be replaced. replaced. In this section, the chassis refers to the chassis-2. 2. Check all the nodes nodes that are are running running in the chassis chassis where the plug-in unit unit is locat To check all the running nodes present in the chassis to be replaced, enter the fol The following output is displayed: root@CFPU-0 [RNC-89] cabinet-1 chassis-1 chassis-2 LMP-1-1-1 LMP-1-2-1 CFPU-0 CSPU-0 USPU-0 EIPU-0 CSPU-2 USPU-2 USPU-4 EIPU-2 USSR-0 CSUP-0 USUP-0 EITP-0 CSUP-2 USUP-2 USUP-4 EITP-2 CFPU-1 CSPU-1 USPU-1 EIPU-1 CSPU-3
32
DN09109953
: : : : : : : : : : : : : : : : : : : : : : : : : :
> show hardware state list unit unit unit node node node node node node node node node node node node node node node node node node node node node node node
/cabinet-1 /cabinet-1/chassis-1 /cabinet-1/chass is-1 /cabinet-1/chassis-2 /cabinet-1/chass is-2 available /cabinet-1/chassi /cabinet-1/chassis-1/piu-1 s-1/piu-1 available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1 s-2/piu-1 available /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-1 s-1/piu-1/addin-1/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-2 s-1/piu-1/addin-2/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-3 s-1/piu-1/addin-3/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-4 s-1/piu-1/addin-4/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-5 s-1/piu-1/addin-5/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-6 s-1/piu-1/addin-6/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-7 s-1/piu-1/addin-7/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-8 s-1/piu-1/addin-8/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-1 s-1/piu-1/addin-1/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-2 s-1/piu-1/addin-2/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-3 s-1/piu-1/addin-3/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-4 s-1/piu-1/addin-4/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-5 s-1/piu-1/addin-5/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-6 s-1/piu-1/addin-6/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-7 s-1/piu-1/addin-7/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-1/piu-1/addin-8 s-1/piu-1/addin-8/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-1 s-2/piu-1/addin-1/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-2 s-2/piu-1/addin-2/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-3 s-2/piu-1/addin-3/CPU/CPUavailable /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-4 s-2/piu-1/addin-4/CPU/CPUavailable 
/cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-5 s-2/piu-1/addin-5/CPU/CPU-
Opti Option on Desc Descri ript ptio ion n USPU-3 USPU-5 EIPU-3 USSR-1 CSUP-1 USUP-1 EITP-1 CSUP-3 USUP-3 USUP-5 EITP-3 cluster
: : : : : : : : : : : :
node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-6 s-2/piu-1/addin-6/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-7 s-2/piu-1/addin-7/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-8 s-2/piu-1/addin-8/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-1 s-2/piu-1/addin-1/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-2 s-2/piu-1/addin-2/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-3 s-2/piu-1/addin-3/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-4 s-2/piu-1/addin-4/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-5 s-2/piu-1/addin-5/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-6 s-2/piu-1/addin-6/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-7 s-2/piu-1/addin-7/C /C node available /cabinet-1/chassi /cabinet-1/chassis-2/piu-1/addin-8 s-2/piu-1/addin-8/C /C cluster available
The output provides information that nodes CFPU-1, CSPU-1, USPU-1, EIPU present in the chassis to be replaced.
3. Check if the current SCLI session is running on a node located in the chassis. If the SCLI session is running on a node (node name identified by the prompt) connectivity to the cluster during chassis replacement. To perform a switchov
set has switchover force managed-object /SSH
Note: The SSH connection breaks when the switchover command is exe
4. Disable cluster manager nodes located in the chassis. To identify the nodes configured as a cluster manager, enter the following co
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
The output indicates that /ClusterHA recovery group has recovery units for C
From the output of step 4, it is observed that one cluster manager node (CFP
a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 nod
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU
b) Check CFPU-1 node where CMF was disabled has CMF-DISABLED statu
command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5
5. Lock all the managed objects in the chassis. To lock all the managed objects, enter the following command:
set has lock managed-object ...
Note: The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following commands:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1
If the nodes are successfully locked, the following output is displayed:
root@CFPU-0 [RNC-1002] > set has lock managed-object /CFPU-1 /CSPU-1 /USP
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)
6. Power off all the managed objects in the chassis. To power off the managed objects, enter the following command:
set has power off managed-object ...
Example
To power off all the managed nodes in chassis-2, enter the following commands:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1 /E
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully
7. Verify that all the managed objects in the chassis to be replaced are powered off. To verify the availability of a managed node, enter the following command:
show has state availability managed-object
Example
To verify that the availability status of all managed nodes in chassis-2 are avail
show has state availability managed-object /CFPU-1 /CSPU-1
The following output must be displayed:
OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
/CSPU-3 POWEROFF
/EIPU-1 POWEROFF
/EIPU-3 POWEROFF
/USPU-1 POWEROFF
/USPU-3 POWEROFF
/USPU-5 POWEROFF
b) Disconnect the power feed cables.
c) Disconnect the network cables from the transceivers on the front side of the mod
d) Disconnect the BCN grounding cable.
e) Keep the network cables attached to the front cable tray of the BCN module.
f) Uninstall the cable tray with attached network cables from the BCN module.
The cable tray is uninstalled by unscrewing the two thumbscrews fixing the cable
mounting flanges to the cabinet might help.
For more information about detaching the cable tray, refer to the document Install
h) Move the cable tray with attached network cables under the module, so the modu
2 Unscrew the two thumbscrews securing the top cover of the BCN module.
The screws are located at the rear side of the module as shown in the figure below. The Phillips screws are built into the top cover of the BCN module and can be loosened either by hand or with a screwdriver.
Figure 11 BCN top cover screws
3 If the BCN module is installed in the cabinet, then:
a) Pull the module out of the cabinet until it locks into the outmost position.
b)
4 Slide the top cover of the module towards the rear side until it stops. Lift the top cover upwards.
Figure 12 Removing the BCN top cover
5 Unscrew the two thumbscrews securing the add-in card to the rails inside the BCN module.
Figure 13 BCN add-in card screws
6 Slide the add-in card upwards to remove it from the BCN module.
Figure 14 Pulling an add-in card out from the BCN module
6.2 Installing an add-in card
6.2.1 Steps
1 Slide the add-in card into the rails inside the BCN module until the pins of the card fall into the connectors of the main board.
Figure 15 Inserting an add-in card into BCN module
2 Secure the add-in card to the rails with the built-in thumbscrews.
The Phillips screws are built into the add-in card and can be tightened either by hand or with a screwdriver.
Figure 16 BCN add-in card screws
3 Place the BCN module’s cover on the top of the module, leaving a small gap between the top cover and the front edge of the module.
Figure 17 Installing BCN top cover
4 Slide the top cover to the front side of the module until it falls into place.
5 If the BCN module is installed in the cabinet, then:
a) Push the module back into the cabinet, until it locks into position.
b) Pull the green latches on the inner sliding rails towards you and slide the BCN module into the cabinet.
6 Tighten the thumbscrews of the top cover.
7 If the BCN module is installed in the cabinet, then:
a) Install the cable tray with attached network cables back to the BCN module. For more information about the cable tray installation, check the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e)
f) Power on the BCN module.
7 Replacing a power distribution unit
Purpose
If the power distribution unit (PDU) is faulty, you must replace it with a new one.
Figure 18 Power distribution units in the cabinet (front view)
Before you start
Make sure you have a digital multimeter or voltage meter available.
Danger of hazardous voltages and electric shock! Before connecting or removing any power supply cables to or from the power distribution unit, make sure that both site power feeds to the power distribution unit are off, the circuit breakers on the front panel of the power distribution unit are in the OFF position, and the equipment is properly earthed (grounded).
Danger of hazardous voltages and electric shock! Make sure your hands are dry and remove any metal objects such as rings before touching the power supply equipment.
Risk of personal injury. Observe the given torque ranges at all times. Incorrect torque can result in damage to equipment, unreliability, and fire hazards due to excessive power dissipation and high temperature of materials.
7.1 Removing a power distribution unit (PDU)
7.1.1 Steps
1 Make sure that the redundant PDU is functional.
2 Switch off the circuit breakers on the PDU you are going to remove.
3 Check the PDU input feeds with a digital multimeter to ensure there are no voltages in the cables.
4 Disconnect all cables from the PDU.
a) Disconnect the four power feed cables from the PDU.
b) Disconnect the CGNDB grounding cable from the PDU.
c) Disconnect the eight PSU input feeds from the PDU.
5 Unscrew the four fixing screws attaching the PDU to the cabinet.
Figure 19 Replacing a PDU (front view)
6 Remove the PDU from the cabinet.
7.2 Installing a power distribution unit (PDU)
7.2.1 Steps
1 Insert the PDU into the cabinet and align the holes of its mounting ear with the cabinet mounting rail.
2 Attach the PDU to the cabinet with four M6x12 screws.
Figure 20 Installing PDU to the cabinet (front view)
3 Connect the PDU grounding cable (CGNDB) to the PDU.
Figure 21 PDU grounding cable (rear view)
4 Check the PDU input feeds with a digital multimeter to ensure there are no voltages in the cables.
5 Connect the site power supply cables to the PDU (for DC power supply only).
6 Connect the site power supply cables to the PDU (for AC power supply only).
7 Connect the eight PSU input feeds to the PDU.
8 Switch on the site power supply to the PDU.
9 Switch on the circuit breakers on the PDU.
8 Replacing a power supply unit
Before you start
Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
8.1 Removing a power supply unit
8.1.1 Steps
1 If there is a PDU used with the BCN module, switch off the circuit breaker on the PDU for the power supply unit to be replaced.
2 Unplug the power cable connected to the power supply unit.
3 Unscrew the two thumbscrews attaching the power supply unit to the BCN.
The Phillips screws are built into the power supply unit and can be loosened either by hand or with a screwdriver.
4 Pull the power supply unit out from the BCN module.
Removing an AC PSU from the BCN module
Removing a DC PSU from the BCN module
8.2 Installing a power supply unit
8.2.1 Steps
1 Insert the power supply unit into its slot at the rear side of the BCN module so the screws built into the unit are on the right-hand side.
2 Tighten the unit’s thumbscrews.
The Phillips screws are built into the power supply unit and can be tightened either by hand or with a screwdriver.
3 Plug the power cable to the power supply unit.
4 Attach the cable clamp to the cable.
5 If there is a PDU used with the BCN module, switch on the circuit breaker on the PDU for the power supply unit that was replaced.
9 Replacing the air filter
Purpose
Inspect the air filter regularly. To prevent dust from accumulating inside the equipment, the filter element should be replaced twice a year.
Steps
1 Unscrew the two thumbscrews attaching the air filter cover to the BCN module.
Figure 22 Unscrewing the two thumbscrews
2 Open the air filter cover and pull out the air filter.
Figure 23 Opening the air filter cover and pulling out the air filter
3 Push the new air filter into the guide rails on both sides of the air filter cover.
4 Push the air filter cover back and fasten the two thumbscrews.
5 Record the date of the air filter change.
10 Dealing with sensor alarms
Symptoms
An alarm about the sensor value of a Field Replacement Unit (FRU) is received. The following is an example of the alarm:
Alarm ID: 2813
Specific problem: 70307 - VOLTAGE OUT OF LIMIT
Managed object: fshwModuleId=addin-5,fshwPIUId=piu1,fshwEquipmentHolderId=chassis-2,fshwEquipmentHolderId=cabinet-1,fsFragmentId=HW,fsClusterId=ClusterRoot
Severity: 2 (critical)
Cleared: no
Clearing: automatic
Acknowledged: no
Ack. user ID: N/A
Ack. time: N/A
Alarm time: 2012-03-12 09:08:29:940 EET
Event type: x5 (equipment)
Application: fshaProcessInstanceName=HPIMonitor,fshaRecoveryUnitName=FSHPIMonitorServer,fsipHostName=CFPU0,fsFragmentId=Nodes,fsFragmentId=HA,fsClusterId=ClusterRoot
IAppl Addl. Info: Unit={BCNOC-A} Position=/chassis-2/slot-5 Sensor={number=218,Name=VDD_QLM3}
Appl. Addl. Info: 0.044
Notification ID: 8422
Extended event type : x1 (raise)
Control indicator: 7 (full visible)
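The sensor name and FRU position are carried in the IAppl Addl. Info field of the alarm. The following Python sketch shows one way to pull them out of that field. The function name `parse_iappl_info` and the regular expression are illustrative assumptions based only on the single example alarm above, so other alarms may need a different pattern. In the examples of this chapter the sensor number 218 corresponds to the hexadecimal sensor address 0xda shown by mch_cli; the LUN still has to be read from the sensor listing.

```python
import re

# Text copied from the "IAppl Addl. Info" field of the example alarm.
IAPPL_INFO = "Unit={BCNOC-A} Position=/chassis-2/slot-5 Sensor={number=218,Name=VDD_QLM3}"


def parse_iappl_info(info):
    """Extract unit, position, sensor number and sensor name (hypothetical helper)."""
    pattern = re.compile(
        r"Unit=\{(?P<unit>[^}]*)\}\s+"
        r"Position=(?P<position>\S+)\s+"
        r"Sensor=\{number=(?P<number>\d+),Name=(?P<name>[^}]*)\}"
    )
    match = pattern.search(info)
    if match is None:
        raise ValueError("unrecognized IAppl Addl. Info format")
    fields = match.groupdict()
    fields["number"] = int(fields["number"])
    return fields


parsed = parse_iappl_info(IAPPL_INFO)
# Sensor number in hexadecimal, as used for the sensor address in this chapter.
sensor_hex = f"0x{parsed['number']:02x}"
print(parsed["position"], parsed["name"], sensor_hex)
```

For the example alarm this prints the position /chassis-2/slot-5, the sensor name VDD_QLM3, and the address 0xda.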
Recovery procedures
1. Determine the sensor name from the Sensor field of the IAppl Addl. Info section of the alarm.
2. Determine the affected FRU from the Position field of the IAppl Addl. Info section of the alarm.
3. Check the sensor data of the FRU in trouble with the help of the sensor name and FRU name. BCN includes several sensors that report on hardware conditions. Many of the sensor readings can be used to diagnose the hardware fault. Follow the steps below to check the sensor data of the FRU in trouble:
a) Check the LMP version. Issue the following command to show the LMP version:
sw_fw_versioninfo
Example:
root@LMP-1-2-1:~# sw_fw_versioninfo
Active U-Boot Version 5.3.0 (in flash 0)
Backup U-Boot Version 5.3.0 (in flash 1)
LMP Version 5.3.0
PCB Version A104-3
LED CPLD Version 05
PCI-LPC bridge XP2 Version 05
VCMC Version 5.3.0
PWR1014 Version 0007
FRUD Version 5.3.0
Part Number C111721.B3B
Add-in Card 1
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 2
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 3
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 4
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 5
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 6
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 7
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
Add-in Card 8
MMC Version 4.2.3
Part Number C111723.A1A
BCNOC-A PCPL Version 0.3.0
BCNOC-A OCTF Version 4.2.3
BCNOC-A FRUD Version 4.2.3
AMC 1
MMC Version 1.10
Part Number C110598.B3A
hdsam-a_ad_frud Version 01.10.0000
AMC 2
PSU info 0
Board Mfg : EMERSON
Board Product : BAFE-B
Board Serial : TR120201616
Board Part Number : C112156.C1A
Board Extra : 1f01
Product Manufacturer : EMERSON
Product Name : DS1200-3-007
Product Part Number : PSU
Product Version : 04
Product Serial : I510JS000H04P
PSU info 1
Board Mfg : EMERSON
Board Product : BAFE-B
Board Serial : TR120201617
Board Part Number : C112156.C1A
Board Extra : 1f01
Product Manufacturer : EMERSON
Product Name : DS1200-3-007
Product Part Number : PSU
Product Version : 04
Product Serial : I510JS000J04P
b) Check sensors.
1. List the hardware sensors. Issue the following command to list all the hardware sensors:
mch_cli ShowSensor
Note: This will list all the sensors, with the Logical Unit Number (LUN) and sensor addresses, attached to a hardware unit.
Example:
root@LMP-1-2-1:~# mch_cli ShowSensor
Entity: Unknown
Hot Swap PSU 1 0x00 0x43
PSU1 IN_Curr 0x00 0x2c
PSU1 Fan 1 0x00 0x28
PSU1 Temp 2 0x00 0x26
PSU1 Temp 1 0x00 0x24
PSU1 Status 0x00 0x22
PSU1 OUT_Curr 0x00 0x20
PSU1 OUT_3V3 0x00 0x1e
PSU1 OUT_12V 0x00 0x1c
PSU1 INPUT 0x00 0x1a
Entity: BAFE-B
Hot Swap PSU 2 0x00 0x44
PSU2 IN_Curr 0x00 0x2d
PSU2 Fan 1 0x00 0x29
PSU2 Temp 2 0x00 0x27
PSU2 Temp 1 0x00 0x25
PSU2 Status 0x00 0x23
PSU2 OUT_Curr 0x00 0x21
PSU2 OUT_3V3 0x00 0x1f
PSU2 OUT_12V 0x00 0x1d
PSU2 INPUT 0x00 0x1b
Entity: BMFU-A
Hot Swap CU 1 0x00 0x45
Fan 2 0x00 0x0a
Fan 1 0x00 0x09
Entity: BMFU-A
Hot Swap CU 2 0x00 0x46
Fan 4 0x00 0x0c
Fan 3 0x00 0x0b
Entity: BAFU-A
Hot Swap CU 3 0x00 0x47
Fan 6 0x00 0x0e
Fan 5 0x00 0x0d
Entity: BCNMB-A
POST Error 0x00 0x48
LMP Reset 0x00 0x3d
SEL status 0x00 0x3c
BMC Watchdog 0x00 0x3b
CLOCK_IRQ 0x00 0x30
Reset Button 0x00 0x2e
VCCA 0x00 0x19
1.0V 0x00 0x18
1.25V_GE 0x00 0x17
1.25V_XG 0x00 0x16
VCC3 0x00 0x15
VCC5 0x00 0x14
12V 0x00 0x13
0.9V 0x00 0x12
1.8V 0x00 0x11
1.1V 0x00 0x10
3.3SB 0x00 0x0f
Inlet3 Temp 0x00 0x08
Inlet2 Temp 0x00 0x07
Outlet Temp 0x00 0x06
Inlet1 Temp 0x00 0x05
BCM56820 Temp 0x00 0x04
BCM56512 Temp 0x00 0x03
LMP Temp 0x00 0x02
PEX Temp 0x00 0x01
InterSrc NewSel 0x00 0xf5
InterSrc Loss 0x00 0xf4
Sync2 NewSel 0x00 0xf3
Sync2 Loss 0x00 0xf2
Sync1 NewSel 0x00 0xf1
Sync1 Loss 0x00 0xf0
External alarm 0x00 0xfe
Entity: Unknown AMC at 00
Hot Swap AMC 1 0x00 0x31
Entity: Unknown AMC at 00
Hot Swap AMC 2 0x00 0x32
Entity: Unknown AMC at 00
Hot Swap AMC 3 0x00 0x33
Entity: Unknown AMC at 00
Hot Swap AMC 4 0x00 0x34
Entity: Unknown AMC at 00
Hot Swap AMC 5 0x00 0x35
Entity: Unknown AMC at 00
Hot Swap AMC 6 0x00 0x36
Entity: Unknown AMC at 00
Hot Swap AMC 7 0x00 0x37
Entity: Unknown AMC at 00
Hot Swap AMC 8 0x00 0x38
Entity: Unknown AMC at 00
Hot Swap AMC 9 0x00 0x39
Entity: Unknown AMC at 00
Hot Swap AMC 10 0x00 0x3a
Entity: CPU 1
RESET_TYPE 0x00 0x5d
BOOT 0x00 0x5c
BOOT_ERROR 0x00 0x5b
VDD_QLM3 0x00 0x5a
VDD_QLM2 0x00 0x59
VDD_QLM1 0x00 0x58
VDD_QLM0 0x00 0x57
VDD_VTT0 0x00 0x56
DDR_VDD 0x00 0x55
VDD_OCORE 0x00 0x54
MON_3VSB 0x00 0x53
MON_12V 0x00 0x52
Tmp421 Temp 0x00 0x51
BMC Watchdog 0x00 0x50
Entity: CPU 2
RESET_TYPE 0x00 0x7d
BOOT 0x00 0x7c
BOOT_ERROR 0x00 0x7b
VDD_QLM3 0x00 0x7a
VDD_QLM2 0x00 0x79
VDD_QLM1 0x00 0x78
VDD_QLM0 0x00 0x77
VDD_VTT0 0x00 0x76
DDR_VDD 0x00 0x75
VDD_OCORE 0x00 0x74
MON_3VSB 0x00 0x73
MON_12V 0x00 0x72
Tmp421 Temp 0x00 0x71
BMC Watchdog 0x00 0x70
Entity: CPU 3
RESET_TYPE 0x00 0x9d
BOOT 0x00 0x9c
BOOT_ERROR 0x00 0x9b
VDD_QLM3 0x00 0x9a
VDD_QLM2 0x00 0x99
VDD_QLM1 0x00 0x98
VDD_QLM0 0x00 0x97
VDD_VTT0 0x00 0x96
DDR_VDD 0x00 0x95
VDD_OCORE 0x00 0x94
MON_3VSB 0x00 0x93
MON_12V 0x00 0x92
Tmp421 Temp 0x00 0x91
BMC Watchdog 0x00 0x90
Entity: CPU 4
RESET_TYPE 0x00 0xbd
BOOT 0x00 0xbc
BOOT_ERROR 0x00 0xbb
VDD_QLM3 0x00 0xba
VDD_QLM2 0x00 0xb9
VDD_QLM1 0x00 0xb8
VDD_QLM0 0x00 0xb7
VDD_VTT0 0x00 0xb6
DDR_VDD 0x00 0xb5
VDD_OCORE 0x00 0xb4
MON_3VSB 0x00 0xb3
MON_12V 0x00 0xb2
Tmp421 Temp 0x00 0xb1
BMC Watchdog 0x00 0xb0
Entity: CPU 5
RESET_TYPE 0x00 0xdd
BOOT 0x00 0xdc
BOOT_ERROR 0x00 0xdb
VDD_QLM3 0x00 0xda
VDD_QLM2 0x00 0xd9
VDD_QLM1 0x00 0xd8
VDD_QLM0 0x00 0xd7
VDD_VTT0 0x00 0xd6
DDR_VDD 0x00 0xd5
VDD_OCORE 0x00 0xd4
MON_3VSB 0x00 0xd3
MON_12V 0x00 0xd2
Tmp421 Temp 0x00 0xd1
BMC Watchdog 0x00 0xd0
Entity: CPU 6
RESET_TYPE 0x01 0x0d
BOOT 0x01 0x0c
BOOT_ERROR 0x01 0x0b
VDD_QLM3 0x01 0x0a
VDD_QLM2 0x01 0x09
VDD_QLM1 0x01 0x08
VDD_QLM0 0x01 0x07
VDD_VTT0 0x01 0x06
DDR_VDD 0x01 0x05
VDD_OCORE 0x01 0x04
MON_3VSB 0x01 0x03
MON_12V 0x01 0x02
Tmp421 Temp 0x01 0x01
BMC Watchdog 0x01 0x00
Entity: CPU 7
RESET_TYPE 0x01 0x2d
BOOT 0x01 0x2c
BOOT_ERROR 0x01 0x2b
VDD_QLM3 0x01 0x2a
VDD_QLM2 0x01 0x29
VDD_QLM1 0x01 0x28
VDD_QLM0 0x01 0x27
VDD_VTT0 0x01 0x26
DDR_VDD 0x01 0x25
VDD_OCORE 0x01 0x24
MON_3VSB 0x01 0x23
MON_12V 0x01 0x22
Tmp421 Temp 0x01 0x21
BMC Watchdog 0x01 0x20
Entity: CPU 8
RESET_TYPE 0x01 0x4d
BOOT 0x01 0x4c
BOOT_ERROR 0x01 0x4b
VDD_QLM3 0x01 0x4a
VDD_QLM2 0x01 0x49
VDD_QLM1 0x01 0x48
VDD_QLM0 0x01 0x47
VDD_VTT0 0x01 0x46
DDR_VDD 0x01 0x45
VDD_OCORE 0x01 0x44
MON_3VSB 0x01 0x43
MON_12V 0x01 0x42
Tmp421 Temp 0x01 0x41
BMC Watchdog 0x01 0x40
Entity: AMC 1
Version change 0x01 0x68
DC/DC Failure 0x01 0x67
MMC Temp 0x01 0x66
HDD Temp 0x01 0x65
+5V Backend 0x01 0x64
+12V Backend 0x01 0x63
+3.3V MP 0x01 0x62
+12V Payload 0x01 0x61
Hot Swap 0x01 0x60
2. Get the hardware sensor threshold. Find the sensor address of the sensor you want and issue the following command to get the sensor threshold of the hardware unit:
mch_cli GetSensorThreshold
Example:
root@LMP-1-2-1:~# mch_cli GetSensorThreshold 0x00 0xda
sensor VDD_QLM3 (218)
Lower Non-Critical : NA
Lower Critical : NA
Lower Non-Recoverable : 1.1720
Upper Non-Critical : NA
Upper Critical : NA
Upper Non-Recoverable : 1.2920
support thresholds: lnr unr
3. Get the hardware sensor data. Issue the following command to get the sensor data of the hardware unit you want:
mch_cli ReadSensor
Example:
root@LMP-1-2-1:~# mch_cli ReadSensor 0x00 0xda
Sensor: VDD_QLM3
Lun: 0x00
Number: 0xda
Value: 1.236000
4. If the sensor value is not within the Lower and Upper threshold values, try to replace the FRU in trouble to correct the sensor value. See chapter Replacing hardware units for details.
5. If the problem persists even after following the instructions, contact your local Nokia Solutions and Networks representative with your observations on the sensor values.
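The comparison in step 4 can be sketched in Python as follows. The script works on text copied from the GetSensorThreshold and ReadSensor examples above; the helper names (`parse_thresholds`, `parse_value`, `within_limits`) are hypothetical, and treating the limits as inclusive is an assumption, since the source does not say how a reading exactly on a threshold is classified. Thresholds reported as NA are ignored.

```python
import re

# Output text copied from the GetSensorThreshold example above.
THRESHOLD_OUTPUT = """\
sensor VDD_QLM3 (218)
Lower Non-Critical : NA
Lower Critical : NA
Lower Non-Recoverable : 1.1720
Upper Non-Critical : NA
Upper Critical : NA
Upper Non-Recoverable : 1.2920
support thresholds: lnr unr
"""

# Output text copied from the ReadSensor example above.
READ_OUTPUT = """\
Sensor: VDD_QLM3
Lun: 0x00
Number: 0xda
Value: 1.236000
"""


def parse_thresholds(text):
    """Return {threshold name: float} for every threshold that is not 'NA'."""
    thresholds = {}
    for line in text.splitlines():
        match = re.match(r"\s*((?:Lower|Upper) [\w-]+)\s*:\s*(\S+)", line)
        if match and match.group(2) != "NA":
            thresholds[match.group(1)] = float(match.group(2))
    return thresholds


def parse_value(text):
    """Return the numeric reading from the 'Value:' line."""
    match = re.search(r"^Value:\s*(\S+)", text, re.MULTILINE)
    if match is None:
        raise ValueError("no Value line found")
    return float(match.group(1))


def within_limits(value, thresholds):
    """True if the value lies between all defined lower and upper thresholds."""
    lowers = [v for k, v in thresholds.items() if k.startswith("Lower")]
    uppers = [v for k, v in thresholds.items() if k.startswith("Upper")]
    return all(value >= low for low in lowers) and all(value <= up for up in uppers)


thresholds = parse_thresholds(THRESHOLD_OUTPUT)
value = parse_value(READ_OUTPUT)
print(within_limits(value, thresholds))
```

With the example data the reading 1.236 lies between 1.1720 and 1.2920, so the check prints True and no FRU replacement is indicated by this sensor.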
Position of the PSUs and fan trays
The following picture shows the positions of the PSUs and fan trays.
Figure 24 Positions of the PSUs and fan trays
11 Communication between active and standby units in a BCN cluster fails
11.1 Description
A prolonged communication failure between active and standby units in a box controller node (BCN) cluster causes a split-brain situation. If the cluster internal network connection between BCN modules fails, the cluster may get partitioned into two independent parts, which attempt to provide the same services. As a result, the BCN modules do not function properly. Possible causes for the problem are:
• Improper cabling between the BCN boxes.
• Tampering of cables connecting the BCN boxes.
• Incorrect switch configurations.
• Malfunctioning of hardware or embedded software.
11.2 Symptoms
Improper handling of the hardware might lead to a scenario where two isolated parts of the BCN cluster are running and trying to provide the same services. In this split-brain situation, the following problems might occur:
• Storage resources replicated using Distributed Replicated Block Device (DRBD) get updated independently on both sides.
• As both CLA nodes run an independent instance of the cluster management software, nodes may get reset continuously because they can communicate only with one management software at a time.
• External IP addresses will be assigned to both units, which causes IP address conflicts and various communication errors.
• The CLA nodes might reboot continuously.
11.3 Recovery procedures
1 Power off all the BCN modules.
2 Power on one of the CLA node BCN modules.
3 Wait till the BCN module starts.
If no console connection is available, wait for 3 minutes.
4 Power on the remaining BCN modules.
If the CLA node is up and running in the powered-on BCN module, power on the remaining BCN modules. This will overwrite the disk devices of the last activated unit with the copies of the unit that was started up first.
5 Verify that all the nodes and services are up and running.
To verify that the split-brain situation is over and all the services are up and running, enter the following command:
show has summary
Note: The example displays a 2 BCN configuration. If there are more BCNs, the display is different.
_nokadmin@CFPU-0 [RNC-37] > show has summary
CFPU-0@RNC-37 [2013-05-09 10:07:57 +0200]
Node status
Nodes in configuration : 16
Unlocked nodes : 16
RG status
RGs in configuration : 68
Unlocked RGs : 68
RU status
RUs in configuration : 274
Unlocked RUs : 274
Process status
Processes in configuration : 1482
Unlocked processes : 1482
If there are differences between Nodes in configuration and Unlocked nodes, then split-brain is still active.
Note: There still might be differences in Node status even if split-brain is over. Check if there are nodes that are locked. If yes, unlock them and try once again.
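The comparison of configured and unlocked counts can be sketched in Python as follows. The script parses text copied from the show has summary example above; the helper names `parse_summary` and `mismatches` are hypothetical, and the parsing rules are assumptions derived from this one example, so a different BCN configuration or software release may format the summary differently.

```python
import re

# Output text copied from the "show has summary" example above.
SUMMARY_OUTPUT = """\
Node status
Nodes in configuration : 16
Unlocked nodes : 16
RG status
RGs in configuration : 68
Unlocked RGs : 68
RU status
RUs in configuration : 274
Unlocked RUs : 274
Process status
Processes in configuration : 1482
Unlocked processes : 1482
"""


def parse_summary(text):
    """Return {section: {'configured': n, 'unlocked': m}} from the summary text."""
    counts = {}
    section = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = re.fullmatch(r"(\w+) status", line)
        if header:
            section = header.group(1)
            counts[section] = {}
            continue
        pair = re.fullmatch(r"(.+?)\s*:\s*(\d+)", line)
        if pair and section is not None:
            label = pair.group(1).lower()
            key = "unlocked" if label.startswith("unlocked") else "configured"
            counts[section][key] = int(pair.group(2))
    return counts


def mismatches(counts):
    """List the sections whose configured and unlocked counts differ."""
    return [name for name, c in counts.items()
            if c.get("configured") != c.get("unlocked")]


counts = parse_summary(SUMMARY_OUTPUT)
print(mismatches(counts))
```

An empty list means every section shows equal counts. A mismatch in the Node section is the case the text above associates with an active split-brain situation or with locked nodes.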