Case Study
HOW TO DEFINE THE PARAMETERS AND MAKE THE CONFIGURATION OF
SERVICE DESK MANAGEMENT SYSTEM IN A MANAGED SERVICES PROJECT
Case Study Version: 1.0
2015-02-03
Huawei Proprietary – For Internal Use Only
Preface
This document gives details to support the configuration of one of the main tools in a Managed Services Project: SDM (Service Desk Management), named MOS7100 in older versions. The tool is used to document all the Network Operations executed by Huawei employees, Huawei subcontractors and the customer's third-party suppliers. The configuration should include all the parameters needed to prepare any report and classification requested by the customer and by Huawei internal regulations. According to Managed Services Projects and Procedures, there are three main types of Network Operations:
Fault Management.
Performance Management.
Preventive Routines Management.
All these operations are regulated by the Contract between the Huawei Solutions Sales Department and the Customer, which specifies the conditions and responsibilities of Huawei and the customer during the Contract Period. One of the main conditions of the contract is the Service Level Agreement (SLA). It contains the parameters that define the classification of all events by service impact, response times, resolution times, routine frequency, agreed network performance parameters and any other service indicator that the customer requests and Huawei accepts. These contract parameters must be integrated into the Service Desk Management tool, and our objective is to provide a guide on how to define them.
Version 1.0
Date January 30th, 2015
Author Javier Segura
Approver Edwin Salazar
Remarks Final
Contents
Preface
Project Introduction
    Infrastructure Failures
    Access and Transmission Failures
    Radiant System Failures
    Core Failures
Definitions in Service Desk Management
    Fault Level
    Fault Occur Time
    Fault Recovery Time
    Responsible of the Failure (Vendor)
    Root Cause and Sub Cause
Other important parameters for SDM
    Users organization and distribution
    Sites information
    Rules for notifications
    SLA (Service Level Agreement)
Conclusions and Advantages of SDM Full Configured
Project Introduction
On December 31, 2013 the Managed Services Contract of Telefonica Costa Rica expired. Several conditions changed with the 2014 negotiation, so several parameters of our Service Desk Management and of the reports to the customer had to change to satisfy the new Scope of Work that Telefonica requested. According to the new contract, network faults must be divided into four groups:
Infrastructure Failures.
Access and Transmission Failures.
Radiant System Failures.
Core Failures.
Infrastructure Failures
This category includes all faults that involve power equipment, sensors, cabinets, air conditioning, physical problems with site security or access, temperature issues and any other problem not related to telecom equipment.
Access and Transmission Failures
Faults affecting the transmission or base station equipment. These relate to equipment boards and components, optical fiber or any other network cable inside the cabinet.
Radiant System Failures
This category covers failures that affect components between the cabinets and the antennas on the tower, for instance RRUs, jumpers, ODUs, IF or RF cables, and RF and microwave antennas.
After this review, the next step is to design how to separate these groups in Service Desk Management and how to classify the internal categories of each failure type. This was defined according to our criteria and approved by the customer, in order to agree on the content of all the service reports.
Core Failures
The Core Room of the network hosts the main equipment that provides all the services to the end users. Any failure in this equipment is included in this category.
Definitions in Service Desk Management
Several parameters in Service Desk Management must be defined in relation to the project. We have to analyze them one by one and decide how each is used in the reports to classify and separate the failures. This allows us to create different reports and classifications depending on what we need to analyze in the project. For the Telefonica Costa Rica Managed Services Project, after several meetings with the customer's engineers, we included the following parameters to analyze the project and network status and, as the main purpose, to check SLA compliance weekly:
Fault Level
Fault Occur Time
Fault Recovery Time
Responsible of the Failure (Vendor)
Root Cause
Sub Cause
Fault Level
This parameter defines the impact of the failure on the network. Telefonica's Regional Standards define 5 Fault Levels:

Fault Level Name (Telefonica Standard)   Impact in the Network   Description
SF1                                      Critical                Service impact of 80 or more sites.
SF2                                      Critical                Service impact from 16 to 79 sites.
SF3                                      Critical                Service impact from 11 to 15 sites.
SF4                                      Major                   Service impact from 1 cell to 10 sites.
SSF                                      Minor                   Failure with no service impact.
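The thresholds in the table can be read as a simple decision rule. The following is a minimal sketch in Python, not part of SDM itself: the function name and its arguments are hypothetical, and it assumes that any service-affecting failure below 11 sites (including a single cell) is counted as SF4.

```python
def telefonica_fault_level(sites_affected: int, service_impact: bool = True):
    """Return (fault level, impact name) using the thresholds from the table.

    Assumption: a failure that affects only one cell is passed in as a
    service-affecting failure with sites_affected below 11, so it maps to SF4.
    """
    if not service_impact:
        return ("SSF", "Minor")
    if sites_affected >= 80:
        return ("SF1", "Critical")
    if sites_affected >= 16:
        return ("SF2", "Critical")
    if sites_affected >= 11:
        return ("SF3", "Critical")
    return ("SF4", "Major")


# Example: 12 sites out of service falls in the 11 to 15 band.
level, impact = telefonica_fault_level(12)
```

Keeping the boundaries in one function makes it easy to review them against the contract whenever the customer standard changes.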
With this parameter we can classify failures by their impact on the network but, as indicated in the Project Introduction, we still have to divide them into Infrastructure, Access and Transmission, Radiant System and Core. For this we also divide the network impact by type of failure, which results in 20 different Fault Levels:

Fault Level in SDM
SF1 - Acceso y Tx
SF2 - Acceso y Tx
SF3 - Acceso y Tx
SF4 - Acceso y Tx
SSF - Acceso y Tx
SF1 - Infraestructura
SF2 - Infraestructura
SF3 - Infraestructura
SF4 - Infraestructura
SSF - Infraestructura
SF1 - Sist. Radiante
SF2 - Sist. Radiante
SF3 - Sist. Radiante
SF4 - Sist. Radiante
SSF - Sist. Radiante
SF1 - Core
SF2 - Core
SF3 - Core
SF4 - Core
SSF - Core

The customer communicates in Spanish, so all parameters are in that language.
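The 20 values are the combination of the 5 impact levels with the 4 failure categories. As an illustration only (SDM itself takes these values through its own configuration), a short Python sketch can generate the list and split a value back into its two parts; the function names are hypothetical.

```python
from itertools import product

IMPACT_LEVELS = ["SF1", "SF2", "SF3", "SF4", "SSF"]
CATEGORIES = ["Acceso y Tx", "Infraestructura", "Sist. Radiante", "Core"]


def sdm_fault_levels():
    """Build every 'level - category' name, grouped by category."""
    return [
        f"{level} - {category}"
        for category, level in product(CATEGORIES, IMPACT_LEVELS)
    ]


def split_fault_level(name):
    """Split an SDM Fault Level name into (impact level, category)."""
    level, category = name.split(" - ", 1)
    return level, category


FAULT_LEVELS = sdm_fault_levels()
```

Generating the list instead of typing it by hand avoids spelling differences between values, which would otherwise split one category into two in the reports.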
Fault Occur Time
The time when the alarm appears in the EMS (Element Management System). Monitoring Engineers copy this time from the EMS into the ticket in SDM.
Fault Recovery Time
The time when the failure is resolved. It is also obtained from the EMS.
Responsible of the Failure (Vendor)
Here we record the company responsible for the failure. This field is very important because it specifies whether a case is attributable to Huawei or not. For Telefonica Costa Rica, we must include all of these:

Responsibility of Failures          Description
Optical Supplier A                  Optical transmission supplier of Telefonica.
Compañía Electrica                  Power supplier.
Huawei                              Faults caused by us or our equipment.
Telefonica / Telefonica Empresas    Failures attributable to Telefonica or Telefonica Enterprise Customers Department.
Vandalismo                          Vandalism cases.
Global Crossing / TIWS              International service provider of Telefonica.
Telecom Supplier XXX                Failures caused by our competitors.
Tower Companies                     These companies are the tower owners. They rent the site land and build the tower structures, so Telefonica and other operators can rent tower space to install their antennas. They are responsible for all site access and construction issues.
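To show why this field matters for reporting, here is a small Python sketch that counts tickets per responsible company and computes the share attributable to Huawei. The ticket structure (a dictionary with a `vendor` key), the ticket IDs and the function name are assumptions made for this example, not the SDM export format.

```python
from collections import Counter

HUAWEI = "Huawei"


def attribution_summary(tickets):
    """Count tickets per responsible vendor and return Huawei's share.

    `tickets` is an iterable of dicts with a 'vendor' key (hypothetical
    structure chosen for this example).
    """
    by_vendor = Counter(ticket["vendor"] for ticket in tickets)
    total = sum(by_vendor.values())
    huawei_share = by_vendor[HUAWEI] / total if total else 0.0
    return by_vendor, huawei_share


# Illustrative data only.
tickets = [
    {"id": "TT-1", "vendor": "Huawei"},
    {"id": "TT-2", "vendor": "Compañía Electrica"},
    {"id": "TT-3", "vendor": "Vandalismo"},
    {"id": "TT-4", "vendor": "Huawei"},
]
by_vendor, huawei_share = attribution_summary(tickets)
```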
Root Cause and Sub Cause
These parameters were included to describe network failures in more depth, and they are also used to separate the Infrastructure, Access and Tx, and Radiant System cases. The customer reviewed them very strictly in order to classify failures as precisely as possible. In the following table we can see the root causes and sub causes included for this project:
Other important parameters for SDM
There is more information we need to take into consideration for a proper configuration of SDM:
Users organization and distribution
Sites information
Rules for Notifications
SLA (Service Level Agreement)
Users organization and distribution
To create all project users in SDM, it is very important to analyze the project staff organization and to distribute the users in the same way as the project. According to the Managed Services Contract, the project staff is organized in this way:
O&M Manager (Project Manager)
    o NOC Manager
        Back Office Engineers
        Front Office Engineers
    o FLM Manager
        Field Maintenance Engineers
We must create the Project Organization in the system:
In that image we can see the internal Project Organization, but in most cases we also need to give access to the customer, because they always want to check cases by themselves. For our project, the customer asked for more than access: they asked to receive SMS/Email notifications from MOS7100 and/or SDM, so we needed to create their personal users and also define some groups according to the Fault Levels of the failures and the people included in each notification:
The instructions to create the users in the system are very clear. We just need to follow the content of the Import Template documents, and all the users can be loaded with one or more Excel files.
Sites information
One of the most important parameters when creating tickets is the site information. It includes the Site ID, Site Name, coordinates, region, maintainer and any other key information we want to provide for Front Office, Back Office or Field Maintenance Engineers. If SDM is connected to WFM (Work Force Management System), this is very useful because the FME can see all the site information on his mobile phone. Before loading the sites, we need to fill in all the required fields in the Resource Parameters, where we define most of the information used in the ticket workflow modules:
Each parameter has its own import template file, which means that once the values are defined we just enter them in the Excel file and import it into the system. One of the most important site fields is the Region, because it determines who is in charge of maintaining the site. For the Costa Rica Project, we classify the sites in 3 Parent Regions and 13 Child Regions:
Each Child Region corresponds to one Field Maintenance Team.
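The relation between a site, its regions and the responsible team can be pictured with a small data structure. This Python sketch is illustrative only: the field names follow the site information listed earlier, while the region names, team names and site values are placeholders, since the real ones are not reproduced in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    site_id: str
    name: str
    latitude: float
    longitude: float
    parent_region: str
    child_region: str


# Placeholder mapping: one Field Maintenance Team per Child Region.
TEAM_BY_CHILD_REGION = {
    "Child Region 1": "Field Maintenance Team 1",
    "Child Region 2": "Field Maintenance Team 2",
}


def maintenance_team(site: Site) -> str:
    """Return the team in charge of a site, based on its Child Region."""
    try:
        return TEAM_BY_CHILD_REGION[site.child_region]
    except KeyError:
        raise ValueError(
            f"site {site.site_id} has unknown child region {site.child_region!r}"
        ) from None


# Hypothetical site record.
example_site = Site("S001", "Example Site", 9.93, -84.08,
                    "Parent Region 1", "Child Region 1")
team = maintenance_team(example_site)
```

Rejecting an unknown region at load time is a design choice for the sketch: a site without a valid Child Region would otherwise produce tickets that no team receives.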
For every new site integrated into the project, we first need to classify it by region, by site type according to the contract classification, and by any other parameter we need.
Rules for notifications
SDM can send SMS and Email alerts whenever we need them, so we can configure it to send one or both automatically when the SDM operator submits the information. This is very specific work, because we have to configure each rule according to our needs. To avoid making this document too long, we take as an example the notification for Critical TT Creation. The purpose of this rule is to send an SMS and an Email every time a Front Office Engineer submits a critical TT to SDM. The configuration steps are the following:
1. Set the moment when the rule has to be activated:
2. Set the condition that activates the rule. In our example, this is any time Front Office creates a TT with an SF1, SF2 or SF3 fault level, in all categories:
3. Set the content of the messages we want to send. It is better to include two separate actions, one for Email and one for SMS:
4. After all the actions are included, save the changes. The last step is to configure the name of the rule. We can include all the rules we need; in our project we use some rules for internal notifications and others for the customer. This has to be used very carefully, because we need to review the content we send to the customer in order to protect the information security of the project.
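The four steps above (trigger moment, condition, actions, rule name) can be summarized in a short Python sketch. It only models the logic of the Critical TT Creation example. The class, the event name `tt_submitted`, the ticket fields and the message templates are all assumptions for illustration and do not reproduce the SDM rule editor.

```python
from dataclasses import dataclass
from typing import Callable

CRITICAL_LEVELS = {"SF1", "SF2", "SF3"}


@dataclass
class Rule:
    name: str                          # step 4: name of the rule
    trigger: str                       # step 1: moment that activates the rule
    condition: Callable[[dict], bool]  # step 2: condition on the ticket
    actions: list                      # step 3: (channel, message template) pairs


def is_critical_tt(ticket: dict) -> bool:
    """True for SF1, SF2 or SF3 in any category, e.g. 'SF2 - Core'."""
    level = ticket["fault_level"].split(" - ", 1)[0]
    return level in CRITICAL_LEVELS


def evaluate(rule: Rule, event: str, ticket: dict) -> list:
    """Return the (channel, message) pairs to send, or [] if the rule does not fire."""
    if event != rule.trigger or not rule.condition(ticket):
        return []
    return [(channel, template.format(**ticket)) for channel, template in rule.actions]


critical_tt_rule = Rule(
    name="Critical TT Creation",
    trigger="tt_submitted",
    condition=is_critical_tt,
    actions=[
        ("sms", "Critical TT {id}: {fault_level}"),
        ("email", "Critical TT {id} created for site {site}. Fault level: {fault_level}."),
    ],
)

messages = evaluate(
    critical_tt_rule,
    "tt_submitted",
    {"id": "TT-1", "fault_level": "SF2 - Core", "site": "S001"},
)
```

The SMS and Email actions are kept separate, as recommended in step 3, so each channel can carry a message of a suitable length.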
SLA (Service Level Agreement)
Configuring the SLA in the system helps to organize all the teams to provide a better service to the customer. With it we can send notifications about the SLA time limits during a failure and show colors in the tickets that indicate the SLA status. There are some concepts we need to know to make the configuration:
Measure Item: the formula we want to calculate. It is always a subtraction of times; the most common is "Required Finish Time", which is Fault Recovery Time - Fault Occur Time.
Target: the commitments we want to meet in the project, with all the conditions we need. For the Telefonica Costa Rica case, we had to configure only the Required Finish Time for our types of failures. Only Radiant System has 24 hours; the rest must be recovered in 3 hours.
Milestone: a point in time we want to mark before the SLA time runs out. In our project we set it at 50% of the SLA Remaining Time.
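As a worked example of these three concepts, the following Python sketch computes the measure item, the target deadline and the milestone for a ticket. It is a sketch under stated assumptions: the function names are hypothetical, the category is read from the SDM Fault Level name, and the milestone is interpreted as the moment when half of the SLA time has elapsed.

```python
from datetime import datetime, timedelta

# Targets from the text: Radiant System has 24 hours, everything else 3 hours.
TARGET_BY_CATEGORY = {"Sist. Radiante": timedelta(hours=24)}
DEFAULT_TARGET = timedelta(hours=3)
MILESTONE_FRACTION = 0.5  # assumed reading of "50% of the SLA Remaining Time"


def required_finish_time(occur: datetime, recovery: datetime) -> timedelta:
    """Measure item: Fault Recovery Time - Fault Occur Time."""
    return recovery - occur


def sla_times(fault_level: str, occur: datetime):
    """Return (milestone time, target time) for a name like 'SF2 - Core'."""
    category = fault_level.split(" - ", 1)[1]
    target = TARGET_BY_CATEGORY.get(category, DEFAULT_TARGET)
    return occur + target * MILESTONE_FRACTION, occur + target


occur = datetime(2015, 1, 30, 10, 0)
milestone_at, target_at = sla_times("SF2 - Core", occur)
# For this Core failure the milestone is at 11:30 and the target at 13:00.
```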
As specified at the beginning, we separate our types of failures with the Fault Level. This means that to configure the SLA we only had to create conditions based on the Fault Level to set the SLA time for the Trouble Ticket:
When we configure the conditions, we can include notifications for the Milestone and the Target. This ensures that the staff assigned to solve the failure and the related project staff receive the SLA status of the case when half of the time remains and again when the SLA expires. Currently we have SMS and Email for this. The difference between them is the size of the message: SMS carries less information than Email.
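The status that drives these notifications can be sketched as a small function of the current time. This is an illustration, not SDM behavior: the status names and the color mapping are hypothetical, and the milestone and target times are assumed to have been calculated beforehand from the SLA conditions.

```python
from datetime import datetime
from typing import Optional

# Hypothetical colors for the ticket list; the real values are set in SDM.
STATUS_COLOR = {
    "on_track": "green",
    "milestone": "yellow",
    "breached": "red",
    "met": "green",
}


def sla_status(now: datetime, milestone_at: datetime, target_at: datetime,
               recovered_at: Optional[datetime] = None) -> str:
    """Return the SLA status of a ticket at time `now`."""
    if recovered_at is not None:
        # Closed ticket: compare the recovery time with the target.
        return "met" if recovered_at <= target_at else "breached"
    if now >= target_at:
        return "breached"   # Target notification is due
    if now >= milestone_at:
        return "milestone"  # Milestone notification is due
    return "on_track"


milestone_at = datetime(2015, 1, 30, 11, 30)
target_at = datetime(2015, 1, 30, 13, 0)
status = sla_status(datetime(2015, 1, 30, 12, 0), milestone_at, target_at)
color = STATUS_COLOR[status]
```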
Conclusions and Advantages of SDM Full Configured
With SDM or MOS7100 configured properly, we can obtain several kinds of reports and statistics about the project. We can also review the performance of all project staff and their attention to project activities, and they can receive notifications about what they need to do to keep the project and the network at the best possible KPIs. In the customer relationship, these tools help to increase customer confidence in Huawei, because SDM is a formal tool that the customer can access at any time to track the project activities in full and to check the service and contract status, verifying whether we are meeting the agreed terms or not.