The 2008 Handbook of Application Delivery
A Guide to Decision Making
By Dr. Jim Metzler

Platinum Sponsors

Kubernan Guiding Innovation
www.kubernan.com

Application Delivery Handbook | February 2008

The Handbook of Application Delivery
IT Innovation Report
Contents

Executive Summary
Introduction
The Applications Environment
Planning
Network and Application Optimization
Managed Service Providers
Management
The Changing Network Management Function
Control
Conclusion
Bibliography
Interviewees
Appendix - Advertorials

Published By Kubernan
www.Kubernan.com

Cofounders
Jim Metzler
[email protected]
Steven Taylor
[email protected]

Design/Layout Artist
Debi Vozikis

Copyright © 2008 Kubernan

For Editorial and Sponsorship Information Contact Jim Metzler or Steven Taylor

Kubernan is an analyst and consulting joint venture of Steven Taylor and Jim Metzler.

Professional Opinions Disclaimer
All information presented and opinions expressed in this IT Innovation Report represent the current opinions of the author(s) based on professional judgment and best available information at the time of the presentation. Consequently, the information is subject to change, and no liability for advice presented is assumed. Ultimate responsibility for choice of appropriate solutions remains with the reader.
Executive Summary

We are just ending the first phase of a fundamental transformation of the IT organization. At the beginning of this transformation, virtually all IT organizations were comprised of myriad stove-piped functions; e.g., devices, networks, servers, storage, databases, security, operating systems. A major component of the transformation is that leading edge IT organizations are now creating an environment that is characterized by the realization that IT is comprised of just two functions, application development and application delivery, and that these functions must work in an integrated fashion in order for the IT organization to ensure acceptable application performance. This view of IT affects everything – including the organizational structure, the management metrics, the requisite processes, technologies and tools. One of the primary goals of this handbook is to help IT organizations plan for that transformation. As described in the handbook, the activities that comprise a successful application delivery function are planning, optimizing, managing and controlling application performance. Each of these activities is challenging today and will become more challenging over the next few years. As described in Chapters 2 and 3, part of the increased challenge will come from the deployment of new application development paradigms such as SOA (Services Oriented Architecture), Rich Internet Architecture and Web 2.0. Also adding to the difficulty of ensuring acceptable application performance is the increased management complexity associated with the burgeoning deployment of the virtualization of IT resources (i.e., desktops, servers, storage, applications), the growing impact of wireless communications, the need to provide increasing levels of security as well as emerging trends such as storage optimization.

Chapter 4 of this handbook discusses planning. As that chapter points out, in most companies the focus of application development is on ensuring that applications are developed on time, on budget, and with few security vulnerabilities. That narrow focus, combined with the fact that application development has historically been done over a high-speed, low-latency LAN, means that the impact of the WAN on the performance of the application is generally not known until after the application is fully developed and deployed. In addition, most IT organizations do not know the impact that a major change, such as consolidating data centers, will have until after the initiative is fully implemented.
As a result, IT organizations are left to
react to application and infrastructure issues typically only after they have impacted the user. Chapter 4 discusses techniques such as WAN emulation, baselining and predeployment assessments that IT organizations can use to identify and eliminate issues prior to their impacting users and identifies criteria that IT organizations can use to choose appropriate tools. Chapter 5 discusses two classes of network and application optimization solutions. One class focuses on the negative effect of the WAN on application performance. This category is referred to alternatively as a WAN optimization controller (WOC) or a Branch Office Optimization Solution. Branch Office Optimization Solutions are often referred to as symmetric solutions because they typically require an appliance in both the data center as well as the branch office. Some vendors, however, have implemented solutions that call for an appliance in the data center, but instead of requiring an appliance in the branch office only requires software on the user’s computer. This class of solution is often referred to as a software only solution and is most appropriate for individual users or small offices. Chapter 5 contains an extensive set of criteria that IT organizations can use to choose a Branch Office Optimization Solution. The second class of solution discussed in Chapter 5 is often referred to as an Application Front End (AFE) or Application Device Controller (ADC). This solution is typically referred to as being an asymmetric solution because an appliance is only required in the data center and not the
branch office. The primary role of the AFE is to offload computationally intensive tasks, such as the processing of SSL traffic, from a server farm. Chapter 5 also contains an extensive set of criteria that IT organizations can use to choose an AFE.

Today most IT organizations that have deployed a network and application optimization solution have done so in a do-it-yourself (DIY) fashion. Chapter 6 describes another alternative – the use of a managed service provider (MSP) for application delivery services. MSPs are not new. For example, in the early to mid 1990s, many IT organizations began to acquire managed frame relay services from an MSP as an alternative to building and managing a frame relay network themselves. In most cases, the IT organization was quite capable of building and managing the frame relay network, but chose not to do so in order to focus its attention on other activities or to reduce cost. Part of the appeal of using an MSP for application delivery is that in many instances MSPs have expertise across all of the components of application delivery (planning, optimization, management and control) that the IT organization does not possess. As a result, these MSPs can provide functionality that the IT organization on its own could not provide. As is described in Chapter 6, there are two distinct classes of application delivery MSPs that differ primarily in terms of how they approach the optimization component of application delivery. One class of application delivery MSP provides site-based services that are similar to the current DIY approach used by most IT organizations. The other class of application delivery MSP adds intelligence to the Internet to allow it to support production applications. Chapter 6 contains an extensive set of criteria that IT organizations can use to determine if one of these services would add value.

As part of the research that went into the creation of this handbook, the CIO of a government organization was interviewed. He stated that in his organization it is common to have the end user notice application degradation before the IT function does and that this results in IT looking like "a bunch of bumbling idiots." Chapter 7 discusses this issue as well as some of the organizational issues that impact successful application delivery, including the lack of effective processes as well as the adversarial relationship that often exists between the application development organization and the network organization. Chapter 7 also discusses the fact that most IT organizations are blind to the growing number of applications that use port 80 and describes a number of management techniques that IT organizations can use to avoid the "bumbling idiot syndrome". This includes discovery, end-to-end visibility, network analytics and route analytics. The chapter identifies criteria that IT organizations can use to choose appropriate solutions and also includes some specific suggestions for how IT organizations can manage VoIP.

Chapter 8 examines the attempt on the part of many Network Operations Centers (NOCs) to improve their processes, and highlights the shift that most NOCs are taking from where they focus almost exclusively on the availability of networks to where they are beginning to also focus on the performance of networks and applications. Included in the chapter is a discussion of the factors that are driving the NOC to change as well as the factors that are inhibiting the NOC from being able to change. Chapter 8 details how the approach that most IT organizations take to reducing the mean time to repair has to be modified now that the NOC is gaining responsibility for application performance, and the chapter also examines the myriad techniques that IT organizations use to justify an investment in performance management. Chapter 8 concludes with the observation that, given where NOC personnel spend their time, the NOC should be renamed the Application Operations Center (AOC).

Chapter 9 examines the type of control functionality that IT organizations should implement in order to ensure acceptable application performance. This includes route control as a way to impact the path that traffic takes as it
transits an IP network. The chapter also describes a process for implementing QoS and summarizes the status of current QoS implementations.

Chapter 9 makes the assertion that firewalls are typically placed at a point where all WAN access for a given site coalesces and that this is the logical place for a policy and security control point for the WAN. Unfortunately, because traditional firewalls cannot provide the necessary security functionality, IT organizations have resorted to implementing myriad work-arounds. This approach, however, has serious limitations, including the fact that even after deploying the work-arounds the IT organization typically does not see all of the traffic, and the deployment of multiple security appliances significantly drives up the operational costs and complexity. The chapter concludes by identifying criteria that IT organizations can use to choose a next generation firewall.

Introduction

Background and Goal

As recently as a few years ago, few IT organizations were concerned with application delivery. That has all changed. Application delivery is now a top of mind topic for virtually all IT organizations. As is described in this handbook, there are many factors that complicate the task of ensuring acceptable application performance. This includes the lack of visibility into application performance, the centralization of IT resources, the decentralization of employees and the complexity associated with the current generation of n-tier applications.

Some of the IT organizations that were interviewed for this handbook want to believe that the challenges associated with application delivery are going away. They want to believe that application developers will soon start to write more efficient applications and that bandwidth costs will decrease to the point where they can afford to throw bandwidth at performance problems.

The complexity associated with application delivery will increase over the next few years.

That follows in part because, as explained in this handbook, the deployment of new application development paradigms[1] such as SOA (Services Oriented Architecture), Rich Internet Architecture and Web 2.0 will dramatically increase the difficulty of ensuring acceptable application performance. It also follows because of the increasing management complexity associated with the burgeoning deployment of the virtualization of IT resources (i.e., desktops, servers, storage, applications), the growing impact of wireless communications, the need to provide increasing levels of security as well as emerging trends such as storage optimization.

[1] Kubernan asserts its belief that words such as paradigm and holistically have been out of favor so long that it is now acceptable to use them again.

Instead of reaching a point where the challenges associated with application delivery are going away, we are just ending the first phase of a fundamental transformation of the IT organization. At the beginning of this transformation, virtually all IT organizations were comprised of myriad stove-piped functions. By stove-piped is meant that these functions had few common goals, terminology, tools and processes. A major component of the transformation is that leading edge IT organizations are now creating an environment that is characterized by the realization that:

If you work in IT, you either develop applications or you deliver applications.

Put another way, leading edge companies are creating an IT organization that is comprised of two functions: application development and application delivery. Both of these functions must work holistically in order to ensure acceptable application performance.

This view of IT affects everything – including the organizational structure, the management metrics, the requisite processes, technologies and tools. While the transformation is indeed fundamental, it will not happen quickly. We have just spent the last few years coming to understand the importance and difficulty associated with application delivery and to deploy a first generation of tools, typically in a stand-alone, tactical fashion. As we enter the next phase of application delivery, leading edge IT organizations will develop plans for how they want to evolve from a stove-piped IT infrastructure function to an integrated application delivery function.

Senior IT management needs to ensure that their organization evolves to where it looks at application delivery holistically and not just as an increasing number of stove-piped functions.

This transformation will not be easy, in part because it crosses myriad organizational boundaries and involves rapidly changing technologies that have never before been developed by vendors, nor planned, designed, implemented and managed by IT organizations in a holistic fashion. Successful application delivery requires the integration of tools and processes. One of the goals of this handbook is to help IT organizations plan for that transformation – hence the subtitle: A guide to decision making.

Foreword to the 2008 Edition

This handbook builds on the 2007 edition of the application delivery handbook. This edition of the handbook differs from the original version in several ways. First, information that was contained in the original version that is no longer relevant was deleted from this edition. Second, information was added to increase both the breadth and depth of this edition. For example, a significant amount of new market research is included. In addition, there are two new chapters in this edition. One of these new chapters, Chapter 6, discusses the use of the various types of managed service providers as a very viable option that IT organizations can use to better ensure acceptable application delivery. As is discussed in Chapter 6, one of the advantages of using a managed service provider is that they often have the skills and processes that are necessary to bridge the gap that typically exists within an IT organization between the application development groups and the rest of the IT function.

Chapter 8 details the evolving network management function. This includes a discussion of how the NOC, which once focused almost exclusively on the availability of networks, now often has an additional focus on the performance of networks and applications. Chapter 8 also examines how the NOC has to change in order to reduce the mean time to repair that is associated with application performance issues and details the myriad ways that IT organizations justify an investment in performance management. Other areas that were either added or expanded upon include the:

• Impact of Web services on security
• Development of a new generation of firewalls
• Use of WAN emulation to develop better applications and to plan for change
• Impact of Web 2.0 on application performance and management
• Criticality of looking deep into the packet for more effective management
• Status of QoS deployment
• Appropriate metrics for VoIP management
• Development of software-based WAN optimization solutions
• Factors that impact the transparency of WAN optimization solutions
• Issues associated with high-speed data replication
• Criteria to evaluate WAN optimization controllers (WOCs)
• Criteria to evaluate application front ends (AFEs)
• Issues associated with port hopping applications

Unfortunately, this is a lengthy handbook. It does not, however, require linear, cover-to-cover reading. A reader may start reading this handbook in the middle and use the references embedded in the text as forward and backwards pointers to related information. Several techniques were employed to keep the handbook a reasonable length. For example, the handbook allocates more space to discussing a new topic (such as the impact of Web 2.0) than it does to topics that are relatively well understood – such as the impact of consolidating servers out of branch offices and into centralized data centers. Also, the handbook does not contain a detailed analysis of any technology. To compensate for this, the handbook includes an extensive bibliography. In addition, the body of the handbook does not discuss any vendor or any products or services. The Appendix to the handbook, however, contains material supplied by the majority of the leading application delivery vendors.

To allow IT organizations to compare their situation to those of other IT organizations, this handbook incorporates market research data that has been gathered over the last two years. The handbook also contains input gathered from interviewing roughly thirty IT professionals. Most IT professionals cannot be quoted by name or company in a document like this without their company heavily filtering their input. To compensate for that limitation, Chapter 12 contains a brief listing of the people who were interviewed, along with the phrase that is used in the handbook to refer to them. The sponsors of the handbook provided input into the areas of this handbook that are related to their company's products and services. Both the sponsors and the IT professionals also provided input into the relationship between and among the various components of the application delivery framework.

Given the breadth and extent of the input from both IT organizations and leading edge vendors, this handbook represents a broad consensus on a framework that IT organizations can use to improve application delivery.

Context

Over the last two years, Kubernan has conducted extensive market research into the challenges associated with application delivery. One of the most significant results uncovered by that market research is the dramatic lack of success IT organizations have relative to managing application performance. In particular, Kubernan asked 345 IT professionals the following question: "If the performance of one of your company's key applications is beginning to degrade, who is the most likely to notice it first – the IT organization or the end user?" Seventy-three percent of the survey respondents indicated that it was the end user.

In the vast majority of instances when a key business application is degrading, the end user, not the IT organization, first notices the degradation.

The fact that end users notice application degradation prior to it being noticed by the IT organization is an issue of significant importance to virtually all senior IT managers. The Government CIO stated that in his organization the fact that the IT organization does not know when an application has begun to degrade has led to the perception that IT is "a bunch of bumbling idiots." He further revealed that this situation has also fostered an environment in which individual departments have both felt the need and been allowed to establish their own shadow IT organizations.
In situations in which the end user is typically the first to notice application degradation, IT ends up looking like bumbling idiots. The current approach to managing application performance reduces the confidence that the company has in the IT organization.

In addition to performing market research, Kubernan also provides consulting services. Jim Metzler was hired by an IT organization that was hosting an application on the east coast of the United States that users from all over the world accessed. Users of this application that were located in the Pac Rim were complaining about unacceptable application performance. The IT organization wanted Jim to identify what steps it could take to improve the performance of the application. Given that the IT organization had little information about the semantics of the application, the task of determining what it would take to improve the performance of the application was lengthy and served to further frustrate the users of the application. (Chapter 7 details what has to be done to reduce the mean time to repair application performance issues.) This handbook is being written with that IT organization and others like them in mind.

A goal of this handbook is to help IT organizations develop the ability to minimize the occurrence of application performance issues and to both identify and quickly resolve issues when they do occur.

To achieve that goal, this handbook will develop a framework for application delivery. It is important to note that most times when the industry uses the phrase application delivery, this refers to just network and application optimization. Network and application optimization is important. However, achieving the goal stated above requires a broader perspective on the factors that impact the ability of the IT organization to assure acceptable application performance.

Application delivery is more complex than just network and application acceleration. Application delivery needs to have a top-down approach, with a focus on application performance.

With these factors in mind, the framework this handbook describes is comprised of four primary components. Successful application delivery requires the integration of planning, network and application optimization, management, and control.

Some overlap exists in the model as a number of common IT processes are part of multiple components. This includes processes such as discovery (what applications are running on the network and how are they being used), baselining, visibility and reporting.

The Applications Environment

This section of the handbook will discuss some of the primary dynamics of the applications environment that impact application delivery. It is unlikely any IT organization will exhibit all of the dynamics described. It is also unlikely that an IT organization will not exhibit at least some of these dynamics.

No product or service in the marketplace provides a best in class solution for each component of the application delivery framework. As a result, companies have to carefully match their requirements to the functionality the alternative solutions provide.

Companies that want to be successful with application delivery must understand their current and emerging application environment.

The preceding statement sounds simple. However, less than one-quarter of IT organizations claim they have that understanding.
The Application Development Process

In most situations the focus of application development is on ensuring that applications are developed on time, on budget, and with few security vulnerabilities. That narrow focus, combined with the fact that application development has historically been done over a high-speed, low-latency LAN, means that the impact of the WAN on the performance of the application is generally not known until after the application is fully developed and deployed. In the majority of cases, there is at most a moderate emphasis during the design and development of an application on how well that application will run over a WAN. This lack of emphasis on how well an application will run over the WAN often results in the deployment of chatty applications as shown in Figure 3.1. A chatty application requires hundreds of application turns to complete a transaction. To exemplify the impact of a chatty protocol, assume that a given transaction requires 200 application turns. Further assume that the latency on the LAN on which the application was developed was 1 millisecond, but that the round trip delay of the WAN on which the application will be deployed is 100 milliseconds. For simplicity, the delay associated with the data transfer will be ignored and only the delay associated with the application turns will be calculated. In this case, the delay on the LAN is 200 milliseconds, which is not noticeable. However, the delay on the WAN is 20 seconds, which is very noticeable.

Figure 3.1: Chatty Application

The preceding example demonstrates the need to be cognizant of the impact of the WAN on application performance during the application development lifecycle. In particular, it is important during application development to identify and eliminate any factor that could have a negative impact on application performance. This approach is far more effective than trying to implement a work-around after an application has been fully developed and deployed. This concept will be expanded upon in Chapter 4.

The preceding example also demonstrates the relationship between network delay and application delay. A relatively small increase in network delay can result in a very significant increase in application delay.

Taxonomy of Applications

The typical enterprise has tens and often hundreds of applications that transit the WAN. One way that these applications can be categorized is:

1. Business Critical
A company typically runs the bulk of its key business functions utilizing a handful of applications. A company can develop these applications internally, buy them from a vendor such as Oracle or SAP, or acquire them from a software-as-a-service provider such as Salesforce.com.

2. Communicative and Collaborative
This includes delay sensitive applications such as Voice over IP and conferencing, as well as applications that are less delay sensitive such as email.

3. Other Data Applications
This category contains the bulk of a company's data applications. While these applications do not merit the same attention as the enterprise's business critical applications, they are important to the successful operation of the enterprise.
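The turn-count arithmetic behind this relationship can be sketched in a few lines; this is a simplified model in which each application turn costs exactly one round trip and data-transfer time is ignored, using the illustrative figures from the example above (200 turns, 1 ms LAN round trip, 100 ms WAN round trip) rather than measured values.

```python
# Application-turn delay model: each turn costs one full round trip,
# so the turn-induced delay is (number of turns) x (round-trip time).
# Data-transfer time is deliberately ignored, as in the example above.

def turn_delay_seconds(app_turns: int, round_trip_ms: float) -> float:
    """Delay contributed by application turns alone."""
    return app_turns * round_trip_ms / 1000.0

lan_delay = turn_delay_seconds(200, 1.0)    # 0.2 s -- not noticeable
wan_delay = turn_delay_seconds(200, 100.0)  # 20 s  -- very noticeable
print(f"LAN: {lan_delay} s, WAN: {wan_delay} s")
```

Because the per-turn round-trip time multiplies across every turn, a 100x increase in network round-trip time produces a 100x increase in turn-induced application delay, which is exactly the leverage described above.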
4. IT Infrastructure-Related Applications
This category contains applications such as DNS and DHCP that are not visible to the end user, but which are critical to the operation of the IT infrastructure.

5. Recreational
This category includes a growing variety of applications such as Internet radio, YouTube, streaming news and multimedia, as well as music downloading.

6. Malicious
This includes any application intended to harm the enterprise by introducing worms, viruses, spyware or other security vulnerabilities.

Since they make different demands on the network, another way to classify applications is whether the application is real time, transactional or data transfer in orientation. For maximum benefit, this information must be combined with the business criticality of the application. For example, live Internet radio is real time but in virtually all cases it is not critical to the organization's success. It is also important to realize an application such as Citrix Presentation Server or SAP is comprised of multiple modules with varying characteristics. Thus, it is not terribly meaningful to say that Citrix Presentation Server traffic is real time, transactional or data transfer in orientation. What is important is the ability to recognize application traffic flows for what they are, for example a Citrix printing flow vs. editing a Word document.

Successful application delivery requires that IT organizations are able to identify the applications running on the network and are also able to ensure the acceptable performance of the applications relevant to the business while controlling or eliminating applications that are not relevant.

Traffic Flow Considerations

In many situations, the traffic flow on the data network naturally follows a simple hub-and-spoke design. An example of this is a bank's ATM network where the traffic flows from an ATM to a data center and back again. This type of network is sometimes referred to as a one-to-many network.

A number of factors, however, cause the traffic flow in a network to follow more of a mesh pattern. One factor is the widespread deployment of Voice over IP (VoIP)[2]. VoIP is an example of an application where traffic can flow between any two sites in the network. This type of network is often referred to as an any-to-any network. An important relationship exists between VoIP deployment and MPLS deployment. MPLS is an any-to-any network. As a result, companies that want to broadly deploy VoIP are likely to move away from a Frame Relay or an ATM network and to adopt an MPLS network. Analogously, companies that have already adopted MPLS will find it easier to justify deploying VoIP.

[2] 2005/2006 VoIP State of the Market Report, Steven Taylor, http://www.webtorials.com

Another factor affecting traffic flow is that many organizations require that a remote office have access to multiple data centers. This type of requirement could exist to enable effective disaster recovery or because the remote office needs to access applications that disparate data centers host. This type of network is often referred to as a some-to-many network.

Every component of an application delivery solution has to be able to support the company's traffic patterns, whether they are one-to-many, any-to-any, or some-to-many.

Webification of Applications

The phrase Webification of Applications refers to the growing movement to implement Web-based user interfaces and to utilize chatty Web-specific protocols such as
A Guide to Decision Making
10
Application Delivery Handbook | february 2008
HTTP. Similar to the definition of a chatty application, a protocol is referred to as being chatty if it requires tens if not hundreds of turns for a single transaction. In addition, XML is a dense protocol. That means communications based on XML consume more IT resources than communications that are not based on XML.
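To make the cost of chattiness concrete, the following is a minimal back-of-the-envelope sketch (ours, not the handbook's): if a transaction pays one network round trip per application turn, total delay grows linearly with the turn count, which is negligible on a LAN but can dominate on a WAN. The turn counts, round trip times, and server time below are illustrative assumptions.

```python
# A simple illustration of why a chatty protocol performs poorly over a
# WAN: every application turn pays one network round trip, so total delay
# grows linearly with the number of turns.

def transaction_time(app_turns: int, rtt_s: float, server_s: float = 0.5) -> float:
    """Seconds for one transaction: server work plus one RTT per turn."""
    return server_s + app_turns * rtt_s

# A single-turn exchange vs. a chatty protocol needing 200 turns.
for turns in (1, 200):
    lan = transaction_time(turns, rtt_s=0.001)   # ~1 ms LAN round trip
    wan = transaction_time(turns, rtt_s=0.080)   # ~80 ms WAN round trip
    print(f"{turns:>3} turns: LAN {lan:.2f}s  WAN {wan:.2f}s")
```

On the assumed numbers, the chatty exchange is under a second on the LAN but over sixteen seconds on the WAN, even though the server work is identical in both cases.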
The webification of applications introduces chatty protocols into the network. In addition, some of these protocols (e.g., XML) tend to greatly increase the amount of data that transits the network and is processed by the servers.

As will be discussed in Chapter 9, the dense nature of XML also creates some security vulnerabilities.

Server Consolidation

Many companies either already have consolidated, or are in the process of consolidating, servers out of branch offices and into centralized data centers. This consolidation typically reduces cost and enables IT organizations to have better control over the company's data.

While server consolidation produces many benefits, it can also produce some significant performance issues. Server consolidation typically results in chatty protocols such as CIFS (Common Internet File System), Exchange or NFS (Network File System), which were designed to run over the LAN, running over the WAN. The way that CIFS works is that it decomposes all files into smaller blocks prior to transmitting them. Assume that a client was attempting to open a 20 megabyte file on a remote server. CIFS would decompose that file into hundreds, or possibly thousands, of small data blocks. The server sends each of these data blocks to the client where it is verified and an acknowledgement is sent back to the server. The server must wait for an acknowledgement prior to sending the next data block. As a result, it can take several seconds for the user to be able to open the file.

Data Center Consolidation and Single Hosting

In addition to consolidating servers out of branch offices and into centralized data centers, many companies are also reducing the number of data centers they support worldwide. HP, for example, recently announced it was reducing the number of data centers it supports from 85 down to six.3 This increases the distance between remote users and the applications they need to access. Many companies are also adopting a single-hosting model whereby users from all over the globe transit the WAN to access an application that the company hosts in just one of its data centers. One of the effects of data center consolidation and single hosting is that it results in additional WAN latency for remote users.

3 Hewlett-Packard picks Austin for two data centers, http://www.statesman.com/business/content/business/stories/other/05/18hp.html

Changing Application Delivery Model

The 80/20 rule in place until a few years ago stated that 80% of a company's employees were in a headquarters facility and accessed an application over a high-speed, low latency LAN. The new 80/20 rule states that 80% of a company's employees access applications over a relatively low-speed, high latency WAN. In the vast majority of situations, when people access an application they are accessing it over the WAN.

Software as a Service

According to Wikipedia,4 software as a service (SaaS) is a software application delivery model where a software vendor develops a web-native software application and hosts and operates (either independently or through a third-party) the application for use by its customers over the Internet. Customers do not pay for owning the software itself but rather for using it. They use it through an API accessible over the Web and often written using Web Services. The term SaaS has become the industry-preferred term, generally replacing the earlier terms Application Service Provider (ASP) and On-Demand.

4 http://en.wikipedia.org/wiki/Software_as_a_Service

There are many challenges associated with SaaS. For example, by definition of SaaS, the user accesses the application over the Internet and hence incurs all of the issues associated with the Internet. (See Chapter 6 for a discussion of the use of managed service providers as a way to mitigate some of the impact of the Internet.) In addition, since the company that uses the software does not own the software, it cannot change the software in order to make it perform better.

Dynamic IT Environments

The environment in which application delivery solutions are implemented is highly dynamic. For example, companies are continually changing their business processes and IT organizations are continually changing the network infrastructure. In addition, companies regularly deploy new applications and updates to existing applications. To be successful, application delivery solutions must function in a highly dynamic environment. This drives the need for both the dynamic setting of parameters and automation.

Fractured IT Organizations

The application delivery function consists of myriad sub-specialties such as devices (e.g., desktops, laptops, point of sale devices), networks, servers, storage, security, operating systems, etc. The planning and operations of these sub-specialties are typically not well coordinated within the application delivery function. In addition, market research performed in 2006 indicates that typically little coordination exists between the application delivery function and the application development function.

Only 14% of IT organizations claim to have aligned the application delivery function with the application development function. Eight percent (8%) of IT organizations state they plan and holistically fund IT initiatives across all of the IT disciplines. Twelve percent (12%) of IT organizations state that troubleshooting an IT operational issue occurs cooperatively across all IT disciplines.

The Industrial CIO described the current fractured, often defensive approach to application delivery. He has five IT disciplines that report directly to him. He stated that he is tired of having each of them explain to him that their component of IT is fine and yet the company struggles to provide customers an acceptable level of access to their Web site, book business and ship product. He also said that he and his peers do not care about the pieces that comprise IT; they care about the business results. The CYA approach to application delivery focuses on showing that it is not your fault that the application is performing badly. The goal of the CIO approach is to rapidly identify and fix the problem.

Application Complexity

Companies began deploying mainframe computers in the late 1960s and mainframes became the dominant style of computing in the 1970s. The applications that were written for the mainframe computers of that era were monolithic in nature. Monolithic means that the application performed all of the necessary functions, such as providing the user interface, the application logic, as well as access to data.

Most companies have moved away from deploying monolithic applications and towards a form of distributed computing that is often referred to as n-tier applications. Since these tiers are implemented on separate systems, WAN performance impacts n-tier applications more than
monolithic applications. For example, the typical 3-tier application is comprised of a Web browser, an application server(s) and a database server(s). The information flow in a 3-tier application is from the Web browser to the application server(s) and to the database, and then back again, over the Internet using standard protocols such as HTTP or HTTPS.

The movement to a Service-Oriented Architecture (SOA) based on the use of Web services-based applications represents the next step in the development of distributed computing. Just as WAN performance impacts n-tier applications more than monolithic applications, WAN performance impacts Web services-based applications significantly more than it impacts n-tier applications.

To understand why the movement to Web services-based applications will drastically complicate the task of ensuring acceptable application performance, consider the 3-tier application architecture that was previously discussed. In a 3-tier application the application server(s) and the database server(s) typically reside in the same data center. As a result, the impact of the WAN is constrained to a single traffic flow, that being the flow between the user's Web browser and the application server.

In a Web services-based application, the Web services that comprise the application typically run on servers that are housed within multiple data centers. As a result, the WAN impacts multiple traffic flows and hence has a greater overall impact on the performance of a Web services-based application than it does on the performance of an n-tier application.

Web Services and Security

The expanding use of Web services creates some new security challenges. Part of this challenge stems from the fact that in most instances, the blueprint for Web services communication is outlined in Web Services Description Language (WSDL) documents. These documents are intended to serve as a guide to an IT organization's Web services. Unfortunately, they can also serve to guide security attacks against the organization.

Assuming that a hacker has gained access to an organization's WSDL document, the hacker can then begin to look for vulnerabilities in the system. For example, by seeing how the system reacts to invalid data that the hacker has intentionally submitted, the hacker can learn a great deal about the underlying technology and can use this knowledge to further exploit the system. If the goal of the hacker is to create a denial of service attack or degrade application performance, the hacker could exploit the verbose nature of both XML and SOAP.5 When a Web services message is received, the first step the system takes is to read through, or parse, the elements of the message. As part of parsing the message, parameters are extracted and content is inserted into databases. The amount of work required by XML parsing is directly affected by the size of the SOAP message. Because of this, the hacker could submit excessively large payloads that would consume an inordinate amount of system resources and hence degrade application performance.

Chapter 9 will discuss some of the limitations of the current generation of firewalls. One of these limitations is that the current generation of firewalls is not capable of parsing XML. As such, these firewalls are blind to XML traffic. As part of providing security for Web services, IT organizations need to be able to inspect XML and SOAP messages and make intelligent decisions based on the content of these messages. For example, IT organizations need to be able to perform anomaly detection in order to distinguish valid messages from invalid messages. In addition, IT organizations need to be able to perform signature detection to detect the signature of known attacks.

5 Simple Object Access Protocol (SOAP) is the Web Services specification used for invoking methods on remote software components, using an XML vocabulary.
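One simple way to act on the anomaly-detection point is to bound a message's size and nesting depth before it reaches the full parser, so an oversized or deeply nested payload cannot monopolize the server. The Python sketch below is a hypothetical illustration; the limits and function name are our own, not from the handbook.

```python
# Pre-parse sanity checks for an inbound SOAP/XML message: reject it if
# it is too large or too deeply nested, before investing in full parsing.

import io
import xml.etree.ElementTree as ET

MAX_BYTES = 64 * 1024   # illustrative limits only
MAX_DEPTH = 32

def message_is_acceptable(raw: bytes) -> bool:
    if len(raw) > MAX_BYTES:           # the size check costs almost nothing
        return False
    depth = 0
    try:
        # iterparse streams events, so we can stop early on a violation.
        for event, _elem in ET.iterparse(io.BytesIO(raw), events=("start", "end")):
            if event == "start":
                depth += 1
                if depth > MAX_DEPTH:  # reject before parsing any further
                    return False
            else:
                depth -= 1
    except ET.ParseError:
        return False                   # malformed XML is also rejected
    return True

print(message_is_acceptable(b"<Envelope><Body>ok</Body></Envelope>"))  # True
print(message_is_acceptable(b"<a>" * 100 + b"</a>" * 100))             # False
```

Production XML gateways apply many more checks than this (schema validation, entity-expansion limits, signature screening), but the streaming structure shown here, rejecting before fully parsing, is the key idea.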
Web 2.0

Defining Web 2.0

As was noted in the preceding section, the movement to a Service-Oriented Architecture (SOA) based on the use of Web services-based applications is going to drastically complicate the task of ensuring acceptable application performance. The same is true for the movement to Web 2.0. In the case of Web 2.0, however, the problem is exacerbated because most IT organizations are not aware of the performance issues associated with Web 2.0.

Many IT professionals either view the phrase Web 2.0 as just marketing hype that is devoid of any meaning, or they associate it exclusively with social networking sites such as MySpace.

The Mobile Software CEO emphasized his view that Web 2.0 is “a lot more than just social networking”. He said that the goal of Web 2.0 is to “allow for greater flexibility for presenting information to the user.” The Mobile Software CEO added that Web 2.0 started with sites such as Google and MySpace and is now widely used as a way to aggregate websites together more naturally. A key component of Web 2.0 is that the content is “very dynamic and alive and that as a result people keep coming back to the website.”

The concept of an application that is itself the result of aggregating other applications together has become so common that a new term, mashup, has been coined to describe it. According to Wikipedia,6 a mashup is a web application that combines data from more than one source into a single integrated tool; a typical example is the use of cartographic data from Google Maps to add location information to real-estate data from Craigslist, thereby creating a new and distinct service that was not originally envisaged by either source.

6 http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)

The Business Intelligence CTO stated that when he thinks about Web 2.0 he doesn't think about marketing hype. Instead he thinks about the new business opportunities that are a result of Web 2.0. He said that, “Ten years ago if somebody was starting a web based business they would need roughly one million dollars to get their product to beta. Web 2.0 allows someone today to start up a business for fifty thousand dollars.” The Business Intelligence CTO said that this dramatic change is enabled in part because today businesses can hire programmers who use application platforms such as ASP.NET7 that rely on technologies such as AJAX (Asynchronous JavaScript and XML). Developers can use ASP.NET to quickly develop applications that run on low cost virtual servers8 and communicate amongst themselves using Skype.

7 ASP.NET is a web application framework marketed by Microsoft that programmers can use to build dynamic web sites, web applications and XML web services. It is part of Microsoft's .NET platform and is the successor to Microsoft's Active Server Pages (ASP) technology, http://en.wikipedia.org/wiki/ASP.NET
8 Virtual servers will be discussed in more detail in Chapter 9

Another industry movement that is often associated with Web 2.0 is the deployment of Rich Internet Applications (RIA). In a traditional Web application all processing is done on the server, and a new Web page is downloaded each time the user clicks. In contrast, an RIA can be viewed as “a cross between Web applications and traditional desktop applications, transferring some of the processing to a Web client and keeping (some of) the processing on the application server.”9 RIAs are created using technologies such as Macromedia Flash, Flex, AJAX and Microsoft's Silverlight.

9 Wikipedia on Rich Internet Applications: http://en.wikipedia.org/wiki/Rich_Internet_Application

A recent publication10 quotes market research that indicates that by 2010 at least 60 percent of new application development projects will include RIA technology and that at least 25 percent will rely primarily on RIA technology. As stated in that publication, “This richer content is increasingly dynamic in nature, enabling an unprecedented level of interactivity and personalization. In real time, any consumer-specific information entered into these applications is passed back to the Web infrastructure to enable interaction, further personalization, and compelling marketing offers. For instance, consumers can be presented with geographic- and demographic-specific content, content that is tailored to preferences they indicate, surveys and contests, and constantly updated content such as stock quotes, sales promotions, and news feeds, to name a few.”

10 Web 2.0 is Here – Is Your Web Infrastructure Ready? http://www.akamai.com/html/forms/web20_postcard.html?campaign_id=1-13HGZ13
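The mashup pattern described in this section can be illustrated with a toy example. All of the data below is invented for illustration; a real mashup would pull from live services such as Google Maps and Craigslist.

```python
# A minimal mashup: join records from two independent sources -- here,
# classified listings and map coordinates -- into one combined view.

listings = [  # pretend this came from a classifieds feed
    {"id": 1, "title": "2BR apartment", "zip": "02139"},
    {"id": 2, "title": "Studio loft",   "zip": "94107"},
]
geocodes = {  # pretend this came from a mapping service
    "02139": (42.36, -71.10),
    "94107": (37.77, -122.39),
}

def mashup(listings, geocodes):
    """Combine the two sources into a single integrated result."""
    return [{**item, "latlon": geocodes.get(item["zip"])} for item in listings]

for row in mashup(listings, geocodes):
    print(row["title"], row["latlon"])
```

The point of the sketch is architectural: the combined service owns neither data set, it only correlates them, which is precisely why mashups can be built so quickly and cheaply.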
Kubernan recently presented over 200 IT professionals with the following question: “Which of the following best describes your company's approach to using new application architectures such as Services Oriented Architecture (SOA), Rich Internet Applications (RIA), or Web 2.0 applications including the use of mashups?” Their responses are shown in Table 3.1.

Response                                          Percentage of Respondents
Don't use them                                    24.4%
Make modest use of them                           37.2%
Make significant use of them                      11.7%
N/A or Don't Know                                 24.4%
Other                                              2.2%

Table 3.1: Current Use of New Application Architectures

The same group of IT professionals were then asked to indicate how their company's use of those application architectures would change over the next year. Their responses are shown in Table 3.2.

Response                                          Percentage of Respondents
No change is expected                             23.3%
We will reduce our use of these architectures      1.7%
We will increase our use of these architectures   46.7%
N/A or Don't Know                                 27.8%
Other                                              0.6%

Table 3.2: Increased Use of New Application Architectures

Emerging application architectures (SOA, RIA, Web 2.0) have already begun to impact IT organizations and this impact will increase over the next year.

Quantifying Application Response Time

As noted, Web 2.0 has some unique characteristics. In addition to a services focus, Web 2.0 characteristics include featuring content that is dynamic, rich and, in many cases, user created. A model is helpful to illustrate the potential performance bottlenecks in any application environment in general, as well as in a Web 2.0 environment in particular. The following model is a variation of the application response time model created by Sevcik and Wetzel.11 Like all models, the following is only an approximation and as a result it is not intended to provide results that are accurate to the millisecond level. The model, however, is intended to provide insight into the key factors that impact application response time. As shown below, the application response time (R) is impacted by the amount of data being transmitted (Payload), the WAN bandwidth, the network round trip time (RTT), the number of application turns (AppTurns), the number of simultaneous TCP sessions (concurrent requests), the server side delay (Cs) and the client side delay (Cc).

Figure 3.2: Application Response Time Model

The Branch Office Optimization Solutions that are described in Chapter 5 were designed primarily to deal with the size of the payload and the number of application turns. The Application Front Ends that are described in Chapter 5 were designed primarily to offload communications processing from servers. They were not designed to offload any backend processing.

11 Why Centralizing Microsoft Servers Hurts Performance, Peter Sevcik and Rebecca Wetzel, http://www.juniper.net/solutions/literature/ms_server_centralization.pdf
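The formula in Figure 3.2 is not reproduced in this text, so the following Python sketch is our approximation of a Sevcik/Wetzel-style response time model, built only from the factors the text lists; the exact form of the published model may differ.

```python
# Approximate application response time R from the factors named in the
# text: payload, bandwidth, round trip time, application turns,
# concurrent TCP sessions, and server/client side delay.

def response_time(payload_bits, bandwidth_bps, rtt_s,
                  app_turns, concurrent_requests, cs_s, cc_s):
    """Approximate application response time R in seconds."""
    transfer = payload_bits / bandwidth_bps              # serialization delay
    turns = (app_turns * rtt_s) / concurrent_requests    # chattiness cost
    return transfer + turns + rtt_s + cs_s + cc_s

# Illustrative numbers: a 1 MB page over a 1.5 Mbps link with 80 ms RTT,
# 100 application turns spread over 4 concurrent TCP sessions.
r = response_time(payload_bits=8_000_000, bandwidth_bps=1_500_000,
                  rtt_s=0.080, app_turns=100, concurrent_requests=4,
                  cs_s=0.4, cc_s=0.1)
print(f"R is roughly {r:.2f} seconds")
```

Even as a rough approximation, the model makes the text's point visible: the turns term scales with RTT, so the same application that feels instant on a LAN can miss its response time target badly over a WAN.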
The Web 2.0 Performance Issues

As noted, the existing network and application optimization solutions were designed to mitigate the performance impacts of large payloads and multiple application turns. Microprocessor vendors such as Intel and AMD continually deliver products that increase the computing power that is available on the desktop. As a result, these products minimize the delays that are associated with client processing (Cc). This leaves just one element of the preceding model that has to be more fully accounted for – server side delay. This is the critical performance bottleneck that has to be addressed in order for Web 2.0 applications to perform well.

The existing generation of network and application optimization solutions does not deal with a key requirement of Web 2.0 applications – the need to massively scale server performance.

The reason this is so critical is that unlike clients, servers suffer from scalability issues. In particular, servers have to support multiple users and each concurrent user consumes some amount of server resources: CPU, memory, I/O. Chris Loosley12 highlighted the scalability issues associated with servers. Loosley pointed out that activities such as catalog browsing are relatively fast and efficient and do not consume a lot of server resources. He contrasted that to an activity that requires the server to update something, such as clicking a button to add an item to a shopping cart. He points out that activities such as updating consume significant server resources, and so the number of concurrent transactions (server interactions that update a customer's stored information) plays a critical role in determining server performance.

12 Rich Internet Applications: Design, Measurement and Management Challenges, Chris Loosley, http://www.keynote.com/docs/whitepapers/RichInternet_5.pdf

The Mobile Software CEO addressed the issue of scalability when he stated that there is no better application framework than ASP.NET, but that ASP.NET does make it very easy to develop applications that do not perform well. As The Mobile Software CEO sees it, IT organizations need to answer the question of “How will we scale Web 2.0 applications that have a rich amount of information from a dynamic database?” He said that a big part of the issue is that because of the dynamic content that is associated with Web 2.0 applications, “caching is not caching – it is different for every single application that you work with”. As a result, IT organizations need to answer questions such as: “When can I cache that data?” and “How do I keep that cache up to date?” He added that the best way to solve the Web 2.0 performance problems is to deploy intelligent tools.

The Business Intelligence CTO pointed out that the most important server side issue associated with traditional applications was providing page views, while with Web 2.0 applications it is supporting API calls. He emphasized that “You can not scale a Web site just by throwing servers at it. That buys you time, but it does not solve the problem.” His recommendation was that IT organizations should make relatively modest investments in servers and make larger investments in tools to accelerate the performance of applications.

Planning

Introduction

The classic novel Alice in Wonderland by the English mathematician Lewis Carroll first explained part of the need for the planning component of the application delivery framework. In that novel Alice asked the Cheshire cat, “Which way should I go?” The cat replied, “Where do you want to get to?” Alice responded, “I don't know,” to which the cat said, “Then it doesn't much matter which way you go.”

Relative to application performance, most IT organizations are somewhat vague on where they want to go. In particular, only 38% of IT organizations have established
well-understood performance objectives for their company's business-critical applications. It is extremely difficult to make effective network and application design decisions if the IT organization does not have targets for application performance that are well understood and adhered to.

One primary factor driving the planning component of application delivery is the need for risk mitigation. One manifestation of this factor is the situation in which a company's application development function has spent millions of dollars to either develop or acquire a highly visible, business critical application. The application delivery function must take the proactive steps this section will describe in order to protect both the company's investment in the application as well as the political capital of the application delivery function. Hope is not a strategy. Successful application delivery requires careful planning, coupled with extensive measurements and effective proactive and reactive processes.

Planning Functionality

Many planning functions are critical to the success of application delivery. They include the ability to:

• Profile an application prior to deploying it, including running it in conjunction with a WAN emulator to replicate the performance experienced in branch offices.
• Baseline the performance of the network.
• Perform a pre-deployment assessment of the IT infrastructure.
• Establish goals for the performance of the network and for at least some of the key applications that transit the network.
• Model the impact of deploying a new application.
• Identify the impact of a change to the network, the servers, or to an application.
• Create a network design that maximizes availability and minimizes latency.
• Create a data center architecture that maximizes the performance of all of the resources in the data center.
• Choose appropriate network technologies and services.
• Determine what functionality to perform internally and what functionality to acquire from a third party. This topic will be expanded upon in Chapter 6.

WAN Emulation

Chapter 3 outlined some of the factors that increase the difficulty of ensuring acceptable application performance. One of these factors is the fact that in the vast majority of situations, the application development process does not take into account how well the application will run over a WAN. One class of tool that can be used to test and profile application performance throughout the application lifecycle is a WAN emulator. These tools are used during application development and quality assurance (QA) and serve to mimic the performance characteristics of the WAN; e.g., delay, jitter, packet loss. One of the primary benefits of these tools is that application developers and QA engineers can use them to quantify the impact of the WAN on the performance of the application under development, ideally while there is still time to modify the application. One of the secondary benefits of using WAN emulation tools is that over time the application development groups come to understand how to write applications that perform well over the WAN.

Table 4.1, for example, depicts the results of a lab test that was done using a WAN emulator to quantify the effect
that WAN latency would have on an inquiry-response application that has a target response time of 5 seconds. Similar tests can be run to quantify the affect that jitter and packet loss have on an application. Network Latency
One obvious conclusion that can be drawn from table 4.2 is: The vast majority of IT organizations see significant value from a tool that can be used to test
Measured Response Time
application performance throughout the applica-
0 ms
2 seconds
25 ms
2 seconds
50 ms
2 seconds
The flawed application development process is just one
75 ms
2 seconds
100 ms
4 seconds
of the factors that Chapter 3 identified that increase the
125 ms
4 seconds
150 ms
12 seconds
tion lifecyle.
difficulty of ensuring acceptable application performance. Other factors include the consolidation of IT resources and the deployment of demanding applications such as VoIP.
Table 4.1: Impact of Latency on Application Performance
As Table 4.1 shows, if there is no WAN latency the appli-
IT organizations will not be regarded as suc-
cation has a two-second response time. This two-second
cessful if they do not have the capability to both
response time is well within the target response time
develop applications that run well over the WAN
and most likely represents the time spent in the applica-
and to also plan for changes such as data center
tion server or the database server. As network latency is
consolidation and the deployment of VoIP.
increased up to 75 ms., it has little impact on the application’s response time. If network latency is increased above 75 ms, the response time of the application increases rapidly and is quickly well above the target response time. Over 200 IT professionals were recently asked “Which of the following describes your company’s interest in a tool that can be used to test application performance throughout the application lifecyle – from application design through ongoing management?” The survey respondents were allowed to indicate multiple answers. Their responses are depicted in Table 4.2. Response
This follows because as previously stated, hope is not a strategy. IT organizations need to be able to first anticipate the issues that will arise as a result of a major change and then take steps to mitigate the impact of those issues. Whenever an IT organization is considering implementing a tool of this type it is important to realize that the ultimate goal of these tools is to provide insight and not an undo level of precision. In particular, IT environments are complex and dynamic. As a result, it can be extremely difficult and laborious to have the tool accurately represent every aspect of the IT environment. In addition, even if
Percentage of Respondents
the tool could accurately represent every aspect of the IT environment at some point in time, that environment will
If the tool worked well it would make a significant improvement to our ability to manage application performance
71%
The output of tools like this is generally not that helpful
9%
Tools like this tend to be too difficult to use, particularly during application development
13%
ment, a valid use of a WAN emulation tool is to provide
Our applications developers would be resistant to using such a tool
11%
insight into what happens if WAN delay increases from
Our operations groups lack the application specific skills to use a tool like this
17%
change almost immediately and that representation would no longer be totally accurate.
Table 4.2: Interest in an Application Lifecycle Management Tool
Given the complex and dynamic nature of the IT environ-
70 ms to 100 ms.. For example, would that increase the application delay by a second? By two seconds? By five seconds? It is reasonable to demand that the WAN emula-
A Guide to Decision Making
18
Application Delivery Handbook | february 2008
tion tool provide accurate insight. For example, it is rea-
sonable to demand that if the tool indicates that a 30 ms. increase in WAN delay results in a 2 second increase in application delay, that indeed that is correct. It is not reasonable, however, to expect that the tool would be able to determine whether a 30 ms. increase in WAN delay would increase application delay by 4.85 seconds vs. increasing it by 4.90 seconds.

One of the reasons why IT organizations should not expect a level of undue precision from a WAN emulation tool has already been discussed – the complex and dynamic nature of the IT environment. Another reason is the inherent nature of any modeling or simulation tool. One of the key characteristics of these tools is that they typically contain a slippery slope of complexity. By that is meant that when creating a simulation tool, a great deal of insight can be provided without having the tool be unduly complex. The 80/20 rule applies here: 80% of the insight can be provided while only incurring 20% of the complexity. However, adding additional insight requires the tool to become very complex and typically requires a level of granular input that either does not exist or is incredibly time consuming to create.

The data in Table 4.2 indicates that IT professionals are well aware of the fact that many of these tools are unacceptably complex. In particular, while the survey respondents indicated a strong interest in these tools, thirty percent of the survey respondents indicated either that tools like this tend to be difficult to use or that their operations group would not have the skills necessary to use a tool like this.

In the vast majority of cases, a tool that is unduly complex is of no use to an IT organization.

The preceding discussion of using a WAN emulator to either develop more efficient applications or to quantify the impact of a change such as a data center initiative is a proactive use of the tool. In many cases, IT organizations profile an application in a reactive fashion. That means the organization profiled the application only after users complained about its performance.

Alternatively, some IT organizations only profile an application shortly before they deploy it. The advantages of this approach are that it helps the IT organization:
• Identify minor changes that can be made to the application that will improve its performance.
• Determine if some form of optimization technology will improve the performance of the application.
• Identify the sensitivity of the application to parameters such as WAN latency and use this information to set effective thresholds.
• Gather information on the performance of the application that can be used to set the expectations of the users.
• Learn about the factors that influence how well an application will run over a WAN.
Since companies perform these tests just before they put the application into production, this is usually too late to make any major change.

The application delivery function needs to be involved early in the applications development cycle.

The Automotive Network Engineer provided insight into the limitations of testing an application just prior to deployment. He stated that relative to testing applications just prior to putting them into production, “We are required to go through a lot of hoops.” He went on to say that sometimes the testing was helpful, but that if the application development organization was under a lot of management pressure to get the application into production, the application development organization often took the approach of deploying the application and then dealing with the performance problems later.
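The discussion above of how a 30 ms. increase in WAN delay affects application delay can be illustrated with a simple turns-based model that is often used when profiling applications: each application turn costs one WAN round trip, plus serialization delay for the payload. This is a rough sketch, not a substitute for a WAN emulator, and every figure below (turn counts, link speed, payload, delays) is hypothetical:

```python
def estimated_response_time(app_turns, rtt_s, payload_bytes, bandwidth_bps, server_s=0.0):
    """Rough response-time model: each application turn costs one round trip,
    plus serialization of the payload, plus fixed server processing time."""
    serialization = (payload_bytes * 8) / bandwidth_bps
    return app_turns * rtt_s + serialization + server_s

# A chatty application (200 turns) vs. a lean one (10 turns),
# moving a 1 MB payload over a T1-class link.
for turns in (10, 200):
    base = estimated_response_time(turns, rtt_s=0.050,
                                   payload_bytes=1_000_000, bandwidth_bps=1_540_000)
    slowed = estimated_response_time(turns, rtt_s=0.080,
                                     payload_bytes=1_000_000, bandwidth_bps=1_540_000)
    # The chatty application slows by 200 x 0.030 = 6 s; the lean one by only 0.3 s.
    print(f"{turns:3d} turns: {base:.2f}s -> {slowed:.2f}s after +30 ms of WAN delay")
```

The point of a model this simple is only directional insight: it shows why an application's turn count dominates its sensitivity to WAN latency, which is exactly the kind of question profiling is meant to answer early in the development cycle.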
A Guide to Decision Making
19
Application Delivery Handbook | february 2008
The Consulting Architect pointed out that his organization is creating an architecture function. A large part of the motivation for the creation of this function is to remove the finger pointing that goes on between the network and the application-development organizations. One goal of the architecture function is to strike a balance between application development and application delivery. For example, there might be good business and technical factors that drive the application development function to develop an application using chatty protocols. One role of the architecture group is to identify the effect of that decision on the application-delivery function and to suggest solutions. For example, does the decision to use chatty protocols mean that additional optimization solutions would have to be deployed in the infrastructure? If so, how well will the application run if an organization deploys these optimization solutions? What additional management and security issues do these solutions introduce?

A primary way to balance the requirements and capabilities of the application development and the application-delivery functions is to create an effective architecture that integrates those two functions.

Baselining

Introduction
Baselining provides a reference from which service quality and application delivery effectiveness can be measured. It does this by quantifying the key characteristics (e.g., response time, utilization, delay) of applications and various IT resources including servers, WAN links and routers. Baselining allows an IT organization to understand the normal behavior of those applications and IT resources. Baselining is an example of a task that one can regard as a building block of management functionality. That means baselining is a component of several key processes, such as performing a pre-assessment of the network prior to deploying an application or performing proactive alarming.

The Team Leader stated that his organization does not baseline the company’s entire global network. They have, however, widely deployed two tools that assist with baselining. One of these tools establishes trends relative to their traffic. The other tool baselines the end-to-end responsiveness of applications. The Team Leader has asked the two vendors to integrate the two tools so that he will know how much capacity he has left before the performance of a given application becomes unacceptable.

The Key Steps
Four primary steps comprise baselining. They are:

I. Identify the Key Resources
Most IT organizations do not have the ability to baseline all of their resources. These organizations must determine which are the most important resources and baseline them. One way to determine which resources are the most important is to identify the company’s key business applications and to identify the IT resources that support these applications.

II. Quantify the Utilization of the Assets over a Sufficient Period of Time
Organizations must compute the baseline over a normal business cycle. For example, the activity and response times for a CRM application might be different at 8:00 a.m. on a Monday than at 8:00 p.m. on a Friday. In addition, the activity and response times for that CRM application are likely to differ greatly during a week in the middle of the quarter as compared with times during the last week of the quarter.
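As a minimal sketch of computing a baseline over a business cycle, the following groups response-time samples by day of week and hour, so that a Monday-morning measurement is judged against other Monday mornings rather than Friday evenings. All data here is invented for illustration:

```python
from collections import defaultdict

def baseline_by_hour_of_week(samples):
    """samples: iterable of (weekday, hour, response_time_s) observations.
    Returns {(weekday, hour): (average, peak)} so that normal behavior is
    defined per slot of the business cycle, not across the whole week."""
    buckets = defaultdict(list)
    for weekday, hour, rt in samples:
        buckets[(weekday, hour)].append(rt)
    return {k: (sum(v) / len(v), max(v)) for k, v in buckets.items()}

# Hypothetical CRM response times: busy Monday mornings, quiet Friday evenings.
observations = [("Mon", 8, 2.1), ("Mon", 8, 2.4), ("Fri", 20, 0.6), ("Fri", 20, 0.7)]
baseline = baseline_by_hour_of_week(observations)
```

A production baseline would also separate mid-quarter weeks from quarter-end weeks, per the point above, but the bucketing idea is the same.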
In most cases, baselining focuses on measuring the utilization of resources, such as WAN links. However, application performance is only indirectly tied to the utilization of WAN links. Application performance is tied directly to factors such as WAN
delay. Since it is often easier to measure utilization than delay, many IT organizations set a limit on the maximum utilization of their WAN links hoping that this will result in acceptable WAN latency. IT organizations need to modify their baselining activities to focus directly on delay.

III. Determine how the Organization Uses Assets
This step involves determining how the assets are being consumed by answering questions such as: Which applications are the most heavily used? Who is using those applications? How has the usage of those applications changed? In addition to being a key component of baselining, this step also positions the application-delivery function to provide the company’s business and functional managers insight into how their organizations are changing based on how their use of key applications is changing.

IV. Utilize the Information
The information gained from baselining has many uses. This includes capacity planning, budget planning and chargeback. Another use for this information is to measure the performance of an application before and after a major change, such as a server upgrade, a network redesign or the implementation of a patch. For example, assume that a company is going to upgrade all of its Web servers. To ensure they get all of the benefits they expect from that upgrade, that company should measure key parameters both before and after the upgrade. Those parameters include WAN and server delay as well as the end-to-end application response time as experienced by the users.

An IT organization can approach baselining in multiple ways. Sampling and synthetic approaches to baselining can leave a number of gaps in the data and have the potential to miss important behavior that is both infrequent and anomalous.

Organizations should baseline by measuring 100% of the actual traffic from the real users.
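The before-and-after comparison recommended in Step IV can be sketched in a few lines; the metric names and measurements below are hypothetical:

```python
def compare_before_after(before, after):
    """Compare average values of key parameters (e.g., WAN delay, server delay,
    end-to-end response time) measured before and after a major change.
    before/after: {metric_name: [measurements]}. Returns percent change per metric."""
    changes = {}
    for metric in before:
        b = sum(before[metric]) / len(before[metric])
        a = sum(after[metric]) / len(after[metric])
        changes[metric] = 100.0 * (a - b) / b
    return changes

# Hypothetical web-server upgrade: server delay improves, WAN delay is unchanged,
# which is what one would expect if the upgrade delivered its benefit.
pre  = {"server_delay_ms": [120, 130, 125], "wan_delay_ms": [40, 42, 41]}
post = {"server_delay_ms": [60, 65, 62],   "wan_delay_ms": [41, 40, 42]}
report = compare_before_after(pre, post)
```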
Selection Criteria
The following is a set of criteria that IT organizations can use to choose a baselining solution. For simplicity, the criteria are focused on baselining applications and not other IT resources.

Application Monitoring
To what degree (complete, partial, none) can the solution identify:
• Well-known applications; e.g., e-mail, VoIP, Oracle, PeopleSoft.
• Custom applications.
• Complex applications; e.g., Microsoft Exchange, SAP R/3, Citrix Presentation Server.
• Web-based applications, including URL-by-URL tracking.
• Peer-to-peer applications.
• Unknown applications.

Application Profiling and Response Time Analysis
Can the solution:
• Provide response time metrics based on synthetic traffic generation?
• Provide response time metrics based on monitoring actual traffic?
• Relate application response time to network activity?
• Provide application baselines and trending?
Pre-Deployment Assessment
The goal of performing a pre-deployment assessment of the current environment is to identify any potential problems that might affect an IT organization’s ability to deploy an application. One of the two key questions that an organization must answer during pre-deployment assessment is: Can the network provide appropriate levels of security to protect against attacks? As part of a security assessment, it is important to review the network and the attached devices and to document the existing security functionality such as IDS (Intrusion Detection System), IPS (Intrusion Prevention System) and NAC (Network Access Control). The next step is to analyze the configuration of the network elements to determine if any of them pose a security risk. It is also necessary to test the network to see how it responds to potential security threats.

The second key question that an organization must answer during pre-deployment assessment is: Can the network provide the necessary levels of availability and performance? As previously mentioned, it is extremely difficult to answer questions like this if the IT organization does not have targets for application performance that are well understood and adhered to. It is also difficult to answer this question, because as Chapter 3 described, the typical application environment is both complex and dynamic.

Organizations should not look at the process of performing a pre-deployment network assessment in isolation. Rather, they should consider it part of an application-lifecycle management process that includes a comprehensive assessment and analysis of the existing network; the development of a thorough rollout plan including: the profiling of the application; the identification of the impact of implementing the application; and the establishment of effective processes for ongoing fact-based data management.

The Team Leader stated his organization determines whether to perform a network assessment prior to deploying a new application on a case-by-case basis. In particular, he pointed out that it tends to perform an assessment if it is a large deployment or if it has some concerns about whether the infrastructure can support the application. To assist with this function, his organization has recently acquired tools that can help it with tasks such as assessing the ability of the infrastructure to support VoIP deployment as well as evaluating the design of their MPLS network.

The Engineering CIO said that the organization is deploying VoIP. As part of that deployment, it did an assessment of the ability of the infrastructure to support VoIP. The assessment was comprised of an analysis using an Excel spreadsheet. The organization identified the network capacity at each office, the current utilization of that capacity and the added load that would come from deploying VoIP. Based on this set of information, it determined where it needed to add capacity.

The key components of a pre-deployment network assessment are:

Create an inventory of the applications running on the network
This includes discovering the applications that are running on the network. Chapter 7 will discuss this task in greater detail. In addition to identifying the applications that are running on the network, it is also important to categorize those applications using an approach similar to what Chapter 3 described. Part of the value of this activity is to identify recreational use of the network; i.e., on-line gaming and streaming radio or video. Blocking this recreational use can free up additional WAN bandwidth. Chapter 7 quantifies the extent to which corporate networks are carrying recreational traffic.
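The spreadsheet-style capacity analysis described above (capacity per office, current utilization, added VoIP load) can be sketched in a few lines. The per-call bandwidth figure and the branch data below are assumptions for illustration only:

```python
def offices_needing_upgrade(offices, call_bandwidth_kbps=80.0):
    """offices: list of (name, capacity_kbps, current_load_kbps, planned_calls).
    Flags any office where current load plus the added VoIP load would exceed
    capacity. 80 kbps per call is an assumed figure (roughly a G.711 stream
    with packet overhead); the right number depends on codec and framing."""
    flagged = []
    for name, capacity, used, calls in offices:
        projected = used + calls * call_bandwidth_kbps
        if projected > capacity:
            flagged.append((name, projected - capacity))  # (office, kbps shortfall)
    return flagged

# Hypothetical branch inventory: (name, capacity, measured load, concurrent calls).
branches = [("Boston", 1544, 900, 10), ("Austin", 1544, 400, 6), ("Tulsa", 512, 300, 4)]
upgrades = offices_needing_upgrade(branches)
```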
Another part of the value of this activity is to identify business activities, such as downloads of server patches or security patches to desktops, that are being performed during peak times. Moving these activities to an off-peak time frees up additional bandwidth.

Evaluate bandwidth to ensure available capacity for new applications
This activity involves baselining the network as previously described. The goal is to use the information about how the utilization of the relevant network resources has been trending to identify if any parts of the network need to be upgraded to support the new application.

As previously described, baselining typically refers to measuring the utilization of key IT resources. The recommendation was made that companies should modify how they think about baselining to focus not on utilization, but on delay. In some instances, however, IT organizations need to measure more than just delay. If a company is about to deploy VoIP, for example, then the pre-assessment baseline must also measure the current levels of jitter and packet loss, as VoIP quality is highly sensitive to those parameters.

Create response time baselines for essential applications
This activity involves measuring the average and peak application response times for key applications both before and after the new application is deployed. This data will allow IT organizations to determine if deploying the new application causes an unacceptable impact on the company’s other key applications.

As part of performing a pre-deployment network assessment, IT organizations can typically rely on having access to management data from SNMP MIBs (Simple Network Management Protocol Management Information Bases) on network devices, such as switches and routers. This data source provides data link layer visibility across the entire enterprise network and captures parameters, such as the number of packets sent and received, the number of packets that are discarded, as well as the overall link utilization.

NetFlow is a Cisco IOS software feature and also the name of a Cisco protocol for collecting IP traffic information. Within NetFlow, a network flow is defined as a unidirectional sequence of packets between a given source and destination. The branch office router outputs a flow record after it determines that the flow is finished. This record contains information, such as timestamps for the flow start and finish time, the volume of traffic in the flow, and its source and destination IP addresses and source and destination port numbers.

NetFlow represents a more advanced source of management data than SNMP MIBs. For example, whereas data from standard SNMP MIB monitoring can be used to quantify overall link utilization, this class of management data can be used to identify which network users or applications are consuming the bandwidth.

The IETF is in the final stages of approving a standard (RFC 3917) for logging IP packets as they flow through a router, switch or other networking device and reporting that information to network management and accounting systems. This new standard, which is referred to as IPFIX (IP Flow Information EXport), is based on NetFlow Version 9.

An important consideration for IT organizations is whether they should deploy vendor-specific, packet inspection-based dedicated instrumentation. The advantage of deploying dedicated instrumentation is that it enables a more detailed view into application performance. The disadvantage of this approach is that it increases the cost of the solution. A compromise is to rely on data from SNMP MIBs and NetFlow in small sites and to augment this with dedicated instrumentation in larger, more strategic sites.

Another consideration is whether or not IT organizations should deploy software agents on end systems. One of the architectural advantages of this approach is that it monitors performance and events closer to the user’s actual experience. A potential disadvantage of this approach is that there can be organizational barriers that limit the ability of the IT organization to put software on each end system. In addition, for an agent-based approach to be successful, it must not introduce any appreciable management overhead.
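As a sketch of the kind of analysis that NetFlow-style records enable, the following aggregates flow records by destination port (a rough stand-in for "application") to rank bandwidth consumers, exactly the question that SNMP link-utilization counters alone cannot answer. The record layout and all addresses and byte counts are invented for illustration:

```python
from collections import Counter

def top_consumers(flow_records, n=3):
    """flow_records: iterable of (src_ip, dst_ip, dst_port, byte_count) tuples,
    a simplified stand-in for the fields carried in exported flow records.
    Aggregates traffic volume per destination port and returns the top n."""
    by_app = Counter()
    for src, dst, dport, nbytes in flow_records:
        by_app[dport] += nbytes
    return by_app.most_common(n)

# Hypothetical flow records: some web traffic, some email, one large backup job.
flows = [("10.0.0.5", "10.1.0.9", 80, 400_000),
         ("10.0.0.7", "10.1.0.9", 80, 250_000),
         ("10.0.0.5", "10.1.0.3", 25, 120_000),
         ("10.0.0.9", "10.1.0.4", 873, 5_000_000)]
ranking = top_consumers(flows)
```

The same grouping could be done per source address to identify heavy users rather than heavy applications.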
Whereas gaining access to management data is relatively easy, collecting and analyzing details on every application in the network is challenging. It is difficult, for example, to identify every IP application, host and conversation on the network as well as applications that use protocols such as IPX or DECnet. It is also difficult to quantify application response time and to identify the individual sources of delay; i.e., network, application server, database. One of the most challenging components of this activity is to unify this information so the organization can leverage it to support myriad activities associated with managing application delivery.

Network and Application Optimization

Introduction
The phrase network and application optimization refers to an extensive set of techniques that organizations have deployed in an attempt to optimize the performance of networks and applications as part of assuring acceptable application performance. The primary role that these techniques play is to:
• Reduce the amount of data that is sent over the WAN.
• Ensure that the WAN link is never idle if there is data to send.
• Reduce the number of round trips (a.k.a., transport layer or application turns) that are necessary for a given transaction.
• Mitigate the inefficiencies of older protocols.
• Offload computationally intensive tasks from client systems and servers.

There are two principal categories of network and application optimization products. One category focuses on the negative effect of the WAN on application performance. This category is often referred to as a WAN optimization controller (WOC) but will also be referred to in this handbook as Branch Office Optimization Solutions. Branch Office Optimization Solutions are often referred to as symmetric solutions because they typically require an appliance in both the data center as well as the branch office. Some vendors, however, have implemented solutions that call for an appliance in the data center, but do not require an appliance in the branch office. This class of solution is often referred to as a software only solution.

The trade-off between a traditional symmetric solution based on an appliance and a software only solution is straightforward. Because the traditional symmetric solution involves an appliance in each branch office, it has the dedicated hardware that allows it to service a large user base. However, because of the requirement to have an appliance in each branch office, a traditional symmetric solution also tends to be more expensive. As a result, the software only solution is most appropriate for individual users or small offices. Note that while a software only solution cannot typically match the performance of a symmetric solution, that does not mean that a software only solution is less functional than a symmetric solution. IT organizations that are looking for a software only solution should expect that the solution will provide a rich set of functionality; i.e., Layer 3 and 4 visibility and shaping,
Layer 7 visibility and shaping, packet marking based on DSCP (DiffServ code point), as well as sophisticated analysis and reporting.

The typical software only solution is comprised of:
• Agents that sit on each PC and which serve to monitor and shape WAN application and user traffic in accordance with assigned policy.
• A PC or server that has two functions. One function is to serve as a collector of network statistics. The other function is to store policies that are accessed by the agents.
• A management console that is used for monitoring, policy development and management.

The second category of product that will be discussed in this chapter is often referred to as an Application Front End (AFE) or Application Device Controller (ADC). This solution is typically referred to as being an asymmetric solution because an appliance is only required in the data center and not the branch office. The genesis of this category of solution dates back to the IBM mainframe-computing model of the late 1960s and early 1970s. Part of that computing model was to have a Front End Processor (FEP) reside in front of the IBM mainframe. The primary role of the FEP was to free up processing power on the general purpose mainframe computer by performing communications processing tasks, such as terminating the 9600 baud multi-point private lines, in a device that was designed just for these tasks.

The role of the AFE is somewhat similar to that of the FEP in that the AFE performs computationally intensive tasks, such as the processing of SSL (Secure Sockets Layer) traffic, and hence frees up server resources. However, another role of the AFE is to function as a Server Load Balancer (SLB) and, as the name implies, balance traffic over multiple servers. While performing these functions accelerates the performance of Web-based applications, AFEs often do not accelerate the performance of standard Windows based applications.

Companies deploy Branch Office Optimization Solutions and AFEs in different ways. The typical company, for example, has many more branch offices than data centers. Hence, the question of whether to deploy a solution in a limited tactical manner vs. a broader strategic manner applies more to Branch Office Optimization Solutions than it does to AFEs. Also, AFEs are based on open standards and as a result a company can deploy AFEs from different vendors and not be concerned about interoperability. In contrast, Branch Office Optimization Solutions are based on proprietary technologies and so a company would tend to choose a single vendor from which to acquire these solutions.

Alice in Wonderland Revisited
Chapter 4 began with a reference to Alice in Wonderland and discussed the need for IT organizations to set a direction for things such as application performance. That same reference to Alice in Wonderland applies to the network and application optimization component of application delivery. In particular, no network and application optimization solution on the market solves all possible application performance issues.

To deploy the appropriate network and application optimization solution, IT organizations need to understand the problem they are trying to solve.

Chapter 3 of this handbook described some of the characteristics of a generic application environment and pointed out that to choose an appropriate solution, IT organizations need to understand their unique application environment. In the context of network and application optimization, if the company either already has or plans to consolidate servers out of branch offices and into centralized data centers, then as described later in this section, a WAFS (Wide Area File Services) solution might be appropriate. If the company is implementing VoIP, then any Branch Office Optimization Solution that it implements must be able to support traffic that is both real-time and meshed, and have strong QoS functionality. Analogously, if the company is making heavy use of SSL, it might make sense to implement an AFE to relieve the servers of the burden of processing the SSL traffic.
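One QoS-related technique mentioned above, packet marking based on DSCP, can be sketched with a standard socket option. DSCP value 46 (Expedited Forwarding) is the marking commonly used for VoIP media; note that the TOS byte carries the 6-bit DSCP in its upper bits, and whether the operating system honors IP_TOS set from an application varies by platform (typical on Linux, generally ignored on Windows):

```python
import socket

# DSCP 46 (Expedited Forwarding) is the conventional marking for VoIP media.
DSCP_EF = 46

def marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets carry the given DSCP marking.
    The DSCP occupies the upper 6 bits of the TOS byte, hence the shift by 2."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
    return sock

s = marked_udp_socket()
```

In practice the marking is usually applied or re-written by routers at the network edge rather than trusted from end systems, but the bit layout is the same either way.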
In addition to high-level factors of the type the preceding paragraph mentioned, the company’s actual traffic patterns also have a significant impact on how much value a network and application optimization solution will provide. To exemplify this, consider the types of advanced compression most solution providers offer. The effectiveness of advanced compression depends on two factors. One factor is the quality of the compression techniques that have been implemented in a solution. Since many compression techniques use the same fundamental and widely known mathematical and algorithmic foundations, the performance of many of the solutions available in the market will tend to be somewhat similar.

The second factor that influences the effectiveness of advanced compression solutions is the amount of redundancy of the traffic. Applications that transfer data with a lot of redundancy, such as text and HTML on web pages, will benefit significantly from advanced compression. Applications that transfer data that has already been compressed, such as the voice streams in VoIP or jpg-formatted images, will see little improvement in performance from implementing advanced compression and could possibly see performance degradation.

Because a network and application optimization solution will provide varying degrees of benefit to a company based on the unique characteristics of its environment, third party tests of these solutions are helpful, but not conclusive. In order to understand the performance gains of any network and application optimization solution, that solution must be tested in an environment that closely reflects the environment in which it will be deployed.

Branch Office Optimization Solutions

Background
The goal of Branch Office Optimization Solutions is to improve the performance of applications delivered from the data center to the branch office or directly to the end user. Myriad techniques comprise branch office optimization solutions. Table 5.1 lists some of these techniques and indicates how organizations can use each of these techniques to overcome some characteristic of the WAN that impairs application performance.

• Insufficient Bandwidth: Data Reduction (Data Compression; Differencing, a.k.a. de-duplication; Caching)
• High Latency: Protocol Acceleration (TCP, HTTP, CIFS, NFS, MAPI) and Round-trip Time Mitigation (Request Prediction, Response Spoofing)
• Packet Loss: Congestion Control; Forward Error Correction (FEC)
• Network Contention: Quality of Service (QoS)
Table 5.1: Techniques to Improve Application Performance

Below is a brief description of some of the principal WAN optimization techniques.

Caching
This refers to keeping a local copy of information with the goal of either avoiding or minimizing the number of times that information must be accessed from a remote site. As described below, there are multiple forms of caching.

Byte Caching
With byte caching the sender and the receiver maintain large disk-based caches of byte strings previously sent and received over the WAN link. As data is queued for the WAN, it is scanned for byte
strings already in the cache. Any strings that result in cache hits are replaced with a short token that refers to its cache location, allowing the receiver to reconstruct the file from its copy of the cache. With byte caching, the data dictionary can span numerous TCP applications and information flows rather than being constrained to a single file or single application type.

Object Caching
Object caching stores copies of remote application objects in a local cache server, which is generally on the same LAN as the requesting system. With object caching, the cache server acts as a proxy for a remote application server. For example, in Web object caching, the client browsers are configured to connect to the proxy server rather than directly to the remote server. When the request for a remote object is made, the local cache is queried first. If the cache contains a current version of the object, the request can be satisfied locally at LAN speed and with minimal latency. Most of the latency involved in a cache hit results from the cache querying the remote source server to ensure that the cached object is up to date.

If the local proxy does not contain a current version of the remote object, it must be fetched, cached, and then forwarded to the requester. Loading the remote object into the cache can potentially be facilitated by either data compression or byte caching.

Compression
The role of compression is to reduce the size of a file prior to transmitting that file over a WAN. As described below, there are various forms of compression.

Static Data Compression
Static data compression algorithms find redundancy in a data stream and use encoding techniques to remove the redundancy, creating a smaller file. A number of familiar lossless compression tools for binary data are based on Lempel-Ziv (LZ) compression. This includes the zip, PKZIP and gzip algorithms.

LZ develops a codebook or dictionary as it processes the data stream and builds short codes corresponding to sequences of data. Repeated occurrences of the sequences of data are then replaced with the codes. The LZ codebook is optimized for each specific data stream and the decoding program extracts the codebook directly from the compressed data stream. LZ compression can often reduce text files by as much as 60-70%. However, for data with many possible data values LZ may prove to be quite ineffective because repeated sequences are fairly uncommon.

Differential Compression; a.k.a., Differencing or De-duplication
Differencing algorithms are used to update files by sending only the changes that need to be made to convert an older version of the file to the current version. Differencing algorithms partition a file into two classes of variable length byte strings: those strings that appear in both the new and old versions and those that are unique to the new version being encoded. The latter strings comprise a delta file, which is the minimum set of changes that the receiver needs in order to build the updated version of the file.

While differential compression is constrained to those cases where the receiver has stored an earlier version of the file, the degree of compression is very high. As a result, differential compression can greatly reduce bandwidth requirements for functions such as software distribution, replication of distributed file systems, and file system backup and restore.
Figure 5.1: Protocol Acceleration Appliances
Real Time Dictionary Compression
The same basic LZ data compression algorithms discussed earlier can also be applied to individual blocks of data rather than entire files. Operating at the block level results in smaller dynamic dictionaries that can reside in memory rather than on disk. As a result, the processing required for compression and decompression introduces only a small amount of delay, allowing the technique to be applied to real-time, streaming data.

Congestion Control
The goal of congestion control is to ensure that the sending device does not transmit more data than the network can accommodate. To achieve this goal, the TCP congestion control mechanisms are based on a parameter referred to as the congestion window. TCP has multiple mechanisms to determine the congestion window.

Forward Error Correction (FEC)
FEC is typically used at the physical layer (Layer 1) of the OSI stack. FEC can also be applied at the network layer (Layer 3) whereby an extra packet is transmitted for every n packets sent. This extra packet is used to recover from an error and hence avoid having to retransmit packets.
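The network-layer FEC scheme just described (one extra packet for every n packets) can be sketched with a simple XOR parity packet, which is enough to rebuild any single lost packet in the group. Deployed solutions use more sophisticated codes, and this sketch assumes equal-length packets for simplicity:

```python
def xor_parity(packets):
    """Compute a parity packet as the byte-wise XOR of n equal-length packets."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)

def recover_lost(received, parity):
    """Rebuild the single missing packet (marked None) by XORing the parity
    packet with all of the packets that did arrive."""
    missing = bytearray(parity)
    for pkt in received:
        if pkt is not None:
            for i, b in enumerate(pkt):
                missing[i] ^= b
    return bytes(missing)

sent = [b"pkt1", b"pkt2", b"pkt3"]
parity = xor_parity(sent)          # the "extra packet" transmitted with the group
received = [b"pkt1", None, b"pkt3"]  # one packet lost in transit
restored = recover_lost(received, parity)
```

The cost of the technique is visible here too: one extra packet per group of n, i.e., a fixed bandwidth overhead traded against avoided retransmissions.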
A subsequent section of the handbook will discuss some of the technical challenges associated with data replication and will describe how FEC mitigates some of those challenges.

Protocol Acceleration
Protocol acceleration refers to a class of techniques that improves application performance by circumventing the shortcomings of various communication protocols. Protocol acceleration is typically based on per-session packet processing by appliances at each end of the WAN link, as shown in Figure 5.1. The appliances at each end of the link act as a local proxy for the remote system by providing local termination of the session. Therefore, the end systems communicate with the appliances using the native protocol, and the sessions are relayed between the appliances across the WAN using the accelerated version of the protocol or using a special protocol designed to address the WAN performance issues of the native protocol. As described below, there are many forms of protocol acceleration.

TCP Acceleration
TCP can be accelerated between appliances with a variety of techniques that increase a session’s ability
to more fully utilize link bandwidth. Some of the available techniques are dynamic scaling of the window size, packet aggregation, selective acknowledgement, and TCP Fast Start. Increasing the window size for large transfers allows more packets to be simultaneously in transit, boosting bandwidth utilization. With packet aggregation, a number of smaller packets are aggregated into a single larger packet, reducing the overhead associated with numerous small packets.

TCP selective acknowledgment (SACK) improves performance in the event that multiple packets are lost from one TCP window of data. With SACK, the receiver tells the sender which packets in the window were received, allowing the sender to retransmit only the missing data segments instead of all segments sent since the first lost packet. TCP slow start and congestion avoidance lower the data throughput drastically when loss is detected. TCP Fast Start remedies this by accelerating the growth of the TCP window size to quickly take advantage of link bandwidth.
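The benefit of window scaling follows directly from bandwidth-delay arithmetic: a session can never move more than one window of data per round trip. A quick illustration (the window sizes and RTT are examples only):

```python
# Without window scaling, a TCP session's throughput is capped at
# window_size / RTT, regardless of how fast the underlying link is.
def window_limited_throughput(window_bytes, rtt_s):
    """Upper bound on single-session throughput, in bytes/second."""
    return window_bytes / rtt_s

# Classic unscaled 64 KB window on a 100 ms WAN round trip:
print(window_limited_throughput(65535, 0.1))      # ~655 Kbytes/second

# Scaling the window by 8x raises the ceiling proportionally:
print(window_limited_throughput(8 * 65535, 0.1))  # ~5.2 Mbytes/second
```

The numbers show why a high-capacity WAN link can sit mostly idle under a single unscaled TCP session with significant latency.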
HTTP Acceleration

Web pages are often composed of many separate objects, each of which must be requested and retrieved sequentially. Typically a browser will wait for a requested object to be returned before requesting the next one. This results in the familiar ping-pong behavior that amplifies the effects of latency. HTTP can be accelerated by appliances that use pipelining to overlap fetches of Web objects rather than fetching them sequentially. In addition, the appliance can use object caching to maintain local storage of frequently accessed web objects. Web accesses can be further accelerated if the appliance continually updates objects in the cache instead of waiting for the object to be requested by a local browser before checking for updates.

Microsoft Exchange Acceleration

Most of the storage and bandwidth requirements of email programs, such as Microsoft Exchange, are due to the attachment of large files to mail messages. Downloading email attachments from remote Microsoft Exchange servers is slow and wasteful of WAN bandwidth because the same attachment may be downloaded by a large number of email clients on the same remote site LAN. Microsoft Exchange acceleration can be accomplished with a local appliance that caches email attachments as they are downloaded. This means that all subsequent downloads of the same attachment can be satisfied from the local application server. If an attachment is edited locally and then returned via the remote mail server, the appliances can use differential file compression to conserve WAN bandwidth.

CIFS and NFS Acceleration

As mentioned earlier, CIFS and NFS use numerous Remote Procedure Calls (RPCs) for each file sharing operation. NFS and CIFS suffer from poor performance over the WAN because each small data block must be acknowledged before the next one is sent. This results in an inefficient ping-pong effect that amplifies the effect of WAN latency. CIFS and NFS file access can be greatly accelerated by using a WAFS transport protocol between the acceleration appliances. With the WAFS protocol, when a remote file is accessed, the entire file can be moved or pre-fetched from the remote server to the local appliance's cache. This technique eliminates numerous round trips over the WAN. As a result, it can appear to the user that the file server is local rather than remote. If a file is being updated, CIFS and NFS acceleration can use differential compression and block level compression to further increase WAN efficiency.

Request Prediction

By understanding the semantics of specific protocols or applications, it is often possible to anticipate a request a user will make in the near future. Making this request in advance of it being needed eliminates virtually all of the delay when the user actually makes the request.
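One simple form of request prediction is sequential read-ahead, sketched below. The block numbering, cache, and fetch function are hypothetical stand-ins, not any vendor's implementation:

```python
# Sketch of sequential read-ahead, one simple form of request prediction.
READ_AHEAD = 4  # blocks to prefetch beyond the one requested

def fetch_from_server(block_id):
    """Stand-in for an expensive request across the WAN."""
    return f"block-{block_id}".encode()

cache = {}

def read_block(block_id):
    # Serve locally if a previous prediction already fetched this block.
    if block_id in cache:
        return cache.pop(block_id)
    data = fetch_from_server(block_id)
    # Predict that a sequential reader will want the next few blocks
    # and fetch them before they are requested.
    for nxt in range(block_id + 1, block_id + 1 + READ_AHEAD):
        cache.setdefault(nxt, fetch_from_server(nxt))
    return data

assert read_block(0) == b"block-0"
assert 1 in cache                   # block 1 was prefetched
assert read_block(1) == b"block-1"  # served from the local cache
```

The first read still pays the full WAN round trip; the predicted reads that follow are answered locally, which is where the latency saving comes from.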
Many applications or application protocols have a wide range of request types that reflect different user actions or use cases. It is important to understand what a vendor means when it says it has a certain application level optimization. For example, in the CIFS (Windows file sharing) protocol, the simplest interactions that can be optimized involve drag and drop. But many other interactions are more complex. Not all vendors support the entire range of CIFS optimizations.

Request Spoofing

This refers to situations in which a client makes a request of a distant server, but the request is responded to locally.

Tactical vs. Strategic Solutions

To put the question of tactical vs. strategic in context, refer again to the IT organization that Chapter 2 of this handbook referenced. For that company to identify the problem that it is trying to solve, it must answer questions such as: Is the problem just the performance of this one application as used just by employees in the Pac Rim? If that is the problem statement, then the company is looking for a very tactical solution. However, the company might decide that the problem that it wants to solve is how it can guarantee the performance of all of its critical applications for all of its employees under as wide a range of circumstances as possible. In this case, the company needs a strategic solution.

IT organizations often start with a tactical deployment of WOCs and expand this deployment over time.

Historically, Branch Office Optimization Solutions have been implemented in a tactical fashion. That means that companies have deployed the least amount of equipment possible to solve a specific problem. Kubernan recently asked several hundred IT professionals about the tactical vs. strategic nature of how they use these techniques. Their answers, which Figure 5.2 shows, indicate the deployment of these techniques is becoming a little more strategic.

[Figure 5.2: Approach to Network and Application Optimization. The survey responses (51%, 32%, and 17%) are distributed across three categories: "Rich Set of Techniques, Proactively Deployed," "Some Techniques, Proactively Deployed," and "Driven by Particular Initiatives."]

The Electronics COO's comment that his company's initial deployment of network and application optimization techniques was to solve a particular problem supports that position. He also stated that his company is "absolutely becoming more proactive moving forward with deploying these techniques."

Similarly, The Motion Picture Architect commented that his organization has been looking at these technologies for a number of years, but has only deployed products to solve some specific problems, such as moving extremely large files over long distances. He noted that his organization now wants to deploy products proactively to solve a broader range of issues relative to application performance. According to The Motion Picture Architect, "Even a well written application does not run well over long distances. In order to run well, the application needs to be very thin and it is very difficult to write a full featured application that is very thin."

Current Deployments

Table 5.2 depicts the extent of the deployment of branch office optimization solutions.

No plans to deploy | 45%
Have not deployed, but plan to deploy | 24%
Deployed in test mode | 9%
Limited production deployment | 17%
Broadly deployed | 5%

Table 5.2: Deployment of Branch Office Optimization Solutions

One conclusion that can be drawn from the data in Table 5.2 is: The deployment of WAN Optimization Controllers will increase significantly.

The Engineering CIO stated that his organization originally deployed a WAFS solution to alleviate redundant file copy. He said he has been pleasantly surprised by the additional benefits of using the solution. In addition, his organization plans on doing more backup of files over the network and he expects the WAFS solution they have already deployed will assist with this.

The points The Engineering CIO raised go back to the previous discussion of a tactical vs. a strategic solution. In particular, most IT organizations that deploy a network and application optimization solution do so tactically and later expand the use of that solution to be more strategic. When choosing a network and application optimization solution it is important to ensure that the solution can scale to provide additional functionality over what is initially required.

Selection Criteria

The recommended criteria for evaluating WAN Optimization Controllers are listed in Table 5.3. This list is intended as a fairly complete compilation of all possible criteria, so a given organization may apply only a subset of these criteria for a given purchase decision. In addition, individual organizations are expected to ascribe different weights to each of the criteria because of differences in WAN architecture, branch office network design, and application mix. As shown in the table, assigning weights to the criteria and relative scores for each solution provides a simple methodology for comparing competing solutions.

There are many techniques that IT organizations can use to complete Table 5.3 and then use its contents to compare solutions. For example, the weights can range from 10 points to 50 points, with 10 points meaning not important, 30 points meaning average importance, and 50 points meaning critically important. The score for each criterion can range from 1 to 5, with a 1 meaning fails to meet minimum needs, 3 meaning acceptable, and 5 meaning significantly exceeds requirements.

For the sake of example, consider solution A. For this solution, the weighted score for each criterion (WiAi) is found by multiplying the weight (Wi) of each criterion by the score of each criterion (Ai). The weighted scores for each criterion are then summed (Σ WiAi) to get the total score for the solution. This process can then be repeated for additional solutions and the total scores of the solutions can be compared.

Criterion | Weight Wi | Score for Solution "A" Ai | Score for Solution "B" Bi
Performance | | |
Transparency | | |
Solution Architecture | | |
OSI Layer | | |
Capability to Perform Application Monitoring | | |
Scalability | | |
Cost-Effectiveness | | |
Application Sub-classification | | |
Module vs. Application Optimization | | |
Disk vs. RAM-based Compression | | |
Protocol Support | | |
Security | | |
Ease of Deployment and Management | | |
Change Management | | |
Support for Meshed Traffic | | |
Support for Real Time Traffic | | |
Total Score | | Σ WiAi | Σ WiBi

Table 5.3: Criteria for WAN Optimization Solutions
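The Σ WiAi calculation can be sketched in a few lines; the criteria weights and scores below are invented purely for illustration:

```python
# Hypothetical weights and scores for two candidate WOC solutions;
# criteria names follow Table 5.3, the numbers are made up.
weights  = {"Performance": 50, "Transparency": 30, "Scalability": 30, "Cost-Effectiveness": 10}
scores_a = {"Performance": 4,  "Transparency": 3,  "Scalability": 5,  "Cost-Effectiveness": 2}
scores_b = {"Performance": 3,  "Transparency": 5,  "Scalability": 3,  "Cost-Effectiveness": 4}

def total_score(weights, scores):
    # Sum of Wi * Ai over all criteria.
    return sum(weights[c] * scores[c] for c in weights)

print(total_score(weights, scores_a))  # 460
print(total_score(weights, scores_b))  # 430
```

With these made-up numbers, solution A edges out solution B; changing the weights to reflect a different WAN architecture or application mix can easily reverse the ranking, which is the point of the exercise.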
Each of the criteria is explained below.

Performance

Third party tests of a solution can be helpful. It is critical, however, to quantify the kind of performance gains that the solution will provide in the particular environment where it will be installed. For example, if the IT organization either already has, or is in the process of consolidating servers out of branch offices and into centralized data centers, then it needs to test how well the WAN optimization solution supports CIFS. As part of this quantification, it is important to identify if the performance degrades as either additional functionality within the solution is activated, or if the solution is deployed more broadly across the organization.

Transparency

The first rule of networking is to not implement anything that causes the network to break. As a result, an important criterion when choosing a WOC is that it should be possible to deploy the solution without breaking things such as routing, security, or QoS. The solution should also be transparent relative to both the existing server configurations and the existing Authentication, Authorization and Accounting (AAA) systems, and should not make troubleshooting any more difficult. The transparency of these solutions has been a subject of much discussion recently. Because of that, the next section will elaborate on what to look for relative to transparency.

Solution Architecture

If the organization intends the solution to be able to support additional optimization functionality over time, it is important to determine if the hardware and software architecture can support new functionality without an unacceptable loss of performance.

OSI Layer

Organizations can apply many of the optimization techniques discussed in this handbook at various layers of the OSI model. They can apply compression, for example, at the packet layer. The advantage of applying compression at this layer is that it supports all transport protocols and all applications. The disadvantage is that it cannot directly address any issues that occur higher in the stack. Alternatively, having an understanding of the semantics of the application means that compression can also be applied to the application; e.g., SAP or Oracle. Applying compression, or other techniques such as request prediction, in this manner has the potential to be more effective, but is by definition application specific.

Capability to Perform Application Monitoring

Many network performance tools rely on network-based traffic statistics gathered from network infrastructure elements at specific points in the network to perform their reporting. By design, all WAN optimization devices apply various optimization techniques on the application packets and hence affect these network-based traffic statistics to varying degrees. One of the important factors that determine the degree of these effects is the amount of the original TCP/IP header information retained in the optimized packets. This topic will be expanded in the subsequent section on transparency.

Scalability

One aspect of scalability is the size of the WAN link that can be terminated on the appliance. More important is how much throughput the box can actually support with the relevant and desired optimization functionality turned on. Other aspects of scalability include how many simultaneous TCP connections the appliance can support, as well as how many branches or users a vendor's complete solution can support.
Downward scalability is also important. Downward scalability refers to the ability of the vendor to offer cost-effective products for small branches or even individual laptops.

Cost Effectiveness

This criterion is related to scalability. In particular, it is important to understand what the initial solution costs. It is also important to understand how the cost of the solution changes as the scope and scale of the deployment increases.

Application Sub-classification

An application such as Citrix Presentation Server or SAP is comprised of multiple modules with varying characteristics. Some Branch Office Optimization solutions can classify at the individual module level, while others can only classify at the application level.

Module vs. Application Optimization

In line with the previous criterion, some Branch Office Optimization Solutions treat each module of an application in the same fashion. Other solutions treat modules based both on the criticality and characteristics of that module. For example, some solutions apply the same optimization techniques to all of SAP, while other solutions would apply different techniques to the individual SAP modules based on factors such as their business importance and latency sensitivity.

Disk vs. RAM

Advanced compression solutions can be either disk or RAM-based. Disk-based systems typically can store as much as 1,000 times the volume of patterns in their dictionaries as compared with RAM-based systems, and those dictionaries can persist across power failures. The data, however, is slower to access than it would be with the typical RAM-based implementations, although the performance gains of a disk-based system are likely to more than compensate for this extra delay. While disks are more cost-effective than a RAM-based solution on a per byte basis, given the size of these systems they do add to the overall cost and introduce additional points of failure to a solution. Standard techniques such as RAID can mitigate the risk associated with these points of failure.

Protocol Support

Some solutions are specifically designed to support a given protocol (e.g., UDP, TCP, HTTP, Microsoft Print Services, CIFS, MAPI) while other solutions support that protocol generically. In either case, the critical issue is how much of an improvement in the performance of that protocol the solution can cause in the type of environment in which the solution will be deployed. It is also important to understand if the solution makes any modifications to the protocol that could cause unwanted side effects.

Security

The solution must be compatible with the current security environment. It must not, for instance, break firewall Access Control Lists (ACLs) by hiding TCP header information. In addition, the solution itself must not create any additional security vulnerabilities.

Ease of Deployment and Management

As part of deploying a WAN optimization solution, an appliance needs to be deployed in branch offices that will most likely not have any IT staff. As such, it is important that unskilled personnel can install the solution. In addition, the greater the number of appliances deployed, the more important it is that they are easy to configure and manage. It's also important to consider what other systems will have to be modified in order to implement the WAN optimization solution. Some solutions, especially cache-based or WAFS solutions, require that every file server be accessed during implementation.

Change Management

Since most networks experience periodic changes such as the addition of new sites or new applications, it is important that the WAN optimization solution can adapt to these changes easily. It is preferable that the WAN optimization solution be able to adjust to these changes automatically.

Support of Meshed Traffic

A number of factors are causing a shift in the flow of WAN traffic away from a simple hub-and-spoke pattern to more of a meshed flow. If a company is making this transition, it is important that the WAN optimization solution that they deploy can support meshed traffic flows and can support a range of features such as asymmetric routing.

Support for Real Time Traffic

Many companies have deployed real-time applications. For these companies it is important that the WAN optimization solution can support real time traffic. Traffic such as VoIP and live video typically can't be accelerated because it is real time and already highly compressed. Header compression might be helpful for VoIP traffic, and most real time traffic will benefit from QoS.

The Data Replication Bottleneck

While packet loss and out of order packets are a nuisance for a network that supports typical data applications13 like file transfer and email, it is a very serious problem when performing data replication and backup across the WAN. The former involves thousands of short-lived sessions made up of a small number of packets typically sent over low bandwidth connections. The latter involves continuous sessions with many packets sent over high capacity WAN links. Data applications can typically recover from lost or out of order packets by retransmitting the lost data. Performance might suffer, but the results are not catastrophic. Data replication applications, however, do not have the same luxury. If packets are lost, throughput can be decreased so significantly that the replication process cannot be completed in a reasonable timeframe, if at all.

Key WAN Characteristics: Loss and Out of Order Packets

As noted in Chapter 3, many IT organizations are moving away from a hub and spoke network and are adopting WAN services such as MPLS and IP VPNs. While there are significant advantages to MPLS and IP VPN services, there are drawbacks, one of which being high levels of packet loss and out of order packets. This is due to routers being oversubscribed in a shared network, resulting in dropped or delayed packet delivery.

The Manager stated that the packet loss on a good MPLS network typically ranges from 0.05% to 0.1%, but that it can reach 5% on some MPLS networks. He added that he sees packet loss of 0.5% on the typical IPSec VPN. The Consultant said that packet loss on an MPLS network is usually low, but that in 10% of the situations that he has been involved in, packet loss reached upwards of 1% on a continuous basis.

Both The Manager and The Consultant agreed that out of order packets are a major issue for data replication, particularly on MPLS networks. In particular, if too many packets (i.e., typically more than 3) are received out of order, TCP or other higher level protocols will cause a re-transmission of packets. The Consultant stated that since out of order packets cause re-transmissions, it has the same effect on

13 The phrase typical data application refers to applications that involve inquiries and responses where moderate amounts of information are transferred for brief periods of time. Examples include file transfer, email, web and VoIP. This is in contrast to a data replication application that transfers large amounts of information for a continuous period of time.
goodput14 as does packet loss. He added that he often sees high levels of out of order packets in part because some service providers have implemented queuing algorithms that give priority to small packets and hence cause packets to be received out of order.

The Impact of Loss and Out of Order Packets

The effect of packet loss on TCP has been widely analyzed15. Mathis, et al., provide a simple formula that provides insight into the maximum TCP throughput on a single session when there is packet loss. That formula is:

Maximum Throughput ≈ MSS / (RTT × √p)

where:
MSS: maximum segment size
RTT: round trip time
p: packet loss rate

The preceding equation shows that throughput decreases as either RTT or p increases. To exemplify the impact of packet loss, assume that MSS is 1,420 bytes, RTT is 100 ms, and p is 0.01%. Based on the formula, the maximum throughput is 1,420 Kbytes/second. If however, the loss were to increase to 0.1%, the maximum throughput drops to 449 Kbytes/second. Figure 5.3 depicts the impact that packet loss has on the throughput of a single TCP stream with a maximum segment size of 1,420 bytes and varying values of RTT.

[Figure 5.3: Impact of Packet Loss on Throughput. The figure plots maximum throughput (Mbps, from 0.0 to 40.0) against packet loss probability (from 0.001% to 10.000%) for latencies of 10 ms, 50 ms, and 100 ms.]

One conclusion that can be drawn from Figure 5.3 is: Small amounts of packet loss can significantly reduce the maximum throughput of a single TCP session. More specifically: With a 1% packet loss and a round trip time of 50 ms or greater, the maximum throughput is roughly 3 megabits per second no matter how large the WAN link is.

The Consultant stated that he thought Figure 5.3 overstated the TCP throughput and that in his experience if you have a WAN link with an RTT of 100 ms and packet loss of 1% "the throughput would be nil".

Techniques for Coping with Loss and Out of Order Packets

The data in Figure 5.3 shows that while packet loss affects throughput for any TCP stream, it particularly

14 Goodput refers to the amount of data that is successfully transmitted. For example, if a thousand bit packet is transmitted ten times in a second before it is successfully received, the throughput is 10,000 bits/second and the goodput is 1,000 bits/second.

15 The macroscopic behavior of the TCP congestion avoidance algorithm by Mathis, Semke, Mahdavi & Ott in Computer Communication Review, 27(3), July 1997
affects throughput for high-speed streams, such as those associated with multi-media and data replication. As a result, numerous techniques, such as Forward Error Correction (FEC)16, have been developed to mitigate the impact of packet loss.

FEC has long been used at the physical level to ensure error free transmission with a minimum of re-transmissions. Recently many enterprises have begun to use FEC at the network layer to improve the performance of applications such as data replication. The basic premise of FEC is that an additional error recovery packet is transmitted for every 'n' packets that are sent. The additional packet enables the network equipment at the receiving end to reconstitute one of the 'n' lost packets and hence negates the actual packet loss. The ability of the equipment at the receiving end to reconstitute the lost packets depends on how many packets were lost and how many extra packets were transmitted. In the case in which one extra packet is carried for every ten normal packets (1:10 FEC), a 1% packet loss can be reduced to less than 0.09%. If one extra packet is carried for every five normal packets (1:5 FEC), a 1% packet loss can be reduced to less than 0.04%. To exemplify the impact of FEC, assume that the MSS is 1,420, RTT is 100 ms, and the packet loss is 0.1%. Transmitting a 10 Mbyte file without FEC would take a minimum of 22.3 seconds. Using a 1:10 FEC algorithm would reduce this to 2.1 seconds and a 1:5 FEC algorithm would reduce this to 1.4 seconds. The example demonstrates the value of FEC in a TCP environment, although the technique applies equally well to any application regardless of transport protocol. FEC, however, introduces overhead which itself can reduce throughput. What is needed is a FEC algorithm that adapts to packet loss. For example, if a WAN link is not experiencing packet loss, no extra packets should be transmitted. When loss is detected, the algorithm should begin to carry extra packets and should increase the number of extra packets as the amount of loss increases.

Transparency

Vendors and IT organizations often cite transparency as a key attribute of application accelerators. In the most general sense of the term, a completely transparent application accelerator solution is one that can be added to the existing network without requiring reconfiguration of any existing network elements or end systems and without disrupting any services that are provided within the network between the pair of accelerators.

For example, an in-line accelerator deployment such as shown in Figure 5.4 is more generally transparent than an out-of-line deployment, where "optimizable" traffic is directed to the accelerator by configuring WCCP (the Web Cache Coordination Protocol) or PBR (policy-based routing) on the WAN router.

[Figure 5.4: In-Line Deployment of Application Accelerators. At both the central site and the remote site, a switch connects to an accelerator appliance, which sits in-line in front of the WAN router; the two WAN routers connect across the intervening WAN.]

16 RFC 2354, Options for Repair of Streaming Media, http://www.rfc-archive.org/getrfc.php?rfc=2354
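Returning briefly to the loss-versus-throughput discussion, the figures quoted above (1,420 Kbytes/second, 449 Kbytes/second, and the 22.3-second transfer) can be reproduced from the Mathis et al. formula. The sketch below assumes a constant factor of 1 in the formula, which is what the text's numbers imply:

```python
from math import sqrt

def max_tcp_throughput(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: MSS / (RTT * sqrt(p)), in bytes/second."""
    return mss_bytes / (rtt_s * sqrt(loss_rate))

# MSS = 1,420 bytes, RTT = 100 ms, p = 0.01%:
print(max_tcp_throughput(1420, 0.1, 0.0001))  # ~1,420,000 bytes/s = 1,420 Kbytes/s

# Raising the loss to p = 0.1% cuts the ceiling sharply:
print(max_tcp_throughput(1420, 0.1, 0.001))   # ~449,000 bytes/s = 449 Kbytes/s

# Minimum time to move a 10 Mbyte file at p = 0.1% without FEC:
print(10e6 / max_tcp_throughput(1420, 0.1, 0.001))  # ~22.3 seconds
```

Because the loss rate enters the formula under a square root, cutting the effective loss by a factor of ten (as 1:10 FEC roughly does for a 1% loss) lifts the throughput ceiling by a little more than 3x.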
In most production networks, complete transparency is not achievable. As a result, the question comes down to which application accelerator solution has the higher degree of transparency and therefore requires fewer workarounds and reconfiguration tasks. The key point here is that the degree of transparency depends on several factors:

• Deployment topology (in-line vs. out-of-line)
• Transparency-related attributes of the accelerator solution
• Nature of the intervening network
• Types of network monitoring and security devices
• Type of applications being optimized

Assuming that the deployment topology, intervening network, security functionality and range of applications are determined by other considerations, the degree of transparency of an application accelerator solution will be determined primarily by the IP packet header content used in the traffic between the two appliances and how this interacts with the network elements, security components and configured services of the intervening network. There are three approaches for addressing IP traffic between accelerator appliances that are being employed by various vendors:

Tunneling

A tunnel encapsulates the original packets in an additional IP header using the IP addresses of the accelerators in the enveloping packet. With tunnels, an overlay network consisting of static paths between pairs of accelerators must be configured.

Transparent Addressing

Here, the two appliances spoof any intermediate network components into thinking that they are the original source and destination end systems. This is accomplished by having the accelerators adopt the source/destination/port addressing of the conversation end points for each flow. This means that the optimized packet addressing is identical to the addressing used if no optimization were being performed. Therefore, the intervening network components continue to see the network addresses of the conversation end points.

Actual Addressing

In this case, the accelerators have their own distinct network addresses, which means that the IP addresses of the originating conversation partners are replaced by the actual appliance addresses for the appliance-to-appliance hop. Therefore, the intervening network components see the actual addresses of the acceleration appliances rather than endpoint addresses.

In order to compare transparent addressing vs. actual addressing appliances based on the degree of transparency they offer, a network manager needs to consider the specific characteristics of his/her intervening network and how they would be affected by the differences in addressing. In general, if there is a lack of transparency in the intervening network, the workaround is to move the interacting functionality upstream of the accelerator rather than downstream. The following is a list of some of the intervening network elements and services that may be affected by the addressing difference:

Quality of Service (QoS) Classification

If packet classification takes place downstream of the accelerator (either in the WAN edge router or somewhere else in the intervening network) and the assigned traffic class depends at least partially on source and destination addresses, then actual addressing will require QoS reconfiguration while transparent addressing would not in most cases. Even with transparent addressing, reconfiguration would be required if the classification depends partially on
application payload data (deep packet inspection or DPI) and the optimization process for applications has altered the payload (e.g., via compression).
Security Devices If firewalls, IPS, or IDS systems are present within the intervening network, their policies based on the IP address
If transparent addressing is used, the compressed traffic
5-tuple (source and destination addresses, source and
cannot be distinguished from the uncompressed bypass
destination ports numbers, and protocol) may need to be
traffic.
This can have an impact if the IT organization
reconfigured to pass traffic between accelerators using
wants to give a different QoS priority level to traffic that is
actual addressing. If any of these security devices use
compressed and accelerated by the accelerator, compared
policies based on DPI to examine payloads for malicious
to traffic that bypasses the accelerator.
signatures, reconfiguration may be required for both actual
Traffic Monitoring If the probes monitoring WAN traffic are downstream
and transparent addressing. Asymmetric Routing
of the accelerator, then actual addressing obscures the
Asymmetric routing occurs when the packets flowing
identity of the flow’s end points because the probe sees all
between end systems A and B use one path for traffic
optimized traffic as originating and terminating at the accel-
from A to B and another path for traffic from B to A. With
erators. Therefore. visibility of the end system address
asymmetric routing there are some conditions where a
(e.g., for identification of “top talkers”) is obscured unless
transparently addressed optimized packet would arrive at
the probe is moved upstream or data is garnered from the
the destination end system instead of at the accelerator
accelerator. With a transparent addressing accelerator,
using the same address. If this problem occurs, resolution
the traffic monitoring tools will continue to show network
may require configuring WCCP on the WAN edge router.
usage based on the originating end systems.
Because there is no duplication of addresses, accelerators
Access Control Lists (ACLs) and PBR If ACLs have been deployed within intervening network devices to block certain traffic flows based on IP address information, they will continue to block the flows between accelerators regardless of the addressing scheme. This is true because the initial setup of the network session based is always based on originating system addresses would be blocked regardless of the addressing scheme for optimized traffic. For ACLs or PBR that enable or redirect traffic based on the IP addressing, some reconfiguration or workarounds will be necessary for actual addressing, but not for transparent addressing. If the PBR decision is partially based on DPI, then this may not continue to work correctly even with transparent addressing if the application optimization has altered the payload sufficiently.
using actual addressing are transparent to asymmetric routing. From this analysis it is quite clear that the degree of transparency of a WAN optimization solution depends largely on the characteristics and services of the existing network between co-operating pairs of accelerators. For example, if most network services are performed at the edge of the network and the WAN core network is comparatively “dumb”, adding accelerators of either type (actual or transparent addressing) will be likely to have network-wide transparency.
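The interaction between the addressing scheme and a downstream 5-tuple classifier can be illustrated with a small sketch. All IP addresses and the classification rule below are invented for the example:

```python
# Illustration (hypothetical addresses): how a downstream classifier keyed on
# source/destination addresses reacts to the two accelerator addressing schemes.

# A simple QoS policy: traffic between these two end systems gets priority.
PRIORITY_RULE = ("10.1.1.5", "10.2.2.7")  # (client, server) - assumed addresses

def classify(src_ip: str, dst_ip: str) -> str:
    """Downstream packet classification based on the address pair."""
    return "priority" if (src_ip, dst_ip) == PRIORITY_RULE else "best-effort"

# The original conversation between the end systems.
flow = {"src": "10.1.1.5", "dst": "10.2.2.7"}

# Transparent addressing: the accelerators preserve the endpoint addresses,
# so the existing classification rule still matches.
transparent = dict(flow)

# Actual addressing: for the appliance-to-appliance hop, the endpoint
# addresses are replaced by the accelerator addresses (hypothetical).
actual = {"src": "192.168.50.1", "dst": "192.168.50.2"}

print(classify(transparent["src"], transparent["dst"]))  # priority
print(classify(actual["src"], actual["dst"]))  # best-effort: rule needs reconfiguration
```

The same mismatch applies to any downstream device keyed on the 5-tuple, which is why actual addressing tends to require reconfiguration of QoS classes, ACLs, and PBR rules while transparent addressing usually does not.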
Application Front Ends (AFEs)

Background
As previously mentioned, a historical precedent exists for the current generation of AFEs (a.k.a. ADCs). That precedent is the Front End Processor (FEP), which was introduced in the late 1960s and was developed and deployed to support mainframe computing. From a more contemporary perspective, the current generation of AFEs evolved from the earlier generations of Server Load Balancers (SLBs) that were deployed in front of server farms.
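The server load balancing function that AFEs inherit from SLBs can be sketched as a selection policy over a server pool. A minimal illustration, assuming a weighted least-connections policy (the server names, weights, and connection counts are invented):

```python
# Minimal sketch of a server load balancing (SLB) decision: pick the server
# with the fewest active connections relative to its capacity weight.
# Server names, weights, and connection counts are illustrative assumptions.

servers = {
    "web-1": {"weight": 1.0, "active": 40},
    "web-2": {"weight": 2.0, "active": 60},  # assumed to have twice the capacity
    "web-3": {"weight": 1.0, "active": 25},
}

def pick_server(pool: dict) -> str:
    """Weighted least-connections: the lowest active/weight ratio wins."""
    return min(pool, key=lambda name: pool[name]["active"] / pool[name]["weight"])

chosen = pick_server(servers)
servers[chosen]["active"] += 1  # the load balancer forwards the request and tracks it
print(chosen)  # web-3 (ratio 25.0 beats 40.0 for web-1 and 30.0 for web-2)
```

Real SLBs and AFEs layer Layer 4 through Layer 7 intelligence (session persistence, health checks, content rules) on top of a selection policy of this kind.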
While an AFE still functions as a SLB, the AFE has assumed, and will most likely continue to assume, a wider range of more sophisticated roles that enhance server efficiency and provide asymmetrical functionality to accelerate the delivery of applications from the data center to individual remote users.

An AFE provides more sophisticated functionality than a SLB does. Among the functions that users can expect from a modern AFE are the following:

• Traditional SLB
AFEs can provide traditional load balancing across local servers or among geographically dispersed data centers based on Layer 4 through Layer 7 intelligence. SLB functionality maximizes the efficiency and availability of servers through intelligent allocation of application requests to the most appropriate server.

• SSL Offload
One of the primary new roles played by an AFE is to offload CPU-intensive tasks from data center servers. A prime example of this is SSL offload, where the AFE terminates the SSL session by assuming the role of an SSL proxy for the servers. As previously mentioned, SSL offload can provide a significant increase in the performance of secure intranet or Internet Web sites. SSL offload frees up server resources, allowing existing servers to process more requests for content and handle more transactions.

• XML Offload
Another function that can be provided by the AFE (as well as by standalone devices) is to offload XML processing from the servers by serving as an XML gateway. As was described in Chapter 3, Web services and Web 2.0 applications are XML based, and XML is a verbose protocol that is CPU-intensive. Hence, one of the roles of an XML gateway is to offload XML processing from the general-purpose servers and to perform this processing on hardware that was purpose-built for the task. Another role of an XML gateway is to provide additional security functionality to protect against the kinds of attacks that were described in Chapter 3.

• Application Firewalls
AFEs may also provide an additional layer of security for Web applications by incorporating application firewall functionality. Application firewalls are focused on blocking application-level attacks that are becoming increasingly prevalent. As described in the section of the handbook that deals with next generation firewalls, application firewalls are typically based on Deep Packet Inspection (DPI), coupled with session awareness and behavioral models of normal application interchange. For example, an application firewall would be able to detect and block Web sessions that violate rules defining the normal behavior of HTTP applications and HTML programming. Therefore, application firewalls complement traditional perimeter firewalls that are based on recognition of known network-level attack signatures and patterns. Application firewalls also have the advantage of providing a measure of protection against zero-day exploits by blocking the sessions of clients whose behaviors are outside the bounds of admissibility.
• Asymmetrical Application Acceleration
AFEs can accelerate the performance of applications delivered over the WAN by implementing optimization techniques such as reverse caching, asymmetrical TCP optimization, and compression. With reverse caching, new user requests for static or dynamic Web objects can often be delivered from the cache rather than having to be regenerated by the servers. Reverse caching therefore improves user response time and minimizes the load on Web servers, application servers, and database servers.

Asymmetrical TCP optimization is based on the AFE serving as a proxy for TCP processing, minimizing the server overhead for fine-grained TCP session management. TCP proxy functionality is designed to deal with the complexity associated with the fact that each object on a Web page requires its own short-lived TCP connection. Processing all of these connections can consume an inordinate amount of the server's CPU resources. Acting as a proxy, the AFE terminates the client-side TCP sessions and multiplexes the numerous short-lived network sessions initiated as client-side object requests into a single longer-lived session between the AFE and the Web servers.

The AFE can also offload Web servers by performing compute-intensive HTTP compression operations. HTTP compression is a capability built into both Web servers and Web browsers. Moving HTTP compression from the Web server to the AFE is transparent to the client and so requires no client modifications. HTTP compression is asymmetrical in the sense that there is no requirement for additional client-side appliances or technology.

• Response Time Monitoring
The application and session intelligence of the AFE also presents an opportunity to provide real-time and historical monitoring and reporting of the response time experienced by end users accessing Web applications. The AFE can provide the granularity to track performance for individual Web pages and to decompose overall response time into client-side delay, network delay, AFE delay, and server-side delay. The resulting data can be used to support SLAs for guaranteed user response times, guide remedial action, and plan additional capacity to maintain service levels.

Selection Criterion
The AFE evaluation criteria are listed in Table 5.4. As was the case with Branch Office Optimization Solutions, this list is intended as a fairly complete compilation of possible criteria. As a result, a given organization or enterprise might apply only a subset of these criteria for a given purchase decision.

Table 5.4: Criteria for Evaluating AFEs

Criterion | Weight Wi | Score for Solution "A" Ai | Score for Solution "B" Bi
Features | | |
Performance | | |
Scalability | | |
Transparency and Integration | | |
Solution Architecture | | |
Functional Integration | | |
Virtualization | | |
Application Availability | | |
Cost-Effectiveness | | |
Security | | |
Ease of Deployment and Management | | |
Business Intelligence | | |
Total Score | | ΣWiAi | ΣWiBi

Each of the criteria is described below.

Features
AFEs support a wide range of functionality, including TCP optimization, HTTP multiplexing, caching, Web compression, and image compression, as well as bandwidth management and traffic shaping.
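The scoring scheme implied by Table 5.4 is a weighted sum: each criterion gets a weight Wi, each candidate solution a score, and the total score is ΣWiAi (or ΣWiBi). A sketch with illustrative values (the weights and scores below are assumptions for the example, not recommendations):

```python
# Sketch of the weighted scoring implied by Table 5.4. Each criterion is given
# a weight Wi and each candidate solution a score (Ai, Bi); the Total Score
# row is the sum of Wi * Ai (or Wi * Bi). All values here are illustrative.

criteria = ["Features", "Performance", "Scalability",
            "Transparency and Integration", "Solution Architecture",
            "Functional Integration", "Virtualization",
            "Application Availability", "Cost-Effectiveness", "Security",
            "Ease of Deployment and Management", "Business Intelligence"]

weights = {c: 1.0 for c in criteria}   # Wi - equal weights assumed as a baseline
weights["Performance"] = 3.0           # emphasize the criteria that matter most
weights["Security"] = 2.0

scores_a = {c: 3 for c in criteria}    # Ai on an assumed 1-5 scale
scores_b = {c: 4 for c in criteria}    # Bi
scores_a["Performance"] = 5            # solution A wins on raw performance

def total(weights: dict, scores: dict) -> float:
    """The Total Score row of Table 5.4: sum over criteria of Wi * score_i."""
    return sum(weights[c] * scores[c] for c in weights)

print(total(weights, scores_a), total(weights, scores_b))  # 51.0 60.0
```

A scheme of this kind makes the purchase decision sensitive to the weights, which is exactly the point: an organization that applies only a subset of the criteria simply sets the other weights to zero.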
Performance
Performance is an important criterion for any piece of networking equipment, but it is critical for a device such as an AFE, because data centers are central points of aggregation. As such, the AFE needs to be able to support the extremely high volumes of traffic transmitted to and from the servers in data centers. A simple definition of performance is how many bits per second the device can support. While this is extremely important, in the case of AFEs other key measures of performance include how many Layer 4 connections can be supported, as well as how many Layer 4 setups and teardowns can be supported.

Third party tests of a solution can be helpful. It is critical, however, to quantify the kind of performance gains that the solution will provide in the particular application environment where it will be installed. As part of this quantification, it is important to identify whether the performance of the solution degrades as additional functionality within the solution is activated or as changes are made to the application mix within the data center.

Transparency and Integration
Transparency is an important criterion for any piece of networking equipment. However, unlike branch office optimization solutions, which are proprietary, AFEs are standards based. As such, AFEs will tend to be somewhat more transparent than other classes of networking equipment. It is very important to be able to deploy an AFE solution and not break anything, such as routing, security, or QoS. The solution should also be as transparent as possible relative to both the existing server configurations and the existing security domains, and should not make troubleshooting any more difficult.

The AFE also has to be able to easily integrate with other components of the data center, such as the firewalls and other appliances that may be deployed to provide application services. In some data centers, it may be important to integrate the Layer 2 and Layer 3 access switches with the AFE and firewalls so that application intelligence, application acceleration, application security, and server offloading are all applied at a single point in the data center network.

Scalability
Scalability of an AFE solution implies the availability of a range of products that span the performance and cost requirements of a variety of data center environments. Performance requirements for accessing data center applications and data resources are usually characterized in terms of both the aggregate throughput of the AFE and the number of simultaneous application sessions that can be supported. A related consideration is how device performance is affected as additional functionality is enabled.

Solution Architecture
Taken together, scalability and solution architecture identify the ability of the solution to support a range of implementations and to extend to support additional functionality. In particular, if the organization intends the AFE to support additional optimization functionality over time, it is important to determine whether the hardware and software architecture can support new functionality without an unacceptable loss of performance and without unacceptable downtime.

Functional Integration
In many data center environments there are programs in progress to reduce overall complexity by consolidating both the servers and the network infrastructure. An AFE solution can contribute significantly to network consolidation by supporting a wide range of application-aware functions that transcend basic server load balancing and content switching. Extensive functional integration reduces the complexity of the network by minimizing the
number of separate boxes and user interfaces that must be navigated by data center managers and administrators. Reduced complexity generally translates to lower TCO and higher availability.

Virtualization
As is discussed in Chapter 9, virtualization is becoming a key technology for achieving data center consolidation and the related benefits. For example, server virtualization supports data center consolidation by allowing a number of applications running on separate virtual machines to share a single physical server. Prior to virtualization, a common practice was to run only one application per server to maximize operating system stability. Not only was the extra hardware this approach required expensive, but it also necessitated additional real estate and power, further increasing the cost.

AFEs can also be virtualized by partitioning a single physical AFE into a number of logical AFEs, or AFE contexts. Each logical AFE can be configured individually to meet the server load balancing, acceleration, and security requirements of a single application or a cluster of applications. Therefore, each virtualized AFE can consolidate the functionality of a number of physical AFEs dedicated to the support of single applications. Virtualization adds significantly to the flexibility of the data center by allowing applications to be easily moved from one physical server to another. For example, with a virtual AFE mapped to a virtual machine, the AFE would not need to be reconfigured when an application is moved or automatically fails over to a new physical machine. The benefits of virtualization include lower TCO through consolidation of AFE physical devices, higher availability in the event of a failover, plus the associated savings in management costs and in power and cooling costs.

Security
The solution must be compatible with the current security environment, while also allowing the configuration of application-specific security features that complement general purpose security measures, such as firewalls as well as IDS and IPS appliances. In addition, the solution itself must not create any additional security vulnerabilities. Security functionality that IT organizations should look for in an AFE includes protection against denial of service attacks, integrated intrusion protection, protection against SSL attacks, and sophisticated reporting.

Application Availability
The availability of enterprise applications is typically a very high priority. Since the AFE is in-line with the Web servers and other application servers, a traditional approach to ensuring application availability is to make sure that the AFE is capable of supporting redundant, high availability configurations that feature automated fail-over among the redundant devices. While this clearly is important, there are other dimensions to application availability. For example, as previously mentioned, an architecture that enables scalability through software license upgrades tends to minimize the application downtime that is associated with hardware-centric capacity upgrades.

Cost Effectiveness
This criterion is related to scalability. In particular, it is important to understand what the initial solution costs. It is also important to understand how the cost of the solution changes as the scope and scale of the deployment increase.

Ease of Deployment and Management
As with any component of the network or the data center, an AFE solution should be relatively easy to deploy and manage. It should also be relatively easy to deploy and manage new applications, and so ease of configuration management is a particularly important consideration where a wide diversity of applications is supported by the data center.
Business Intelligence
In addition to traditional network functionality, some AFEs also provide data that can be used to deliver business level functionality. In particular, data gathered by an AFE can feed security information and event monitoring, fraud management, business intelligence, business process management, and Web analytics.

Managed Service Providers

Introduction
As previously noted, virtually all organizations are under increasing pressure to ensure acceptable performance for networked applications. Many IT organizations are responding to this challenge by enhancing their understanding of application performance issues and then implementing their own application delivery solutions based on the products discussed in the preceding chapter. Other IT organizations prefer to outsource all or part of application delivery to a Managed Service Provider (MSP). There is a wide range of potential benefits that may be gained from outsourcing to an Application Delivery MSP (ADMSP), including:

Reduce Capital Expenditure
In cases where the ADMSP provides the equipment as CPE bundled with the service, the need for capital expenditure to deploy application optimization solutions can be avoided.

Lower the Total Cost of Ownership (TCO)
In addition to reducing capital expenditure, managed application delivery services can also reduce the operational expense (OPEX) related to technical training of existing employees in application optimization or the hiring of additional personnel with this expertise. In terms of OPEX, the customer of managed services can also benefit from the lower cost structure of ADMSP operations, which can leverage economies of scale by supplying the same type of service to numerous customers.

Leverage the MSP's Management Processes
The ADMSP should also be able to leverage sophisticated processes in all phases of application delivery, including application assessment, planning, optimization, management, and control. In particular, the ADMSP's scale of operations justifies its investment in highly automated management tools and more sophisticated management processes that can greatly enhance the productivity of operational staff. The efficiency of all these processes can further reduce the OPEX cost component underlying the service.

The ability to leverage the MSP's management processes is a factor that could cause an IT organization to use an MSP for a variety of services, including the provision of basic transport services. This criterion, however, is particularly important in the case of application delivery because, as will be shown in the next chapter, ineffective processes are one of the most significant impediments to successful application delivery.

Leverage the MSP's Expertise
In most cases, ADMSPs will have broader and deeper application-oriented technical expertise than an enterprise IT organization can afford to accumulate. This higher level of expertise can result in full exploitation of all available technologies and in optimal service implementations and configurations that can increase performance, improve reliability, and further reduce TCO. Similar to the discussion in the preceding paragraph, the ability to leverage the MSP's expertise is a factor that could cause an IT organization to use an MSP for a variety of services. This criterion, however, is particularly important in the case of application delivery because the typical IT organization does not have personnel who have
a thorough understanding of both applications and networks, as well as the interaction between them.
Leverage the MSP's Technology
Because of economies of scale, ADMSP facilities can take full advantage of the most advanced technologies in building their facilities to support service delivery. This allows the customer of managed application delivery services to gain the benefits of technologies and facilities that are beyond the reach of the typical IT budget.

Timely Deployment of Technology
Incorporating a complex application delivery solution in the enterprise network can be quite time consuming, especially where a significant amount of training or hiring is required. In contrast, with a managed service, the learning curve is essentially eliminated, allowing the solution to be deployed in a much more timely fashion.

Better Strategic Focus
The availability of managed application delivery services can free up enterprise IT staff, facilitating the strategic alignment of in-house IT resources with the enterprise business objectives. For example, in-house IT can focus on a smaller set of technologies and in-house services that are deemed to be of greater strategic value to the business.

Enhanced Flexibility
Managed application delivery services also provide a degree of flexibility that allows the enterprise to adapt rapidly to changes in the business environment resulting from competition or mergers/acquisitions. In addition, with an ADMSP, the enterprise may be able to avoid being locked in to a particular equipment vendor due to large sunk costs in expertise and equipment.

Interest in Managed Services
Kubernan asked 200 IT professionals about their preference for a DIY approach vs. using a managed service provider for a number of tasks related to application delivery. Table 6.1 contains their responses.

Table 6.1: Preference for Performing Planning Functions

Function | Perform it Themselves | Performed by 3rd Party | No Preference
Profiling an application prior to deploying it | 79.5% | 11.4% | 9.1%
Baselining the performance of the network | 79.6% | 15.5% | 5.0%
Baselining the performance of key applications | 75.6% | 13.9% | 10.6%
Assessing the infrastructure prior to deploying a key new application such as VoIP | 76.0% | 16.8% | 7.3%

One conclusion that can be drawn from Table 6.1 is that between 75 and 80 percent of IT organizations prefer to perform the indicated planning functions themselves. Conversely, another conclusion that can be drawn is that between 20 and 25 percent of IT organizations either prefer to have the indicated functions performed by a third party or are receptive to that concept.

Different Types of Managed Application Delivery Services
Currently, there are two primary categories of managed application delivery service environments:

1. Site-based services comprised of managed WAN Optimization Controllers (WOCs) installed at participating enterprise sites

2. Internet-based services that deal with the acceleration of applications (e.g., web access and SSL VPN access) that traverse the Internet
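The two conclusions drawn from Table 6.1 can be checked directly against its rows:

```python
# Rows of Table 6.1, as (perform themselves, performed by 3rd party, no preference).
table_6_1 = {
    "Profiling an application prior to deploying it":       (79.5, 11.4, 9.1),
    "Baselining the performance of the network":            (79.6, 15.5, 5.0),
    "Baselining the performance of key applications":       (75.6, 13.9, 10.6),
    "Assessing the infrastructure prior to deploying VoIP": (76.0, 16.8, 7.3),
}

for function, (diy, third_party, no_pref) in table_6_1.items():
    # Between 75 and 80 percent prefer to perform the function themselves...
    assert 75.0 <= diy <= 80.0, function
    # ...and between 20 and 25 percent either prefer a third party
    # or are receptive to the concept (3rd party + no preference).
    assert 20.0 <= third_party + no_pref <= 25.0, function
```

Note that the second conclusion combines the "Performed by 3rd Party" and "No Preference" columns, which is why it complements the first.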
Site-based Services
These services are based on the deployment of managed WOC CPE at the central data center and at each remote site participating in the application optimization project, as illustrated in Figure 6.1. The WAN depicted in the figure is typically a private leased line network or a VPN based on Frame Relay, ATM, or MPLS. The application optimization service may be offered as an optional add-on to a WAN service or as a standalone service that can run over WAN services provided by a third party. Where the application delivery service is bundled with a managed router and WAN service, both the WOC and the WAN router would be deployed and managed by the same MSP. The AFE shown in the figure performs firewall, load balancing, and similar functions that may or may not be included in the MSP offering.

[Figure 6.1: Site-Based Application Delivery Services — a WOC and WAN router at the remote site connect across the WAN to a WAN router, WOC, AFE, and servers at the central data center site]

Site-based services are generally based on MSP deployment of WOCs that are commercially available from the vendors that are addressing the enterprise market for application acceleration/optimization.

Internet-based Services
As noted in Chapter 3, many IT organizations are moving away from a hub and spoke network based on Frame Relay and ATM and are adopting WAN services such as MPLS. Like any WAN service, MPLS has advantages and disadvantages. One of the advantages of MPLS is that it is widely deployed. One of the disadvantages of MPLS is that, similar to Frame Relay and ATM, MPLS services tend to be expensive, and there tends to be a long lead time associated with deploying new MPLS services.

An area where the use of MPLS is inappropriate is the support of home and nomadic workers. It is not possible to use MPLS to support nomadic workers who need connectivity from virtually anywhere, such as a hotel room, a coffee shop, or an airport. And, while it is possible to use MPLS to support home workers, it is typically prohibitively expensive.

Since the Internet also supports meshed traffic flows, it is being used by many enterprises as an alternative to legacy WAN services such as Frame Relay and ATM. A major weakness of the Internet is that it cannot provide low, predictable delay. Some MSPs, however, have deployed services built on top of the Internet that are intended to mitigate this weakness. These services are focused on optimizing the delivery of applications over the Internet, in part to allow the Internet to be exploited as a lower cost WAN alternative. Internet-based services are based primarily on proprietary application acceleration and WAN optimization servers located at MSP points of presence (PoPs) distributed across the Internet, and do not require that remote sites accessing the services have any special hardware or software installed.

Accelerating Web Applications
Figure 6.2 shows how an Internet-based service focused on the acceleration of Web applications is typically delivered. The remote site connects to a nearby ADMSP PoP, typically using a broadband connection. As noted above, there is no requirement for any remote site CPE or additional client software beyond the conventional Web browser. The servers deployed across the MSP's geographically dispersed PoPs form an intelligent distributed processing infrastructure and perform a variety of AFE/WOC functions to accelerate the web traffic and optimize traffic flows through the Internet. The techniques that are available to accelerate web traffic include:

• Dynamic mapping of the remote site ISP to the optimum local PoP, based on PoP server utilization and real-time data on Internet traffic loading.

• Caching Web content at the local PoP, with intelligent pre-fetch of data from the central or origin web servers. Local caching is much more effective than reverse caching at a central site because multiple Internet round-trip delays are avoided for every local cache hit.

• Compression of Web content transferred between the ingress and egress PoPs.

• Dynamic Route Optimization to minimize latency and packet loss. The determination of the best route through the Internet requires that the servers at each PoP gather dynamic information on the performance characteristics of alternate paths through the Internet, including the delay and packet loss characteristics of every PoP-to-PoP route. This allows an optimum end-to-end route to be constructed from the highest performing PoP-to-PoP intermediary route segments. In comparison, default Internet routes determined by BGP do not take into account either delay or packet loss and are therefore likely to yield significantly lower performance.

• TCP transport optimization among PoPs, mitigating the inefficiencies stemming from the slow start algorithm and the retransmission behavior of TCP. The beneficial impact of TCP optimization is greatly magnified when the delay and packet loss of the Internet are minimized with Dynamic Route Optimization.

[Figure 6.2: Internet-Based Web Application Delivery Services — the remote site connects through its ISP to web acceleration servers in MSP PoPs across the Internet, which in turn connect to the WAN router, AFE/firewall, and Web servers at the central data center site]

Accelerating any IP-based Application
Another category of Internet-based service provides acceleration of any application that runs over the Internet Protocol (IP). As shown in Figure 6.3, these services typically require the installation of managed CPE Application Acceleration servers at the central site that interoperate with the Application Acceleration servers in the MSP PoPs. As is the case with the Internet-based web services, there is typically no requirement for remote site CPE or special client software.

The application acceleration servers deployed for IP Application Delivery services can potentially exploit all of the acceleration techniques described above for the acceleration of web-based applications, but can also support a broad range of additional application-specific WOC functions because of the presence of the central site Application Acceleration servers. For application-specific functions, the Application Delivery servers in the local PoPs and the Application Delivery servers at the central site CPE form pairs of cooperating WOCs analogous to the pair of enterprise WOCs depicted in Figure 6.1. The central site CPE servers can also be used to extend the Dynamic Route Optimization and Transport Optimization functions to include the central site as well as the MSP PoPs. This can be especially beneficial where the central site maintains diverse connectivity to one or more ISPs. The central site Application Acceleration servers can also perform AFE functions, such as SSL processing, to secure access to enterprise applications.

[Figure 6.3: Internet-Based IP Application Delivery Services — mobile users and remote sites connect through ISPs to Application Acceleration servers in MSP PoPs, which interoperate with the Application Acceleration servers, AFE/firewall, and application servers at the central data center site]
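The Dynamic Route Optimization described above is, at its core, a shortest-path computation over measured PoP-to-PoP metrics. A minimal sketch using delay alone (the PoP names and delay figures are invented; a production service would also weigh packet loss and refresh the measurements continuously):

```python
import heapq

# Measured delay in ms between PoP pairs (hypothetical figures). Assume the
# default BGP path corresponds to the direct ingress -> egress link at 120 ms.
delays = {
    ("ingress", "egress"): 120,
    ("ingress", "pop-a"): 20,
    ("pop-a", "pop-b"): 30,
    ("pop-b", "egress"): 25,
    ("pop-a", "egress"): 90,
}

def best_route(links: dict, src: str, dst: str):
    """Dijkstra over PoP-to-PoP delays: returns (total delay, path)."""
    graph = {}
    for (a, b), d in links.items():  # links are treated as bidirectional
        graph.setdefault(a, []).append((b, d))
        graph.setdefault(b, []).append((a, d))
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, d in graph.get(node, []):
            if nxt not in seen:
                heapq.heappush(queue, (cost + d, nxt, path + [nxt]))
    return None

print(best_route(delays, "ingress", "egress"))
# (75, ['ingress', 'pop-a', 'pop-b', 'egress']) - beats the 120 ms direct route
```

The example makes the comparison with BGP concrete: the intermediary route through two PoPs yields 75 ms even though the default direct path is 120 ms, which is exactly the kind of gain a delay-aware overlay can capture.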
Selection Criterion
The beginning of this chapter listed a number of benefits that an IT organization may gain from using an MSP for application delivery. These benefits are criteria that IT organizations can use as part of their evaluation of ADMSPs. For example, IT organizations should evaluate the degree to which using a particular ADMSP would allow them to lower their total cost of ownership or leverage that ADMSP's management processes.

The choice between a site-based application delivery service and an Internet-based application delivery service is in part an architecture decision. As noted, site-based application delivery services require CPE at all of the customer sites. Internet-based application delivery services do not require CPE at the remote sites and may or may not require CPE at the central site. Instead, Internet-based application delivery services implement optimization functionality within the network. The choice between the two is also impacted by the ability of the Internet-based application delivery service to provide acceptable performance.

Independent of whether an IT organization is evaluating a site-based service or an Internet-based service, it should consider the following criteria:

• Is the MSP offering a turnkey solution with simple pricing?
• Does the MSP provide network and application performance monitoring?
• Does the MSP provide a simple to understand management dashboard?
• What functionality does the MSP have to troubleshoot problems in both a proactive and a reactive fashion?
• What professional services (i.e., assessment, design and planning, performance analysis and optimization, implementation) are available?
• What technologies are included as part of the service?
• What is the impact of these technologies on network and application performance?
• Does the MSP offer application level SLAs?
A Guide to Decision Making
47
Application Delivery Handbook | february 2008
• What is the scope of the service? Does it include application management? Server management? • Is it possible to deploy a site-based application delivery service and not deploy WAN services from the same supplier?
Management

Introduction
The primary management tasks associated with application delivery are to:
• Discover the applications running over the network and identify how they are being used.
• Gather the appropriate management data on the performance of the applications and the infrastructure that supports them.
• Provide end-to-end visibility into the ongoing performance of the applications and the infrastructure.
• Identify the sources of delay in the performance of the applications and the infrastructure.
• Automatically identify performance issues and resolve them.
• Gain visibility into the operational architecture and dynamic behavior of the network.

As Chapter 2 mentioned, Kubernan asked more than 300 IT professionals: “If the performance of one of your company’s key applications is beginning to degrade, who notices it first? The end user or the IT organization?” Three-quarters of the survey respondents indicated that it was the end user. IT organizations will not be considered successful with application delivery as long as the end user, and not the IT organization, first notices application degradation.

The Consulting Architect commented that, within his company, the end user, and not the IT organization, usually first finds application-performance issues. He stated that once a problem has been reported, identifying the root cause of the problem bounces around within the IT organization, and that “It’s always assumed to be the network. Most of my job is defending the network.”

As part of that survey, Kubernan also asked the survey respondents to indicate which component of IT was the biggest cause of application degradation. Figure 7.1 summarizes their answers. In Figure 7.1, the answer shared equally means that multiple components of IT are equally likely to cause application degradation.

Figure 7.1: Causes of Application Degradation (bar chart: Shared Equally, Network, Storage, Servers, Middleware, Application; responses range from 0.0% to 30.0%)

The data in Figure 7.1 speaks to the technical complexity associated with managing application performance. When an application experiences degradation, virtually any component of IT could be the source of the problem.
The Organizational Dynamic
To understand how IT organizations respond to application degradation, Kubernan asked several hundred IT professionals to identify which organization or organizations has responsibility for the ongoing performance of applications once they are in production. Table 7.1 contains their answers.

Group | Percentage of Respondents
Network Group – including the NOC | 64.6%
Application development group | 48.5%
Server group | 45.1%
Storage group | 20.9%
Application performance-management group | 18.9%
Other | 12.1%
No group | 6.3%

Table 7.1: Organization Responsible for Application Performance

Kubernan recently asked over 200 IT professionals “How would you characterize the current relationship between your company’s application development organization and the network organization?” Their responses are depicted in Table 7.2.

Response | Percentage of Respondents
Highly adversarial | 0.0%
Moderately adversarial | 7.9%
Slightly adversarial | 17.2%
Neutral | 33.0%
Slightly cooperative | 13.3%
Moderately cooperative | 24.6%
Highly cooperative | 3.9%

Table 7.2: The Relationship between IT Groups

In roughly twenty-five percent of companies there is an adversarial relationship between the applications development groups and the network organization. The data in Tables 7.1 and 7.2 speaks to the organizational dynamic that is associated with managing application performance. Taken together with the data in Figure 7.1, managing application performance clearly is complex, both technically and organizationally.

The ASP Architect provided insight into the challenges of determining the source of an application-performance issue. He stated, “We used to have a real problem with identifying performance problems. We would have to run around with sniffers and other less friendly tools to troubleshoot problems. The finger pointing was often pretty bad.” He went on to say that, to do a better job of identifying performance problems, the IT organization developed some of its own tools. The traditional IT infrastructure groups, as well as some of the application teams, are using the tools that his organization developed. He added that the reports generated by those tools helped to develop credibility for the networking organization with the applications-development organization.

To be successful with application delivery, IT organizations need tools and processes that can identify the root cause of application degradation and which are accepted as valid by the entire IT organization.

In order to put the technical and organizational complexity that is associated with application delivery into context, Kubernan asked 200 IT professionals “When an application is degrading, how difficult is it for you to identify the root cause of the degradation? An answer of neutral means that identifying the root cause of application degradation is as difficult as identifying the root cause of a network outage.” Their responses are contained in Table 7.3. Identifying the root cause of application degradation is significantly more difficult than identifying the root cause of a network outage.

The good news is that most IT organizations recognize the importance of managing application performance. In particular, research conducted by Kubernan indicates that in only 2% of IT organizations is managing application performance losing importance. In slightly over half of IT organizations it is gaining in importance, and it is keeping about the same importance in the rest.
Extremely Easy: 1 | 1.6%
2 | 6.0%
3 | 12.0%
Neutral: 4 | 23.4%
5 | 23.9%
6 | 25.0%
Extremely Difficult: 7 | 8.2%

Table 7.3: The Difficulty of Identifying the Cause of Application Degradation

The Process Barriers
Kubernan asked hundreds of IT professionals if their companies have a formalized set of processes for identifying and resolving application degradation. Table 7.4 contains their answers. The data in Table 7.4 clearly indicate that the majority of IT organizations either currently have processes, or soon will, to identify and resolve application degradation.

Response | Percentage of Respondents
Yes, and we have had these processes for a while | 22.4%
Yes, and we have recently developed these processes | 13.3%
No, but we are in the process of developing these processes | 31.0%
No | 26.2%
Other | 7.1%

Table 7.4: Existence of Formalized Processes

Kubernan gave the same set of IT professionals a set of possible answers and asked them to choose the two most significant impediments to effective application delivery. Table 7.5 shows the answers that received the highest percentage of responses.

Answer | Percentage of Companies
Our processes are inadequate | 39.6%
The difficulty in explaining the causes of application degradation and getting any real buy-in | 33.6%
Our tools are inadequate | 31.5%
The application development group and the rest of IT have adversarial relations | 24.2%

Table 7.5: Impediments to Effective Application Delivery

The data in Table 7.5 indicates that three of the top four impediments to effective application delivery have little to do with technology. The data in this table also provides additional insight into the data in Table 7.4. In particular, the data in Table 7.4 indicates that the vast majority of IT organizations either have formalized processes for identifying and resolving application degradation or are developing these processes. The data in Table 7.5, however, indicate that, in many cases, these processes are inadequate. The next chapter will discuss the interest on the part of IT organizations in leveraging ITIL (IT Infrastructure Library) to develop more effective IT processes.

Organizational discord and ineffective processes are at least as much of an impediment to the successful management of application performance as are technology and tools.

The ASP Architect stated that the infrastructure component of the IT organization has worked hard to improve its processes in general, and its communications with the business units in particular. He pointed out that the infrastructure organization is now ISO certified and is adopting an ITIL model for problem tracking. These improvements have greatly enhanced the reputation of the infrastructure organization, both within IT and between the infrastructure organization and the company’s business units. It has reached the point that the applications-development groups have seen the benefits and are working, with the help of the infrastructure organization, to also become ISO certified.

Discovery
Chapter 4 of this handbook commented on the importance of identifying which applications are running on the network as part of performing a pre-deployment assessment. Due to the dynamic nature of IT, it is also important to identify which applications are running on the network on an ongoing basis.

Chapter 3 mentioned one reason why identifying which applications are running on the network on an ongoing basis is important: successful application delivery requires
that IT organizations are able to eliminate the applications that are running on the network and have no business relevance.

To put this in context, Figure 7.2 shows a variety of recreational applications along with how prevalent they are. In a recent survey of IT professionals, 51 percent said they had seen unauthorized use of their company’s network for applications such as Doom or online poker. Since IT professionals probably don’t see all the instances of recreational traffic on their networks, the occurrence of recreational applications is likely higher than what Figure 7.2 reflects.

Figure 7.2: Occurrence of Recreational Applications

These recreational applications are typically not related to the ongoing operation of the enterprise and, in many cases, they consume a significant amount of bandwidth.

The Port 80 Black Hole
As noted, identifying the applications that are running on a network is a critical part of managing application performance. Unfortunately, many applications behave in ways that make this a difficult task; in particular, those that use port hopping to avoid detection.

In IP networks, TCP and UDP ports are endpoints to logical connections and provide the multiplexing mechanism that allows multiple applications to share a single connection to the IP network. Port numbers range from 0 to 65535. As described in the IANA (Internet Assigned Numbers Authority) Port Number document (www.iana.org/assignments/port-numbers), the ports numbered from 0 to 1023 are reserved for privileged system-level services and are designated as well-known ports. A well-known port serves as a contact point for a client to access a particular service over the network. For example, port 80 is the well-known port for HTTP data exchange and port 443 is the well-known port for secure HTTP exchanges via HTTPS.

Since servers listen on port 80 expecting to receive data from Web clients, a firewall can’t block port 80 without eliminating much of the traffic a business may depend on. Taking advantage of this fact, many applications will port-hop to port 80 when their normally assigned ports are blocked by a firewall. This behavior creates what will be referred to as the port 80 black hole. Lack of visibility into the traffic that transits port 80 is a major vulnerability for IT organizations. The port 80 black hole can have four primary effects on an IT organization and the business it serves:
• Increased vulnerability to security breaches
• Increased difficulty in complying with government and industry regulations
• Increased vulnerability to charges of copyright violation
• Increased difficulty in managing the performance of key business-critical, time-sensitive applications

Port Hopping
Two applications that often use port hopping are instant messaging (IM) and peer-to-peer (P2P) applications such as Skype.

Instant Messaging
A good example of this is AOL’s Instant Messenger (AIM). AOL has been assigned ports 5190 through 5193 for its Internet traffic, and AIM is typically configured to use these ports. If these ports are blocked, however, AIM will use port 80. As a result, network managers might well think that by blocking ports 5190 – 5193 they are blocking the use of AIM when in reality they are not.

The point of discussing AIM is not to state whether or not a company should block AIM traffic. That is a policy decision that needs to be made by the management of the company. Some of the reasons why a company might choose to block AIM include security and compliance. AIM can present a security risk because it is an increasingly popular vector for virus and worm transmission. As for compliance, a good example is the requirement by the Securities and Exchange Commission that all stock brokers keep complete records of all communications with clients. This requires that phone calls be recorded, and both email and IM archived. However, if AIM traffic is flowing through port 80 along with lots of other traffic, most network organizations will not even be aware of its existence.
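The port 80 black hole described above can be illustrated with a simple payload heuristic: traffic on port 80 that does not begin like HTTP is a candidate for closer inspection. The sketch below is illustrative only — the flow-record layout, field names and prefix check are assumptions for this example, not the behavior of any particular monitoring product.

```python
# Minimal sketch: flag TCP flows on port 80 whose first payload bytes do
# not look like HTTP -- the kind of traffic that hides in the
# "port 80 black hole". Flow records here are hypothetical.

HTTP_PREFIXES = (b"GET ", b"POST ", b"HEAD ", b"PUT ",
                 b"DELETE ", b"OPTIONS ", b"HTTP/")

def looks_like_http(payload: bytes) -> bool:
    """Crude heuristic: real HTTP starts with a method or status line."""
    return payload.startswith(HTTP_PREFIXES)

def find_port80_masqueraders(flows):
    """Return flows that use TCP port 80 but do not appear to carry HTTP."""
    return [
        f for f in flows
        if f["proto"] == "tcp"
        and 80 in (f["src_port"], f["dst_port"])
        and not looks_like_http(f["first_payload"])
    ]

if __name__ == "__main__":
    flows = [
        {"proto": "tcp", "src_port": 49152, "dst_port": 80,
         "first_payload": b"GET /index.html HTTP/1.1\r\n"},
        {"proto": "tcp", "src_port": 49153, "dst_port": 80,
         "first_payload": b"\x16\x03\x01"},   # non-HTTP bytes on port 80
        {"proto": "tcp", "src_port": 49154, "dst_port": 443,
         "first_payload": b"\x16\x03\x01"},
    ]
    print(len(find_port80_masqueraders(flows)))  # 1
```

Real classification engines combine many such signatures with behavioral analysis; the point of the sketch is simply that port number alone is not enough to identify an application.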
Peer-to-Peer Networks and Skype
A peer-to-peer computer network leverages the connectivity between the participants in a network. Unlike a typical client-server network, where communication is typically to and from a central server along fixed connections, P2P nodes are generally connected via largely ad hoc connections. Such networks are useful for many purposes, including file sharing and IP telephony.

Skype is a peer-to-peer based IP telephony and IP video service developed by Skype Technologies SA. The founders of Skype Technologies SA are the same people who developed the file sharing application Kazaa. Many IT organizations attempt to block peer-to-peer networks because they have been associated with distributing content in violation of copyright laws.

Many security experts have warned about the dangers associated with peer-to-peer networks.17 For example, Antonio Nucci wrote: “In order to avoid detection, many peer-to-peer applications, including Skype, change the port that they use each time they start. Consequently, there is no standard ‘Skype port’ like there is a ‘SIP port’ or ‘SMTP port’. In addition, Skype is particularly adept at port-hopping with the aim of traversing enterprise firewalls. Entering via UDP, TCP, or even TCP on port 80, Skype is usually very successful at passing typical firewalls. Once inside, it then intentionally connects to other Skype clients and remains connected, maintaining a ‘virtual circuit’. If one of those clients happens to be infected, then the machines that connect to it can be infected with no protection from the firewall. Moreover, because Skype has the ability to port-hop, it is much harder to detect anomalous behavior or configure network security devices to block the spread of the infection.”

17 Skype: The Future of Traffic Detection and Classification, http://www.pipelinepub.com/0906/VC1.html

FIX-Based Applications
Another component of the port 80 black hole is the existence of applications that are designed to use port 80 but which require more careful management than the typical port 80 traffic. A good example of this is virtually any application that is based on the Financial Information eXchange (FIX) protocol. The FIX protocol is a series of messaging specifications for the electronic communication of trade-related messages. Since its inception in 1992 as a bilateral communications framework for equity trading between Fidelity Investments and Salomon Brothers, FIX has become the de facto messaging standard for pre-trade and trade communications globally within equity markets, and is now experiencing rapid expansion into the post-trade space, supporting Straight-Through-Processing (STP) from Indication-of-Interest (IOI) to Allocations and Confirmations. The use of the protocol is gathering increased momentum as it begins to be used across the Foreign Exchange, Fixed Income, and Derivative markets.

In our industry we often overuse the phrase business-critical. However, the claim can easily be made that the applications that support the business functions described above are indeed business-critical. Analogously, there is often a lot of subjectivity relative to whether or not an application is time-sensitive. Again, that is not the case for the applications that support the business functions described above. For example, if a stock broker is placing an order for millions of dollars in stocks, a small delay in placing the order can significantly drive up the cost of the stock. That is a textbook definition of a time-sensitive application.
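As background for the FIX discussion above: on the wire, a FIX message is a sequence of tag=value fields separated by the SOH (0x01) control character, per the public FIX specification. The sketch below parses such a message into a dictionary; the sample order fragment and helper names are illustrative, not a production FIX engine.

```python
# Minimal sketch of the FIX tag=value wire format: fields are separated
# by SOH (0x01). Tag 35 is MsgType (D = New Order Single) and tag 55 is
# Symbol in the public FIX specification; this is an illustration only.

SOH = "\x01"

def parse_fix(message: str) -> dict:
    """Split a raw FIX message into a {tag: value} dictionary."""
    fields = [f for f in message.split(SOH) if f]
    return dict(f.split("=", 1) for f in fields)

if __name__ == "__main__":
    # A fragment of a New Order Single, for illustration.
    raw = SOH.join(["8=FIX.4.2", "35=D", "55=IBM", "54=1", "38=100"]) + SOH
    order = parse_fix(raw)
    print(order["35"], order["55"])  # D IBM
```

Because such messages can legitimately travel over port 80 style web channels, a monitoring solution has to recognize them by content rather than by port.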
End-to-End Visibility
Our industry uses the phrase end-to-end visibility in various ways. Given that one of this handbook’s major themes is that IT organizations need to implement an application-delivery function that focuses directly on applications and not on the individual components of the IT infrastructure, this handbook will use the following definition of end-to-end visibility.

End-to-end visibility refers to the ability of the IT organization to examine every component of IT that impacts the communications once users hit ENTER or click the mouse to when they receive a response from an application.

End-to-end visibility is one of the cornerstones of assuring acceptable application performance. End-to-end visibility is important because it:
• Provides the information that allows IT organizations to notice application performance degradation before the end user does.
• Identifies the correct symptoms of the degradation and as a result enables the IT organization to reduce the amount of time it takes to remove the sources of the application degradation.
• Facilitates making intelligent decisions and getting buy-in from other impacted groups. For example, end-to-end visibility provides the hard data that enables an IT organization to know that it has to add bandwidth or redesign some of the components of the infrastructure because the volume of traffic associated with the company’s sales order tracking application has increased dramatically. It also positions the IT organization to curb recreational use of the network.
• Allows the IT organization to measure the performance of critical applications before, during and after it makes changes. These changes could be infrastructure upgrades, configuration changes or the deployment of a new application. As a result, the IT organization is in a position both to determine if the change has had a negative impact and to isolate the source of the problem so it can fix the problem quickly.
• Enables better cross-functional collaboration. As section 7.2 discussed, having all members of the IT organization access the same set of tools that are detailed and accurate enough to identify the sources of application degradation facilitates cooperation.

The type of cross-functional collaboration the preceding bullet mentioned is difficult to achieve if each group within IT has a different view of the factors causing application degradation. To enable cross-functional collaboration, it must be possible to view all relevant management data from one place.

Providing detailed end-to-end visibility is difficult due to the complexity and heterogeneity of the typical enterprise network. The typical enterprise network, for example, is comprised of switches and routers, firewalls, application front ends, optimization appliances, intrusion detection and intrusion prevention appliances, as well as a virtualized network. An end-to-end monitoring solution must profile traffic in a manner that reflects not only the physical network but also the logical flows of applications, and must be able to do this regardless of the vendors who supply the components or the physical topology of the network.

As Chapter 4 discussed, IT organizations typically have easy access to management data from both SNMP MIBs
and from NetFlow. IT organizations also have the option of deploying either dedicated instrumentation or software agents to gain a more detailed view into the types of applications listed below.

An end-to-end visibility solution should be able to identify:
• Well-known applications; e.g., FTP, Telnet, Oracle, HTTPS and SSH.
• Complex applications; e.g., SAP and Citrix Presentation Server.
• Applications that are not based on IP; e.g., applications based on IPX or DECnet.
• Custom or homegrown applications.
• Web-based applications.
• Multimedia applications.

Other selection criteria include the ability to:
• Scale as the size of the network and the number of applications grows.
• Provide visibility into virtual networks such as ATM PVCs and Frame Relay DLCIs.
• Support flexible aggregation of collected information.
• Provide visibility into complex network configurations such as load-balanced or fault-tolerant, multichannel links.
• Support the monitoring of real traffic.
• Generate and monitor synthetic transactions.
• Add minimum management traffic overhead.
• Support granular data collection.
• Capture performance data as well as events such as a fault.
• Support a wide range of topologies, in the access, distribution and core components of the network as well as in storage area networks.
• Provide visibility into encrypted networks.
• Support real-time and historical analysis.
• Integrate with other management systems.

Network and Application Alarming

Static Alarms
Historically, one of the ways that IT organizations attempted to manage performance was by setting static threshold performance-based alarms. In a recent survey, for example, roughly three-quarters (72.8%) of the respondents said they set such alarms. The survey respondents were then asked to indicate the network and application parameters against which they set the alarms. Table 7.6 contains their answers. Survey respondents were instructed to indicate as many parameters as applied to their situation.

Parameter | Percentage
WAN Traffic Utilization | 81.5%
Network Response Time (Ping, TCP Connect) | 58.5%
LAN Traffic Utilization | 47.8%
Application-Response Time (Synthetic Transaction Based) | 30.2%
Application Utilization | 12.2%
Other | 5.9%

Table 7.6: Percentage of Companies that Set Specific Thresholds
As Table 7.6 shows, the vast majority of IT organizations set thresholds against WAN traffic utilization or some other network parameter. Less than one-third of IT organizations set thresholds against application-response time.

Many companies that set thresholds against WAN utilization use a rule of thumb that says network utilization should not exceed 70 or 80 percent. Companies that use this approach to managing network and application performance implicitly make two assumptions:

1. If the network is heavily utilized, the applications are performing poorly.
2. If the network is lightly utilized, the applications are performing well.

The first assumption is often true, but not always. For example, if the company is primarily supporting email or bulk file transfer applications, heavy network utilization is unlikely to cause unacceptable application performance. The second assumption is often false. It is quite possible to have the network operating at relatively low utilization levels and still have the application perform poorly. An example of this is any application that uses a chatty protocol over the WAN. In this case, the application can perform badly because of the large number of application turns, even though the network is exhibiting low levels of delay, jitter and packet loss.
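The chatty-protocol effect can be approximated with back-of-the-envelope arithmetic: completion time is roughly one round trip per application turn plus the serialization time of the payload. The link speed, round-trip time and turn counts in this sketch are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope sketch of why a chatty protocol performs badly
# even on a lightly loaded WAN: total time is dominated by application
# turns (round trips), not by bandwidth.

def transfer_time(app_turns: int, rtt_s: float,
                  payload_bytes: int, link_bps: float) -> float:
    """Approximate completion time: one RTT per application turn
    plus serialization time for the payload."""
    return app_turns * rtt_s + (payload_bytes * 8) / link_bps

if __name__ == "__main__":
    rtt = 0.08              # 80 ms WAN round trip (assumed)
    link = 10_000_000       # 10 Mbps link, lightly utilized (assumed)
    payload = 500_000       # 500 KB of application data (assumed)

    efficient = transfer_time(10, rtt, payload, link)
    chatty = transfer_time(1_000, rtt, payload, link)
    print(round(efficient, 1), round(chatty, 1))  # 1.2 80.4
```

With identical bandwidth and latency, the chatty exchange takes over a minute while the efficient one takes about a second, which is why utilization-based thresholds alone miss this class of problem.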
Application management should focus directly on the application and not just on factors that have the potential to influence application performance.

The survey respondents were also asked to indicate the approach that their companies take to setting performance thresholds. Table 7.7 contains their answers.

Approach | Percentage of Companies
We set the thresholds at a high-water mark so that we only see severe problems. | 64.3%
We set the thresholds low because we want to know every single abnormality that occurs. | 18.3%
Other (Please specify). | 17.4%

Table 7.7: Approach to Setting Thresholds

Of the survey respondents that indicated other, the most common response was that their companies set the thresholds at what they consider to be an average value.

One conclusion that can be drawn from Table 7.7 is that the vast majority of IT organizations set the thresholds high to minimize the number of alarms that they receive. While this approach makes sense operationally, it leads to an obvious conclusion:

Most IT organizations ignore the majority of the performance alarms.

Proactive Alarms
As noted, most IT organizations implement static performance alarms by setting thresholds at the high-water mark. This means that the use of static performance alarms is reactive: the problems these alarms identify are only identified once they have reached the point where they most likely impact users.

The use of static performance alarms has two other limitations. One is that these alarms can result in a lot of administrative overhead, due to the effort required to initially configure the alarms as well as the effort needed to keep up with tuning the settings in order to accommodate the constantly changing environment. Another limitation is accuracy: in many cases, static performance alarms can result in an unacceptable number of false positives and/or false negatives.

Proactive alarming is sometimes referred to as network analytics. The goal of proactive alarming is to automatically identify and report on possible problems in real time so that organizations can eliminate them before they impact users. One key concept of proactive alarming is that it takes the concepts of baselining, which Chapter 4 describes, and applies them to real-time operations.

A proactive alarming solution needs to be able to baseline the network to identify normal patterns and then identify, in real time, a variety of types of changes in network traffic. For example, the solution must be able to identify a spike in traffic, where a spike is characterized by a change that is
both brief and distinct. A proactive alarming solution must also be able to identify a significant shift in traffic as well as the longer-term drift.

Some criteria organizations can use to select a proactive alarming solution include that the solution should:
• Operate off real-time feeds of performance metrics.
• Not require any threshold definitions.
• Integrate with any event console or enterprise-management platform.
• Self-learn normal behavior patterns, including hourly and daily variations based on the normal course of user community activities.
• Recognize spike, shift and drift conditions.
• Discriminate between individual applications and users.
• Discriminate between physical and virtual network elements.
• Collect and present supporting diagnostic data along with the alarm.
• Eliminate both false positive and false negative alarms.
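The self-learning baseline idea behind proactive alarming can be sketched as follows. This is a minimal illustration: the per-hour-of-day statistics, the k-sigma spike test and the synthetic traffic history are all assumptions for the example, not how any particular network analytics product works.

```python
# Minimal sketch of proactive alarming: learn a per-hour-of-day mean and
# standard deviation from history, then flag a sample as a spike when it
# deviates sharply from the learned norm (no hand-set threshold needed).

from collections import defaultdict
from statistics import mean, pstdev

def learn_baseline(samples):
    """samples: iterable of (hour_of_day, value). Returns per-hour stats."""
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}

def is_spike(baseline, hour, value, k=3.0):
    """Flag values more than k standard deviations above the hourly norm."""
    mu, sigma = baseline[hour]
    return value > mu + k * max(sigma, 1e-9)

if __name__ == "__main__":
    # Synthetic history: ~50 Mbps at 9am, ~5 Mbps at 3am.
    history = [(9, 50 + d) for d in (-2, -1, 0, 1, 2)] + \
              [(3, 5 + d) for d in (-1, 0, 1)]
    base = learn_baseline(history)
    print(is_spike(base, 9, 53), is_spike(base, 9, 90))  # False True
```

Detecting shift and drift works on the same principle, except that the deviation must persist (shift) or grow slowly over time (drift) rather than appear as a brief excursion.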
Route Analytics
As Chapter 3 mentions, many organizations have moved away from a simple hub-and-spoke network topology and have adopted either a some-to-many or a many-to-many topology. In networks that are large and have complex topologies, it is not uncommon for the underlying network infrastructure to change, experience instabilities, and become mis-configured. In addition, the network itself may be designed in a sub-optimal fashion. Any or all of these factors can have a negative impact on application performance. As a result, an organization that has a large, complex network needs visibility into the operational architecture and dynamic behavior of the network.

One of the many strengths of the Internet Protocol (IP) is its distributed intelligence. For example, routers exchange reachability information with each other via a routing protocol such as OSPF (Open Shortest Path First). Based on this information, each router makes its own decision about how to forward a packet. While this distributed intelligence is a strength of IP, it is also a weakness. In particular, because each router makes its own forwarding decision, there is no single repository of routing information in the network. The lack of a single repository of routing information is an issue because routing tables are automatically updated, and the path that traffic takes to go from point A to point B may change on a regular basis. These changes may be precipitated by a manual process, such as adding a router to the network or the mis-configuration of a router, or by an automated process such as automatically routing around a failure. In the latter case, the change might be particularly difficult to diagnose if there is an intermittent problem causing a flurry of routing changes, typically referred to as route flapping. Among the many problems created by route flapping is that it consumes a lot of the processing power of the routers and hence degrades their performance.

The variability of how the network delivers application traffic across its multiple paths over time can undermine the fundamental assumptions that organizations count on to support many other aspects of application delivery. For example, routing instabilities can cause packet loss, latency, and jitter on otherwise properly configured networks. In addition, alternative paths might not be properly configured for QoS; as a result, applications perform poorly after a failure. Most importantly, configuration errors that occur during routine network changes can cause a wide range of problems that impact application delivery. These configuration errors can be detected if planned network changes can be simulated against the production network.
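Simulating a planned change against a model of the network can be sketched as a shortest-path recomputation: compute the lowest-cost paths before and after a link change and see which destinations shift. The topology, link costs and helper names below are illustrative assumptions; commercial route analytics products work from live routing-protocol data (e.g., OSPF link-state advertisements) rather than a hand-built model.

```python
# Minimal sketch: Dijkstra shortest paths over a toy topology, computed
# before and after a planned link change, to show how simulation can
# reveal the routing impact of a change before it is made.

import heapq

def shortest_paths(graph, source):
    """graph: {node: {neighbor: cost}}. Returns lowest cost per node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

if __name__ == "__main__":
    topo = {
        "A": {"B": 1, "C": 5},
        "B": {"A": 1, "C": 1},
        "C": {"A": 5, "B": 1},
    }
    before = shortest_paths(topo, "A")   # A reaches C via B at cost 2
    # Planned change: take the A-B link out of service.
    changed = {"A": {"C": 5}, "B": {"C": 1}, "C": {"A": 5, "B": 1}}
    after = shortest_paths(changed, "A") # A now reaches C directly at cost 5
    print(before["C"], after["C"])  # 2 5
```

Even in this toy model, the simulation shows that the maintenance window would more than double the path cost from A to C, which is exactly the kind of consequence a route analytics tool surfaces before the change is committed.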
Factors such as route flapping can be classified as logical, as compared to a device-specific factor such as a link outage. However, both logical and device-specific factors impact application performance. To quantify how often a logical factor vs. a device-specific factor causes an application delivery issue, 200 IT professionals were given the following survey question: "Some of the factors that impact application performance and availability are logical in nature. Examples of logical factors include sub-optimal routing, intermittent instability or slowdowns, and unanticipated network behavior. In contrast, some of the factors that impact application performance and availability are device specific. Examples of device-specific factors include device or interface failures, a device out-of-memory condition or a failed link. In your organization, what percentage of the time that an application is either unavailable or is exhibiting degraded performance is the cause logical? Is the cause device specific?"

The responses to that question are contained in the middle column of the following table.

| | Percentage of Respondents | Percentage of Respondents Removing "Don't Know" |
|---|---|---|
| Less than 10% logical vs. 90% device specific | 19.5% | 26.8% |
| Up to 30% logical vs. 70% device specific | 22.1% | 30.4% |
| 50% logical, 50% device specific | 10.5% | 14.5% |
| 70% logical, 30% device specific | 11.6% | 15.9% |
| 90% logical, 10% device specific | 8.9% | 12.3% |
| Don't know | 27.4% | |

Table 7.8: Impact of Logical vs. Device-Specific Factors

As Table 7.8 shows, a high percentage of survey respondents answered "don't know." To compensate for this, the far right column of Table 7.8 reflects the responses of those survey respondents who provided an answer other than "don't know."

Logical factors are almost as frequent a source of application performance and availability issues as are device-specific factors.

SNMP-based management systems can discover and display the individual network elements and their physical or Layer 2 topology; however, they cannot identify the actual routes packets take as they transit the network. As such, SNMP-based systems cannot easily identify problems such as route flaps or mis-configurations.

The preceding section used the phrase network analytics as part of the discussion of proactive alarming. Network analytics and route analytics have some similarities. For example, each of these techniques relies on continuous, real-time monitoring. Whereas the goal of network analytics is to overcome the limitation of setting static performance thresholds, the goal of route analytics is to provide visibility, analysis and diagnosis of the issues that occur at the routing layer. A route analytics solution achieves this goal by providing an understanding of precisely how IP networks deliver application traffic. This requires the creation and maintenance of a map of network-wide routes and of all of the IP traffic flows that traverse these routes. This in turn means that a route analytics solution must be able to record every change in the traffic paths as controlled and notified by IP routing protocols. By integrating the information about the network routes and the traffic that flows over those routes, a route analytics solution can provide information about the volume, application composition and class of service (CoS) of traffic on all routes and all individual links. This network-wide routing and traffic intelligence serves as the basis for:

• Real-time monitoring of the network's Layer 3 operations from the network's point of view.
• Historical analysis of routing and traffic behavior, as well as for performing a root cause analysis.
• Modeling of routing and traffic changes and simulating post-change behavior.

The key functional components in a route analytics solution are:

• Listening to and participating in the routing protocol exchanges between routers as they communicate with each other.
• Computing a real-time, network-wide routing map. This is similar in concept to the task performed by individual routers to create their forwarding tables. However, in this case it is computed for all routers.
• Mapping NetFlow traffic data, including application composition, across all paths and links in the map.
• Monitoring and displaying routing topology and traffic flow changes as they happen.
• Detecting and alerting on routing events or failures as routers announce them, and reporting on correlated traffic impact.
• Correlating routing events with other information, such as performance data, to identify underlying cause and effect.
• Recording, analyzing and reporting on historical routing and traffic events and trends.
• Simulating the impact of routing or traffic changes on the production network.

One instance in which a route analytics solution has the potential to provide benefits to IT organizations occurs when the IT organization runs a complex private network. In this case, it might be of benefit to the IT organization to take what is likely to be a highly manual process of monitoring and managing routing and to replace it with a highly automated process. Another instance in which a route analytics solution has the potential to provide benefits to IT organizations is when those IT organizations use MPLS services provided by a carrier who uses a route analytics solution. One reason that a route analytics solution can provide value in MPLS networks is that, given the scale of a carrier's MPLS network, these networks tend to be very complex and hence difficult to monitor and manage. The complexity of these networks increases when the carrier uses BGP (Border Gateway Protocol), as BGP is itself a complex protocol. For example, a mis-configuration in BGP can result in poor service quality and reachability problems as the routing information is transferred between the users' CE (Customer Edge) routers and the service provider's PE (Provider Edge) routers.

Route analytics can also be useful in simulating and analyzing the network-wide routing and traffic impact of various failure scenarios, as well as the impact of planned network changes such as consolidating servers out of branch offices or implementing new WAN links or router hardware. The purpose of this simulation is to ensure that the planned and unplanned changes will not have a negative effect on the network.

Two hundred IT professionals were given the following question: "Sometimes logical problems such as routing issues are the source of application degradation and application outages. Which of the following describes how you resolve those types of logical issues?" Their answers are shown in Table 7.9.

| Approach | Percentage of Respondents |
|---|---|
| Lots of hard work – typically by digging deeply into each device | 38.7% |
| Employ specific tools such as route analytics | 24.9% |
| N/A or don't know | 19.9% |
| Waiting for it to happen again and trying to capture it in real time | 13.3% |
| Other | 3.3% |

Table 7.9: Resolving Logical Issues
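The network-wide routing map that a route analytics solution computes is conceptually the same shortest-path-first calculation an individual router performs over its link-state database, just repeated from every router's vantage point. A minimal sketch, assuming a link-state protocol such as OSPF in which all routers share one topology view (function and topology names are illustrative):

```python
import heapq

def spf(topology, source):
    """Dijkstra's shortest-path-first from one router over the shared
    link-state topology.  topology: {router: {neighbor: cost}}"""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue          # stale queue entry
        for nbr, cost in topology.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return dist

def network_wide_routing_map(topology):
    """What a route analytics appliance maintains: the SPF result from
    every router's point of view, not just one router's."""
    return {router: spf(topology, router) for router in topology}
```

Because the analytics appliance recomputes this map on every routing-protocol event, it can show which paths changed, when, and from whose perspective, which no single router's forwarding table can do.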
As Table 7.9 shows, many IT organizations still rely on laborious manual processes, or simply hope to be able to catch recurrences of issues for which there are no obvious physical/device explanations. This indicates that most network management toolsets lack the ability to address the logical issues that route analytics tools are useful for. In particular, the ability of route analytics to rewind the entire recorded history of network-wide routing and traffic helps network engineers to look into logical issues as if they were seeing them currently. This level of automation can greatly speed problem localization and root cause analysis. Since many logical problems exhibit symptoms only intermittently, getting to the root of these problems rather than hoping to solve them in the future can also help increase the overall stability of application delivery and performance.

One criterion that an IT organization should look at when selecting a route analytics solution is the breadth of routing protocol coverage. For example, based on the environment, the IT organization might need the solution to support protocols such as OSPF, IS-IS, EIGRP, BGP and MPLS VPNs. Another criterion is that the solution should be able to collect and correlate integrated routing and NetFlow traffic flow data. Ideally, this data is collected and reported on in a continuous, real-time fashion and is also stored in such a way that it is possible to generate meaningful reports that provide an historical perspective on the performance of the network. The solution should also be aware of both application and CoS issues, and be able to integrate with other network management components. In particular, a route analytics solution should be capable of being integrated with network-agnostic application performance management tools that look at the endpoint computers that are clients of the network, as well as with traditional network management solutions that provide insight into specific points in the network; i.e., devices, interfaces, and links.

Measuring Application Performance

Evaluating application performance has been done in traditional voice communications for decades. In particular, evaluating the quality of voice communications by using a Mean Opinion Score (MOS) is somewhat common.

The Mean Opinion Score is defined in "Methods for Subjective Determination of Voice Quality" (ITU-T P.800). As that title suggests, a Mean Opinion Score is the result of subjective testing in which people listen to voice communications and place the call into one of five categories. Table 7.10 depicts those categories and the numerical rating associated with each.

| MOS | Speech Quality |
|---|---|
| 5 | Excellent |
| 4 | Good |
| 3 | Fair |
| 2 | Poor |
| 1 | Bad |

Table 7.10: Mean Opinion Scores and Speech Quality

A call with a MOS of 4.0 or higher is deemed to be of toll quality. To increase objectivity, the ITU has developed another model of voice quality. Recommendation G.107 defines this model, referred to as the E-Model. The E-Model is intended to predict how an average user would rate the quality of a voice call. The E-Model calculates the transmission rating factor R based on transmission parameters such as delay and packet loss. Table 7.11 [18] depicts the relationship between R-values and Mean Opinion Scores.

| R-Value | Characterization | MOS |
|---|---|---|
| 90 – 100 | Very Satisfied | 4.3+ |
| 80 – 90 | Satisfied | 4.0 – 4.3 |
| 70 – 80 | Some Users Dissatisfied | 3.6 – 4.0 |
| 60 – 70 | Many Users Dissatisfied | 3.1 – 3.6 |
| 50 – 60 | Nearly All Users Dissatisfied | 2.6 – 3.1 |
| 0 – 50 | Not Recommended | 1.0 – 2.6 |

Table 7.11: Comparison of R-Values and Mean Opinion Scores

[18] Overcoming Barriers to High-Quality Voice over IP Deployments, Intel
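The E-Model relationship shown in Table 7.11 can be reproduced numerically. The sketch below uses widely published simplifications of G.107 rather than the full recommendation: a base R of 93.2, the standard R-to-MOS conversion polynomial, and assumed G.711 defaults (equipment impairment Ie = 0, packet-loss robustness Bpl = 4.3). Treat the constants as illustrative assumptions, not a complete implementation.

```python
def transmission_rating(delay_ms, loss_pct):
    """Simplified E-Model R-factor for a G.711 call.
    Assumed defaults: base R = 93.2, Ie = 0, Bpl = 4.3 (random loss)."""
    # Delay impairment Id: grows slowly, with an extra penalty past ~177 ms.
    idd = 0.024 * delay_ms
    if delay_ms > 177.3:
        idd += 0.11 * (delay_ms - 177.3)
    # Effective equipment impairment from random packet loss.
    ie_eff = (95 - 0) * loss_pct / (loss_pct + 4.3)
    return 93.2 - idd - ie_eff

def r_to_mos(r):
    """ITU-T G.107 mapping from R-factor to estimated MOS."""
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

The mapping reproduces the table's boundaries: an R of 90 converts to a MOS just above 4.3 ("Very Satisfied"), while 250 ms of one-way delay plus 1% loss drives R below 70, i.e., into the "Many Users Dissatisfied" band.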
A number of vendors have begun to develop application-performance metrics based on a somewhat similar approach to the ITU E-Model. For example, the Apdex Alliance [19] is a group of companies collaborating to promote an application-performance metric called Apdex (Application Performance Index), which the alliance states is an open standard that defines a standardized method to report, benchmark and track application performance.

Managing VoIP

The preceding section discussed the use of MOS to measure the quality of VoIP. This section will expand upon that discussion and describe what it takes to successfully manage VoIP.

VoIP Characteristics

VoIP poses particular challenges to the network for two primary reasons: the new and different protocols that VoIP requires, and its stringent availability and performance requirements. For instance, there are many different coding algorithms (codecs) available to handle the task of converting a conversation from analog to digital and back to analog again. Both sides of the call must use the same codec. The negotiation to ensure this is handled by another set of protocols involved with call setup, such as H.323, the Media Gateway Control Protocol (MGCP), Cisco's Skinny Client Control Protocol (SCCP), and increasingly, the Session Initiation Protocol (SIP).

A critical concern is that because of its real-time nature, VoIP almost universally relies on UDP rather than TCP. This poses particular problems for voice management, because unlike TCP, UDP does not offer any feedback information about whether or not packets that have been sent have been received. In addition, UDP does not have any flow control mechanisms to limit transmissions in the presence of congestion.

In addition to these protocol challenges, VoIP is extremely sensitive to a number of network parameters that have much less effect on transactional applications. For example, users expect 100% availability and immediate dial tone. In addition, fairly low levels of packet loss can severely impact voice quality. End-to-end delay is also critical. At about 150 ms, voice quality will likely begin to degrade, and beyond 250 ms it will almost certainly be unusable. These are levels of latency that are barely noticeable on most transactional applications. Jitter, which is the variation in arrival time from packet to packet, can also negatively impact voice quality. Most network management solutions don't measure jitter because, while this is a key parameter for VoIP, it has virtually no impact on the typical data application.

Managing VoIP Effectively

The combination of the user expectation of 100% uptime in voice, its sensitivity to network conditions, and its cross-domain organizational demands mandates an integrated approach to network management, both organizationally and technically. IT organizations should avoid the all-too-common fragmented approach to network management, which generally results from the incremental adoption of point solutions to address new problems on an ad hoc basis.

Instead, IT organizations should look for an integrated solution that relates voice-specific metrics such as MOS values to the underlying network behavior that influences them, and vice-versa. To deliver this integration, a voice management solution must, at a minimum, deliver information from three sources: call signaling, NetFlow, and SNMP, and, even more important, relate them one to another.

Call signaling (or call setup) is handled by one of a number of different protocols: either one of those standards noted above (H.323, MGCP, SCCP, or SIP), or a proprietary protocol. The ability of a solution to monitor call setup is critical. During call setup, the two ends of the conversation negotiate a common codec, establish the channels that will be used for transmitting and receiving, and generate a number of status codes. This information can be used to derive important measurements like delay to dial tone. Without this data, IT organizations won't know what went wrong if users, for example, start complaining that they can't get a dial tone.

Being able to monitor call signaling also implies being able to receive and integrate data from an IP PBX. This is particularly important for monitoring voice-specific metrics such as MOS. Relating this VoIP-specific data to network conditions requires network-specific data. Both NetFlow and RMON-2 data can give IT organizations insight into the protocol and class-of-service composition of the traffic. And, given the increasingly meshed nature of VoIP systems, the ability of NetFlow or RMON-2 to deliver data from many points in the network can be critical in managing VoIP.

SNMP is also a requirement. Not only does it deliver data on the health of the devices, but it can also be used to access data from other sources. For example, in a Cisco-based network, SNMP can give information from both the Cisco IP SLA and the Cisco Class-Based QoS (CBQoS) MIB. Cisco IP SLA generates synthetic transactions that can be used to emulate voice traffic across important links and derive metrics critical to understanding voice quality. The CBQoS MIB provides information about the class-based queuing mechanism in a Cisco router, enabling IT organizations to ensure that their critical traffic is being treated appropriately when bandwidth is in short supply. However, in order for IT organizations to get real-time scores for calls in progress, advanced monitoring tools that measure the delay, jitter, loss and MOS for actual voice calls are required.

Once all of this data has been collected, IT organizations must have a way of integrating it into a useful overview of voice and network performance. From the management console to the reports the solution generates, what is required is a holistic overview that can relate voice quality to network performance, and vice-versa. For example, the IT organization should be able to detect that MOS values are dropping on the link between HQ and the Los Angeles office and bring up management data from the appropriate devices to check the traffic composition on the link. The IT organization should be able to use this data to answer questions such as: Is the link being flooded by packets from a scheduled backup, or rogue traffic from an illicit application? What other critical applications are being affected?

It is worth pointing out that NetFlow data may not be sufficient to answer those questions. That follows because of the port 80 black hole that was discussed previously in this chapter. In particular, since NetFlow is port based, many of the NetFlow records will list port 80 as the application. As previously described, a growing number of applications use port 80.

[19] http://www.apdex.org/index.html
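The jitter metric discussed in this section is, in most VoIP monitoring tools, the interarrival jitter estimator defined in RFC 3550 (the RTP specification): the absolute difference between consecutive packets' transit times, smoothed with a gain of 1/16. A sketch of that calculation (the function name is illustrative; timestamps are assumed to share one time unit):

```python
def interarrival_jitter(samples):
    """RFC 3550 running jitter estimate.

    `samples` is a sequence of (rtp_timestamp, arrival_time) pairs in
    the same units.  For each pair of consecutive packets, the change
    in transit time feeds the smoothed estimate J += (|D| - J) / 16."""
    jitter = 0.0
    prev_transit = None
    for ts, arrival in samples:
        transit = arrival - ts          # one-way transit (with clock offset)
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter
```

Note that any constant clock offset between sender and receiver cancels out, which is why this estimator works without synchronized clocks; a perfectly paced stream yields a jitter of zero no matter how large the fixed delay.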
The Changing Network Management Function

Introduction

The preceding chapter discussed the organizational complexity and the process barriers associated with ensuring acceptable application performance. This chapter will describe the changing role of one of the key players in the application delivery function – the Network Operations Center (NOC). As part of that description, this chapter will examine the current and emerging role of the NOC and the attempt on the part of many NOCs to improve their processes, and will highlight the shift that most NOCs are making from focusing almost exclusively on the availability of networks to beginning to also focus on the performance of networks and applications. As mentioned, this chapter will describe how it is now somewhat common to have the NOC heavily involved in managing the performance of applications. As a result, this chapter will also examine how the NOC has to change in order to reduce the mean time to repair that is associated with application performance issues, and will detail the ways that IT organizations justify an investment in performance management.

Today's NOC

The Function of the NOC

The set of functions that NOCs perform varies widely amongst IT organizations. For example, The Management Systems Manager pointed out that the NOC in which she works supports the company's WAN and some LANs. They do not support LANs in those sites where the people in the sites believe that they can support the LAN more cost effectively themselves.

When it comes to how the NOC functions, one of the most disappointing findings is that:

In the majority of cases, the NOC tends to work on a reactive basis, identifying a problem only after it impacts end users.

The survey also asked the respondents about the most common type of event that causes NOC personnel to take action. The replies of the NOC-associated respondents who provided a response other than "don't know" are depicted in Figure 8.1. The data in Figure 8.1 indicates that roughly half the time either someone in the NOC or an automated alert causes the NOC to take action. This data, however, does not address the issue of whether or not this occurs before the user is impacted.

Figure 8.1: Events that Cause the NOC to Take Action

Perceptions of the NOC

The survey respondents were asked if they thought that working in the NOC is considered to be prestigious. The NOC-associated respondents [20] were evenly split on this issue. That was not the case for the non-NOC respondents [21]. By roughly a 2-to-1 margin, these respondents indicated that they do not think that working in the NOC is prestigious.

The survey respondents were asked a series of questions regarding senior IT management's attitude towards the NOC. The results are shown in Table 8.1.

| Our senior IT management believes that… | Agree/Tend to Agree | Disagree/Tend to Disagree |
|---|---|---|
| ...the NOC provides value to our organization. | 90.7% | 9.3% |
| ...the NOC is a strategic function of IT. | 87.9% | 12.1% |
| ...the NOC is capable of resolving problems in an effective manner. | 82.4% | 17.6% |
| ...the NOC will be able to meet the organization's requirements 12 months from now. | 81.4% | 18.6% |
| ...the NOC works efficiently. | 80.6% | 19.4% |
| ...the NOC meets the organization's current needs. | 71.9% | 28.1% |

Table 8.1: IT Management's Perception of the NOC

Overall, the data in Table 8.1 is positive. There are, however, some notable exceptions to that statement.

In over a quarter of organizations, the NOC does not meet the organization's current needs.

[20] NOC-associated respondents refers to survey respondents who work in the NOC.
[21] Non-NOC respondents refers to survey respondents who do not work in the NOC.
As was discussed in Chapter 2, the IT organization's inability to identify application degradation before it impacts the user makes the IT organization look like bumbling idiots. The Medical Supplies CIO elaborated on that and stated that the biggest question he gets from the user is, "Why don't you know that my system is down? Why do I have to tell you?" He said that the fact that end users tend to notice a problem before IT does has the effect of eroding the users' confidence in IT in general.

The conventional wisdom in our industry is that NOC efficiency is reduced because of the silos that exist within the NOC. In this context, silos means that the workgroups have few common goals, language, processes and tools. The survey respondents validated that conventional wisdom.

Just under half of NOCs are organized around functional silos.

A majority of NOCs use many management tools that are not well integrated.

The Manufacturing Analyst said that having management tools that are not well integrated "is a fact of life". He added that his organization has a variety of point products and does not currently have a unified framework for these tools.

Where Does the NOC Spend Most of Its Time?

To identify the areas in which NOC personnel spend most of their time, the survey contained three questions:

• During the past 12 months, our NOC personnel have spent the greatest amount of time addressing issues with…
• During the past 12 months, our NOC personnel have spent the second greatest amount of time addressing issues with…
• During the past 12 months, our NOC personnel have seen the greatest increase in time spent addressing issues with…

where each question contained a number of possible answers. Table 8.2 shows the answers of the NOC-associated respondents.

| | Greatest Amount of Time | Second Greatest Amount of Time | Greatest Increase in Time |
|---|---|---|---|
| Applications | 39.1% | 16.9% | 45.0% |
| Servers | 14.1% | 21.5% | 21.7% |
| LAN | 10.9% | 15.4% | 5.0% |
| WAN | 23.4% | 30.8% | 11.7% |
| Security | 9.4% | 6.2% | 10.0% |
| Storage | 3.1% | 9.2% | 6.7% |

Table 8.2: Where the NOC Spends the Most Time

There are many conclusions that can be drawn from the data in Table 8.2, including:

NOC personnel spend the greatest amount of time on applications, and that is a relatively new phenomenon.

NOC personnel spend an appreciable amount of their time supporting a broad range of IT functionality.

The Medical Supplies CIO said that the percentage of time his organization spends monitoring and troubleshooting network problems varies from month to month, but is probably in the range of ten to twenty percent.

What Do NOC Personnel Monitor?

The Survey Respondents were asked four questions about what NOC personnel in their organization monitor; the results from NOC-associated respondents are shown in Figure 8.2.
Figure 8.2: What the NOC Monitors

The Manufacturing Analyst said that his organization focuses on the availability of networks. He added, however, that there is a project underway to change how the NOC functions. The goal of the project is to create a NOC that is more proactive and which focuses both on performance and availability.

One of the conclusions that can be drawn from the data in Figure 8.2 is:

The NOC is almost as likely to monitor performance as it is to monitor availability. In addition, while there is still more of a focus in the NOC on networks, there is a significant emphasis on applications.

What Else Does the NOC Do?

We also asked the Survey Respondents about other tasks or responsibilities that NOC personnel are involved in. Figure 8.3 shows the responses for both NOC-associated and non-NOC personnel. The most obvious conclusion that can be drawn from Figure 8.3 is that NOC personnel are involved in myriad tasks beyond simple monitoring. As shown, NOC personnel are typically involved in traditional network activities such as configuration changes, the selection of new network technologies and the selection of network service providers. The NOC is less likely to be involved in application rollout and the selection of security functionality.

Figure 8.3: Tasks that the NOC is Involved In

The responsibilities that are highlighted in Figure 8.3 are where NOC-associated and non-NOC respondents differed most in their responses. Interestingly, the areas where there were the greatest differences are all traditional networking activities.

The Use of ITIL (IT Infrastructure Library)

There has been significant discussion over the last few years about using a framework such as ITIL to improve network management practices. To probe the use of ITIL, the survey respondents were asked if their organization either now has an IT service management process such as ITIL in place, or intended to adopt such a process within the next 12 months. The majority of respondents (62%) indicated that their organization did have such a process in place. Of those respondents who did not, a similar percentage (63%) believed that their organization would put such a process in place within the next 12 months. The fact that 86% of respondents stated that their organization either had or would have within 12 months a service management process in place indicates the emphasis being placed within the NOC on improving its processes.

While the survey data made it look like there was very strong interest in ITIL, the interviewees were not as enthusiastic. For example, The Medical Supplies CIO said that
they tried to get involved using ITIL to improve some of their processes. However, while he does not disagree with the benefits promised by ITIL, he finds that it seems too theoretical, and he lacks the resources to get deeply involved with it.

The Manufacturing Analyst stated that his organization has begun to use ITIL but that they "do not live by the [ITIL] book." He added his belief that ITIL will make a difference, but probably not that big of a difference.

The Management Systems Manager stated that their company has done a lot of ITIL training and that there is some interest in moving more in the direction of setting up a CMDB (Configuration Management Data Base). However, their ability to improve their processes is limited by their organizational structure. For example, the central IT group has set up a change management process that calls for them to get together once a week to review changes that cut across organizational silos. However, some sites control their own LANs. If one of these sites makes a change to their LAN, it does not go through the change management process set up by the central IT group.

There is a lot of interest in ITIL, but it is too soon to determine how impactful the use of ITIL will be.

Routing Troubles

The vast majority of organizations have at least a simple escalation process in place for problem response. In particular, over 90% of Survey Respondents indicated that their organization has a help desk that assists end users, and over 80% agree that the help desk does a good job of routing issues that it cannot resolve to whatever group can best handle them. It should be noted that of the latter group of respondents (those agreeing), better than three-quarters stated that the help desk typically routes issues that it cannot resolve to the NOC. One of the reasons that the help desk routes so many calls to the NOC is the following:

In the vast majority of instances, the assumption is that the network is the source of application degradation.

The Manufacturing Analyst stated that in his company, if there is an IT problem, the tendency of the user is to contact the NOC because "We have always had the tools to identify the cause of the problems". He added that the approach of sending most troubles directly to the NOC tends to increase the amount of time that it takes to resolve problems because of the added time it takes to show that the network is not at fault.

The Management and Security Manager stated that as recently as a year ago his organization had a very defensive approach to operations, with a focus on showing that the network was not the source of a trouble. His motto is "I don't care what the problem is, we are all going to get involved in fixing it." When asked if his motto was widely accepted within the organization, he replied, "Some of the mentality is changing, but this is still not the norm."

The Survey Respondents were asked to indicate their degree of agreement with the statement: "Our NOC personnel not only identify problems, but are also involved in problem resolution." Their responses are depicted in Figure 8.4. The data in Figure 8.4 is further evidence of the fact that NOC personnel do a lot more than just monitor networks.

In the majority of instances, the NOC gets involved in problem resolution.

Figure 8.4: Role of the NOC in Problem Resolution
Change in the NOC

Factors Driving Change

As shown in Table 8.1, over a quarter of the total base of survey respondents indicated that the NOC does not meet the organization's current needs. This level of dissatisfaction with the NOC is in line with the fact that, as shown in Figure 8.5, almost two thirds of the respondents indicated that their organization would attempt to make significant changes in their NOC processes within the next 12 months.

Figure 8.5: Interest in Changing NOC Processes

The survey respondents were asked to indicate which factors would drive their NOC to change within the next 12 months. Their responses are shown in Figure 8.6. One clear conclusion that can be drawn from the data in Figure 8.6 is that a wide range of factors are driving change in the NOC. Given that NOC personnel spend the greatest amount of time on applications, it is not at all surprising that:

The top driver of change in the NOC is the requirement to place greater emphasis on ensuring acceptable performance for key applications.

And a related driver, the need for better visibility into applications, is almost as strong a factor causing change in the NOC.

Figure 8.6: Factors Driving Change in the NOC

As shown in Table 8.2, NOC personnel do not spend a lot of their time today on security. However, that may change in the next year, as roughly half of the Survey Respondents indicated that combining network and security operations would impact their NOC over the next 12 months. In addition, two thirds of the Survey Respondents also indicated that a growing emphasis on security would impact their NOC over the next 12 months.

The Medical Supplies CIO stated that in order to place greater emphasis on ensuring acceptable performance
for their key applications, they have formed an application delivery organization. He added that roughly a year ago they first began to use some monitoring tools and that these tools “opened our eyes to a lot of things that can degrade and not cause any of the traditional green lights to turn yellow or red.” He stated that on a going-forward basis he wants to place more emphasis on security, although he thought it would be difficult to combine security and network operations into a single group. His rationale was that security operations involves so much more than just networks.

Factors Inhibiting Change

Particularly within large organizations, change is difficult. To better understand the resistance to change, we asked the Survey Respondents to indicate what factors would inhibit their organization from improving their NOC. Their responses are shown in Figure 8.7.

Figure 8.7: Factors Inhibiting Change in the NOC

It was not surprising that the two biggest factors inhibiting change are the lack of personnel resources and the lack of funding. This is in line with the general trend whereby IT budgets are increasing on average by only single-digit amounts and headcount is often being held flat. It is also not surprising that internal processes are listed as a major factor inhibiting change. The siloed NOC, the interest in ITIL, and the need to make significant changes to NOC processes have been constant themes throughout this chapter.

The lack of management vision and the NOC’s existing processes are almost as big a barrier to change as are the lack of personnel resources and funding.

The Management Systems Manager stated that her organization monitors network availability but does not monitor network performance. She added that her organization would like to monitor performance but that “It is a resource issue. The only way we can monitor performance is if we get more people.” On a related issue, the Management Systems Manager said that due to relatively constant turnover in personnel, “Management vision changes every couple of years. Some managers have been open to monitoring performance while others have not believed in the importance of managing network performance.”

The Next-Generation NOC

Given the diversity of how IT organizations currently deploy a NOC, it is highly unlikely that there is a single description of a next-generation NOC that would apply to all organizations. While it is not possible to completely characterize the next-generation NOC, there are some clear trends in terms of how NOCs are evolving. One of these trends is the shift away from having NOC personnel sitting at screens all day waiting for green lights to turn yellow or red. In particular, over a quarter of the NOC-associated respondents indicated that their company had “eliminated or reduced the size of our NOC because we have automated monitoring, problem detection and notification.” It is important to note that in all likelihood a notably higher percentage of organizations had implemented automated monitoring but had not eliminated or reduced the size of their NOC.

Another clear trend is the focus on the performance of both networks and applications. In many cases, this is
an example of how tasks that used to be performed by tier 2 and tier 3 personnel are now being performed by NOC personnel.

Given where NOC personnel spend their time, the NOC should be renamed the Application Operations Center (AOC).

Other trends are less clear. For example, there is wide admission that current NOC processes are often ineffective. There is also widespread interest in using ITIL to develop more effective processes. However, it is unlikely that the typical NOC will significantly improve its processes in the next 12 to 18 months. Another somewhat fuzzy trend is the integration of network and security operations. There is no doubt that there will be growing interaction between these groups. However, within the next two years only a minority of IT organizations will fully integrate the two.

Rethinking MTTR for Application Delivery

The Changing Concept of MTTR

Mean Time To Repair (MTTR), the mean or average time that it takes the IT organization to repair a problem, is a critical metric for measuring performance. The understanding of MTTR, however, is changing because, as explained in the preceding section, the responsibility of the network organization is expanding to include the ongoing management of application performance. This section describes those changes and their impact on network management.

The basic, three-step process for troubleshooting is not changing:
• Problem Identification
• Problem Diagnosis
• Solution Selection and Repair

As will be explained in this section, how these steps apply to a traditional network management task, such as fault management, is significantly different from how they apply to managing application performance. The IT professionals whose interviews are used in this section of the handbook described a range of approaches to MTTR.
• The Manufacturing Specialist: The organization does not officially measure MTTR; it only estimates it.
• The Telecommunications Manager: The organization pays a lot of attention to MTTR, but it applies only to the availability, not the performance, of the infrastructure and the applications.
• The Education Director: The organization does measure MTTR, but only for fault management and not for application performance. He also stated that the MTTR is currently around 3 or 4 hours, but that his management is getting more demanding and wants him to reduce it.
• The Financial Engineer: The organization measures MTTR for both availability and application performance, and it computes separate MTTR metrics for different priorities of problems. For example, the highest priority is a problem that causes a large number of users to be unable to access the information or applications that they need to do their jobs. The next highest priority is a problem that results in a large number of users being able to access the information and applications they need, but in a degraded fashion.

There is a very wide range of approaches relative to how IT organizations approach MTTR.
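As a sketch of the Financial Engineer’s approach, the following computes a separate MTTR figure for each problem priority. The ticket data and the priority labels (P1 for loss of access, P2 for degraded access) are hypothetical, not taken from any specific trouble-ticketing tool.

```python
from datetime import datetime
from statistics import mean

# Hypothetical trouble tickets: (priority, opened, resolved).
# P1 = many users have no access; P2 = many users have degraded access.
tickets = [
    ("P1", datetime(2008, 2, 4, 9, 0), datetime(2008, 2, 4, 11, 30)),
    ("P1", datetime(2008, 2, 5, 14, 0), datetime(2008, 2, 5, 15, 0)),
    ("P2", datetime(2008, 2, 6, 8, 0), datetime(2008, 2, 6, 16, 0)),
]

def mttr_by_priority(tickets):
    """Mean time to repair, in hours, computed separately per priority."""
    by_prio = {}
    for prio, opened, resolved in tickets:
        hours = (resolved - opened).total_seconds() / 3600.0
        by_prio.setdefault(prio, []).append(hours)
    return {prio: mean(hours) for prio, hours in by_prio.items()}

print(mttr_by_priority(tickets))  # {'P1': 1.75, 'P2': 8.0}
```

The “clock” here follows the Financial Engineer’s rule: it starts when the ticket is opened and stops when the problem is resolved.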
MTTR and Application Performance Management

The Manufacturing Specialist commented that within his organization, managing application performance is a shared responsibility. He noted, however, that because of the success they have had with the management tools they have deployed, other organizations call them seeking help with resolving a problem. The Manufacturing Specialist also pointed out that while adding the capability of managing application performance is important, “you cannot lose track of existing issues.” He added that fault management is still important and that they “are still tracking IOS bugs.”

The Telecommunications Manager stated that they have begun to measure application degradation. He added that application degradation is taken more seriously inside of his company if he can quantify how much revenue was lost as a result of the degradation. The Telecommunications Manager also stated that his organization has begun to become ISO 9001 certified. He expects that their level of certification will increase as the demand for better application performance increases.

The Manufacturing Specialist reinforced the importance of effective processes. He stated that in order to get better at managing the performance of applications, his organization has implemented ITIL-based processes. He stated that these processes are “a huge part of our success” and that they force you to “understand what you are doing and why you are doing it.”

Improving processes (including training and the development of cross-domain responsibility) is an important part of improving the MTTR of application performance.

Problem Identification

Like every component of network management, application performance management can be done either proactively or reactively. In a proactive approach, the network management organization attempts to identify and resolve problems before they impact end users. In a reactive approach, network management organizations respond to the fault once end users have been impacted.

With fault management, it is relatively easy to identify that a fault exists, since faults often lead to a readily noticeable outage. As well, it is fairly easy to set alarms indicating the failure of a component. By contrast, identifying application degradation is much more difficult. For example, as previously noted, most IT organizations do not have objectives for the performance of even their key, business-critical applications, and few monitor the end-to-end performance of their applications. As a result, the issue of whether or not an application has degraded is often highly subjective.

The Financial Engineer stated that when a user calls in and complains about the performance of an application, a trouble ticket is opened. He also stated that, “The [MTTR] clock starts ticking when the ticket is opened and keeps ticking until the problem is resolved.” In his organization there are a couple of meanings of the phrase “the problem is resolved.” One of them is that the user is no longer affected. Another meaning is that the source of the problem has been determined to be an issue with the application. In these cases, the trouble ticket is closed and they open what they refer to as a bug ticket.

The Financial Engineer added that in some cases, “The MTTR can get pretty large.” He added that roughly 60% of application performance issues take more than a day to resolve. In those cases where the MTTR is getting large, his organization forms a group that they refer to as a Critical Action Team (CAT). The CAT is comprised of technical leads from multiple disciplines who come together to resolve difficult technical problems.

Problem Diagnosis

In the case of fault management, the focus of diagnosis is to determine which component of the infrastructure is not working. Part of the difficulty of diagnosing the cause
of an outage is that a single fault can cause a firestorm of alarms. Although one should not understate the difficulty of filtering out extraneous alarms to find the defective component, it is easier to identify the component of the infrastructure that is not functioning than it is to identify the factor that is causing an application to perform badly. One of the reasons that it is so difficult to diagnose the cause of application degradation is that, as discussed in Chapter 7, each and every component of IT could cause the application to perform badly. This includes the network, the servers, the database and the application itself. This means that unlike fault management, which tends to focus on one technology and on one organization, diagnosing the cause of application degradation crosses multiple technology and organizational boundaries. In general, most IT organizations find it difficult to solve problems that cross multiple technology and organizational boundaries.

The network is usually not the source of application performance degradation, although that is still the default assumption within many IT organizations.

The Financial Engineer stated that in his last company, 90% of issues were originally identified as being network issues even though the reality was that only 10% of issues actually were network-related. He pointed out that incorrectly assuming that the majority of issues are network-related has the effect of increasing the amount of time it takes to accurately diagnose the problem. He recommended that when a problem is called into the help desk, the person calling in should be encouraged to describe the symptoms of the problem in detail, and not just what they think the source of the problem is.

Reducing the Time to Diagnose

Given that it can take a long time to diagnose an application performance problem, we asked the survey respondents to indicate the average length of time it took to diagnose performance problems before and after purchasing their chosen application performance management solution. Their responses are shown in Table 8.3.

Time to Diagnose                        Before Solution    After Solution
Less than 1 hour                              2.8%             32.8%
More than 1 hour, but less than 3            24.3%             44.3%
More than 3 hours, but less than 5           20.8%             12.6%
More than 5 hours, but less than 8           19.4%              5.2%
More than 8 hours, but less than 24          14.6%              1.1%
More than 24 hours                           18.1%              4.0%

Table 8.3: Length of Time to Diagnose Problem

The results displayed in Table 8.3 are dramatic. For example, before deploying the solution it was rare that a performance problem was diagnosed in less than one hour. After deploying the solution, almost a third of all performance problems are diagnosed in less than one hour. Analogously, before deploying the solution, almost a third of all performance problems took more than 8 hours to diagnose. After deploying the solution, only five percent take that long.

The Manufacturing Specialist noted that 10% of issues can take longer than a day to resolve and that some of them can go unresolved for months. He said that one problem they had recently involved the MAPI (Messaging Application Programming Interface) protocol. As it turned out, every 32 milliseconds the protocol would retransmit volumes of information. This took a long time to identify. He added that, “When something is fundamentally wrong, it can take a long time to identify and fix.”
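The survey claims drawn from Table 8.3 can be checked directly by summing the relevant buckets; the short bucket labels below are just shorthand for the ranges in the table.

```python
# Diagnosis-time distribution from Table 8.3, as percent of problems
# per bucket, before and after deploying the management solution.
buckets = ["<1h", "1-3h", "3-5h", "5-8h", "8-24h", ">24h"]
before = [2.8, 24.3, 20.8, 19.4, 14.6, 18.1]
after = [32.8, 44.3, 12.6, 5.2, 1.1, 4.0]

def share_over_8_hours(dist):
    """Percent of problems that took more than 8 hours to diagnose."""
    return round(dist[buckets.index("8-24h")] + dist[buckets.index(">24h")], 1)

print(share_over_8_hours(before))  # 32.7 -- "almost a third" before
print(share_over_8_hours(after))   # 5.1 -- "only five percent" after
```

Both columns sum to 100%, and the two headline claims in the text follow directly from the tail buckets.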
Choosing and implementing the proper application performance management solution can greatly reduce MTTR and improve cooperation between different IT teams.

Solution Selection and Repair

In the case of fault management, there typically is no solution selection step. In particular, once it has been determined which component has failed, the solution is obvious: fix that component.

The situation is entirely different with managing application performance, because the component of IT that is causing the application to degrade may not be the component that gets fixed or replaced. For example, as was previously described, sometimes the way the application was written will cause the application to perform badly. However, rewriting the application may not be an option, particularly if the application was acquired from a third party. In that case, the IT organization will have to implement a work-around to compensate for how the application was written.

In an analogous fashion, the repair component of fault management differs somewhat from the repair component of application management. In the case of fault management, once you replace the defective part you fully expect the problem to be fixed. In the case of managing application performance, once you implement the chosen solution, you are less sure that the problem will go away. As a result, in some instances the IT organization has to repeat the problem diagnosis as well as the solution selection and repair processes.

Reducing MTTR requires both credible tools and an awareness of and attention to technical and non-technical factors. In many instances it can be as much a political process as a technological one.

Demonstrating the Value of Performance Management

Kubernan research indicates that 75% of IT organizations need to cost-justify an investment in performance management. There is also some anecdotal evidence that the way that IT organizations perform this cost justification is changing. For example, the IT Services Director commented that at one time his organization could justify an investment in management tools based just on their intuition that the tool would pay for itself. That is no longer the case. In fact, his organization now has a very formal process for evaluating return on investment (ROI). The IT Services Director added that the depth of the analysis management expects depends in part on the cost and scope of the project. He pointed out that as part of the analysis they typically have to answer questions such as, “If you buy this tool, what tool or tools will be retired?” “What is the decommissioning cost?” “What staff costs and productivity enhancements can be anticipated?” “What are the related maintenance and training costs?”

The vast majority of IT organizations must cost-justify an investment in performance management.

Over a third of the IT organizations that are required to cost-justify an investment in performance management must perform the cost-justification both before and after implementing the solution. The IT Services Director commented that his organization has to cost-justify a solution prior to purchasing it and then go back after implementation and quantify the actual impact of the solution. He stated that this process has the effect of increasing the credibility of the IT organization.

Cost-justifying an investment in performance management can be a complex task.

In a recent survey, the respondents were asked to indicate which techniques they had used to justify an investment in performance management. Their responses are shown in Figure 8.8.
Figure 8.8: Techniques to Cost-Justify Performance Management

As shown in Figure 8.8, myriad techniques are used to cost-justify an investment in performance management. Some of these techniques focus on hard savings while others focus on somewhat softer savings.

No one approach to cost-justifying performance management works in all environments, nor all the time in a single environment.

ROI Based on Hard Savings

In most cases hard savings, a reduction in the money that will leave the company as a result of an investment, is the easiest way to get management approval for any IT investment. The typical ROI analysis involves an IT investment that will result in a monthly savings, in which case the usual financial metric is the payback period: the amount of time before the initial investment is recovered. For example, if a one hundred thousand dollar investment in IT results in a monthly savings of ten thousand dollars, the payback period is ten months.

Another common metric is the classic rate of return on the investment 22. For example, assume that an IT organization invests $120,000 in a new performance management solution and that this results in a monthly savings of $10,000 for a period of two years. The investment has a payback period of one year, and after two years there is a total savings of $240,000, which is equivalent to a 42% rate of return.
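The payback-period arithmetic in the two examples above can be checked directly. (The 42% rate-of-return figure follows from the method in the cited reference and is not recomputed here; only the simpler payback and total-savings figures are.)

```python
def payback_period_months(investment, monthly_savings):
    """Months until cumulative savings recover the initial investment."""
    return investment / monthly_savings

def total_savings(monthly_savings, months):
    """Cumulative (undiscounted) savings over the life of the solution."""
    return monthly_savings * months

# The $100K / $10K-per-month example: payback in ten months.
print(payback_period_months(100_000, 10_000))  # 10.0

# The $120K performance management example: payback in one year,
# with $240K of total savings over the two-year life of the solution.
print(payback_period_months(120_000, 10_000))  # 12.0
print(total_savings(10_000, 24))               # 240000
```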
The Cost of Downtime

Another way to demonstrate the value of performance management involves the cost of downtime. For instance, the author used to work for Digital Equipment Corporation (DEC). When he was at DEC, if communications with one of DEC’s just-in-time manufacturing plants was lost, it was widely accepted that the cost to DEC was roughly one million dollars of revenue per hour. As a result, it was often possible to justify making IT investments in order to minimize the probability of losing communications with any of those plants. In contrast, while losing communications to one of DEC’s administrative buildings was considered a major inconvenience, it was not seen as a situation that resulted in the company losing revenue. As a result, it was much more difficult to justify an IT investment to minimize the probability of losing communications with any of DEC’s administrative buildings.

The discussion of DEC highlights the fact that in order to use the cost of downtime to cost-justify acquiring a performance management solution, there has to be a widely accepted cost of downtime for at least some component of the IT infrastructure. One company that has a widely accepted cost of downtime is a US-based casino that cannot be mentioned by name in this handbook. Not too long ago, the casino had an intermittent LAN problem that would take parts of their slot machine floor off line for repair. Before they implemented a performance management solution, it took four hours to find the source of the problem, but after the solution was implemented it took only one hour. According to the then network manager at the casino, “Reducing a four-hour down time to one hour is worth a minimum of $400K to us.”

The cost of downtime varies widely between companies and can also vary widely within a given company.

Another way to demonstrate the value of performance management is by showing that reducing downtime not only protects a company’s revenue stream, it also protects the productivity of employees. For example, assume that the one thousand employees in the customer service organization at a hypothetical company have an average loaded salary of $50/hour. If the jobs of these employees require that they constantly access applications, it could be argued that an hour’s outage costs the company $50,000 in lost productivity each time it occurs. The success of this argument depends in large part on how much of a productivity loss senior managers actually ascribe to a one-hour outage.

In the preceding examples, the term downtime literally meant the application was not available. It is also possible to demonstrate the value of performance management based on the degradation of an application. For example, consider a company that uses the SD (Sales and Distribution) component of SAP for sales order entry. If the SD component runs slowly, the company will see an impact on both productivity and revenues. Productivity declines because the company’s sales organization has to wait for the SD component to respond. Revenues decline when customers get irritated enough to take their business elsewhere. A performance management solution that can quickly identify the cause of such problems should be relatively easy for that company to cost-justify.

22 Computing the ROI of an IT Investment, Jim Metzler, http://www.webtorials.com/main/resource/papers/metzler/pres1/index.htm
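The downtime arguments above reduce to simple arithmetic. The following is a minimal sketch using figures from the text (DEC’s roughly $1M-per-hour manufacturing plants and the hypothetical $50/hour customer service staff); the function names and the assumption that an outage fully blocks work are illustrative.

```python
def outage_productivity_cost(employees, loaded_hourly_rate, outage_hours):
    """Lost-productivity cost of an outage, assuming work is fully blocked."""
    return employees * loaded_hourly_rate * outage_hours

# The hypothetical customer service organization: 1,000 employees at a
# $50/hour loaded salary lose $50,000 of productivity per one-hour outage.
print(outage_productivity_cost(1_000, 50, 1))  # 50000

def value_of_faster_diagnosis(downtime_cost_per_hour, hours_before, hours_after):
    """Value of reducing repair time, given an accepted cost of downtime."""
    return downtime_cost_per_hour * (hours_before - hours_after)

# At DEC's widely accepted $1M-per-hour figure for a manufacturing plant,
# cutting a four-hour outage to one hour would be worth $3M.
print(value_of_faster_diagnosis(1_000_000, 4, 1))  # 3000000
```

As the text notes, the hard part is not the arithmetic but getting senior managers to agree on the per-hour figures that go into it.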
Control

Introduction

To effectively control both how applications perform and who has access to which applications, IT organizations must be able to:
• Affect the routing of traffic through the network.
• Enforce company policy relative to what devices can access the network.
• Classify traffic based on myriad criteria.
• Prioritize traffic that is business critical and delay sensitive.
• Perform traffic management and dynamically allocate network resources.
• Identify and control the traffic that enters the IT environment over the WAN.
• Provide virtualized instances of key IT resources.
Route Control

One challenge facing IT organizations that run business-critical applications on IP networks is the variability in the performance of those networks. In particular, during the course of a day, both private and public IP networks exhibit a wide range of delay, packet loss and availability 23. Another challenge is the way that routing protocols choose a path through a network. In particular, routing protocols choose the least-cost path through a network. The least-cost path through a network is often computed to be the path with the least number of hops between the transmitting and receiving devices. Some sophisticated routing protocols, such as OSPF, allow network administrators to assign cost metrics so that some paths through the network are given preference over others. However it is computed, the least-cost path through a network does not necessarily correspond to the path that enables the optimum performance of the company’s applications.

A few years ago, organizations began to deploy functionality referred to as route control. The goal of route control is to make more intelligent decisions relative to how traffic is routed through an IP network. Route control achieves this goal by implementing a four-step process. Those steps are:

1. Measurement
Measure the performance (i.e., availability, delay, packet loss, and jitter) of each path through the network.

2. Analysis and Decision Making
Use the performance measurements to determine the best path. This analysis has to occur in real time.

3. Automatic Route Updates
Once the decision has been made to change paths, update the routers to reflect the change.

4. Reporting
Report on the performance of each path as well as the overall route optimization process.

SSL VPN Gateways

The SSL protocol 24 is becoming increasingly popular as a means of providing secure Web-based communications to a variety of users, including an organization’s mobile employees. Unlike IPSec, which functions at the network layer, SSL functions at the application layer and uses encryption and authentication as a means of enabling secure communications between two devices, which typically are a web browser on the user’s PC or laptop and an SSL VPN gateway deployed in a data center location. SSL provides flexibility in allowing enterprises to define the level of security that best meets their needs. Configuration choices include:

• Encryption: 40-bit or 128-bit RC4 encryption
• Authentication: Username and password (such as RADIUS), username and token + PIN (such as RSA SecurID), or X.509 digital certificates (such as Entrust or VeriSign)

All common browsers such as Internet Explorer include SSL support by default, but not all applications do. This necessitates either upgrading existing systems to support SSL or deploying an SSL VPN gateway in the data center. One of the purposes of an SSL VPN gateway is to communicate directly with both the user’s browser and the target applications and enable communications between the two. Another purpose of the SSL VPN gateway is to control both access and actions based on the user and the endpoint device.

23 Assessment of VoIP Quality over Internet Backbones, IEEE INFOCOM, 2002
24 IPSec vs. SSL: Why Choose?, http://www.securitytechnet.com/resource/rsc-center/vendor-wp/openreach/IPSec_vs_SSL.pdf
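The measurement and analysis steps (steps 1 and 2) of the four-step route control process described above can be sketched as follows. The path names, measurements and scoring weights are illustrative assumptions, not taken from any particular product, and a real solution would go on to push route updates and produce reports (steps 3 and 4).

```python
def score(path):
    """Combine delay, loss and jitter into one figure; lower is better.

    An unavailable path is disqualified outright. The weights (100 per
    percent of loss, 2 per ms of jitter) are arbitrary illustrations.
    """
    if not path["available"]:
        return float("inf")
    return path["delay_ms"] + 100 * path["loss_pct"] + 2 * path["jitter_ms"]

def best_path(paths):
    """Step 2: pick the best path from the latest measurements (step 1)."""
    return min(paths, key=lambda p: (score(p), p["name"]))

paths = [
    {"name": "ISP-A", "available": True, "delay_ms": 80, "loss_pct": 0.5, "jitter_ms": 5},
    {"name": "ISP-B", "available": True, "delay_ms": 60, "loss_pct": 2.0, "jitter_ms": 10},
    {"name": "ISP-C", "available": False, "delay_ms": 40, "loss_pct": 0.0, "jitter_ms": 1},
]

print(best_path(paths)["name"])  # ISP-A (score 140 beats ISP-B's 280)
```

Note that ISP-B has the lowest delay of the available paths but loses on packet loss, which is exactly why the least-cost or fewest-hops path is not necessarily the best-performing one.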
Some of the criteria that IT organizations should use when choosing an SSL VPN gateway include that the gateway should be:
• Easy to deploy, administer and use
• Low cost over the lifecycle of the product
• Transparent
• Capable of supporting non-traditional devices; e.g., smartphones and PDAs
• Able to check the client’s security configuration
• Able to provide access to both data and the appropriate applications
• Highly scalable
• Capable of supporting granular authorization policies
• Able to support performance-enhancing functionality such as caching and compression
• Capable of providing sophisticated reporting

Virtualization

Until recently, data centers were designed in such a way that they provided isolated silos of resources. That means that computing and storage resources were available only to the application that they were deployed to support. While this design was easy to support from the perspective of being able to control the IT resources, it led to significant under-utilization of the resources. In particular, in a legacy data center, server utilization is typically between 10 and 30 percent and storage utilization is often less than 20 percent.

Increasing the utilization of computing and storage resources can dramatically reduce the TCO associated with a data center. To achieve this goal, many IT organizations are adopting a data center design that is based on providing virtual server and storage resources that are available to all applications. This design, however, creates some significant control challenges for the network operations group. For example, network operations groups are now challenged to ensure that the virtual resources are utilized enough that the TCO is significantly reduced. However, the network operations groups also need to control the virtual resources to ensure that they are not utilized so heavily that performance is impacted.

Another form of virtualization that is used to control IT resources is application virtualization, both client side and server side. Client-side application virtualization enables applications to be managed in a centralized application hub but streamed to the user’s machine and run in a controlled, isolated environment. As a result, applications become an on-demand service. Technology such as caching is often employed to increase the availability of the application.

In a server-side application virtualization environment, the user interface (logical layer) is abstracted from the application processing (physical layer) that occurs on a centralized, secured server. This technology is often used for delivering client/server applications because it mitigates some of the complexities of deploying, managing, updating and securing client software on each individual user’s access device. Instead, a single instance of the client application is installed on one or more servers within the data center. The application executes entirely on the server while its interface is displayed on the user’s device.

Recently there has been a similar movement to virtualize the desktop. As defined in Wikipedia 25, desktop virtualization involves separating the physical location where the PC desktop resides from where the user is accessing the PC. A remotely accessed PC is typically located at home, at the office or in a data center, while the user is located elsewhere, perhaps traveling, in a hotel room, at an airport or in a different city. The desktop virtualization approach can be contrasted with a traditional local PC desktop, where the user directly accesses the desktop operating

25 http://en.wikipedia.org/wiki/Desktop_Virtualization
system and all of its peripherals using the local keyboard, mouse and video monitor hardware directly. When a desktop is virtualized, its keyboard, mouse and video display (among other things) are typically redirected across a network via a desktop protocol such as RDP, ICA or VNC. The network connection carrying this virtualized desktop information is known as a “desktop access session.” One of the more common ways to virtualize desktops is through a model that is typically referred to as the shared desktop model. In this model, a multi-user server PC environment is used to host many users who all share a common PC desktop environment on a server machine.

Traffic Management and QoS

Traffic management refers to the ability of the network to provide preferential treatment to certain classes of traffic. It is required in those situations in which bandwidth is scarce and there are one or more delay-sensitive, business-critical applications. Two examples of this type of application that have been discussed previously in this handbook are VoIP and the Sales and Distribution (SD) module of SAP.

The focus of the organization’s traffic management processes must be the company’s applications, and not solely the megabytes of traffic traversing the network.

To ensure that an application receives the required amount of bandwidth, or alternatively does not receive too much bandwidth, the traffic management solution must have application awareness. This often means detailed Layer 7 knowledge of the application because, as discussed in Chapter 7, many applications share the same port, or even hop between ports.

Another important factor in traffic management is the ability to effectively control inbound and outbound traffic. Queuing mechanisms, which form the basis of traditional Quality of Service (QoS) functionality, control bandwidth leaving the network, but do not address traffic coming into the network, where the bottleneck usually occurs. Technologies like TCP Rate Control tell the remote servers how fast they can send content, providing true bi-directional management.

Some of the key steps in a traffic management process include:

Discovering the Application
Application discovery must occur at Layer 7. In particular, information gathered at Layer 4 or lower allows a network manager to assign a lower priority to Web traffic than to other WAN traffic. Without information gathered at Layer 7, however, network managers are not able to manage the company’s applications to the degree that allows them to assign a higher priority to some Web traffic than to other Web traffic.

Profiling the Application
Once the application has been discovered, it is necessary to determine the key characteristics of that application.

Quantifying the Impact of the Application
Since many applications share the same WAN physical or virtual circuit, these applications will tend to interfere with each other. In this step of the process, the degree to which a given application interferes with other applications is identified.

Assigning Appropriate Bandwidth
Once the organization has determined the bandwidth requirements and identified the degree to which a given application interferes with other applications, it may assign bandwidth to an application. In some cases, it will do this to ensure that the application performs well. In other cases,
it will do this primarily to ensure that the application does not interfere with the performance of other applications.

Due to the dynamic nature of the network and application environment, it is highly desirable to have the bandwidth assignment performed dynamically in real time, as opposed to using pre-assigned static metrics. In some solutions, it is possible to assign bandwidth to a specific application such as SAP. For example, the IT organization might decide to allocate 256 Kbps for SAP traffic. In other solutions, it is possible to assign bandwidth to a given session. For example, the IT organization could decide to allocate 50 Kbps to each SAP session. The advantage of the latter approach is that it frees the IT organization from having to know how many simultaneous sessions will take place.

Many IT organizations implement QoS via queuing functionality found in their routers. Implementing QoS based on aggregate queues and class of service is often sufficient to prioritize applications. However, when those queues get oversubscribed (e.g., the voice queue), degradation can occur across all connections. As a result, an “access control” or “per call” QoS is sometimes required to establish acceptable quality.

Other IT organizations implement QoS by deploying MPLS-based services. The use of MPLS services is one more factor driving the need for IT organizations to understand the applications that transit the network. This is the case because virtually all carriers have a pricing structure for MPLS services that includes a cost for the access circuit and the port speed. Most carriers also charge for a variety of advanced services, including network-based firewalls and IP multicast. In addition, carriers often charge for the CoS (class of service) profile. While most carriers offer between five and eight service classes, for simplicity assume that a carrier only offers two service classes. One of the service classes is referred to as real time and is intended for applications such as voice and video. The other is called best effort and is intended for any traffic that is not placed in the real-time service class. The CoS profile refers to how the capacity of the service is distributed over these two service classes. In most cases, assigning all of the traffic to the real-time class would cost more than a 50-50 split in which half the traffic is assigned to real time and half to best effort. Analogously, a 50-50 split would cost more than assigning all of the traffic to best effort. Hence, assigning more capacity to the real-time class than is necessary will increase the cost of the service. However, since most carriers drop any traffic that exceeds what the real-time traffic class was configured to support, assigning less capacity to this class than is needed will likely result in poor voice quality. Some solutions provide QoS mechanisms to independently prioritize packets based on traffic class and latency sensitivity, thereby avoiding the problem of over-provisioning WAN bandwidth in an effort to ensure high performance.

Status of Quality of Service Implementation

This section of the Handbook identifies the current and planned deployment of QoS, the decision-making process involved, and the issues that must be considered in the implementation and ongoing auditing and managing of QoS functionality.

Motivation for Deploying QoS

Survey Respondents were asked to indicate whether or not they have implemented a QoS policy to prioritize traffic. Their responses are contained in Table 9.1.

Response | Percentage
Yes, we’ve implemented QoS | 61.9%
No, but we’re planning to within the next 12 months | 21.2%
No, and have no plans to implement it at this time | 14.4%
Don’t know | 2.5%

Table 9.1: QoS Deployment
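The two-class carrier model described earlier (a policed real-time class plus a best-effort class) can be sketched in a few lines of Python. This is an illustrative toy model, not code from any carrier or product: the class name and parameters are assumptions, and the policer models the behavior the text describes, in which real-time traffic exceeding the configured capacity is dropped rather than queued, while real-time packets that are admitted are always served before best-effort packets.

```python
from collections import deque

class TwoClassScheduler:
    """Toy model of a two-class CoS profile: a policed real-time class
    served at strict priority over a best-effort class."""

    def __init__(self, realtime_bps):
        # Byte budget for the real-time class (one refill interval shown).
        self.rt_tokens = realtime_bps / 8.0
        self.rt_queue = deque()
        self.be_queue = deque()
        self.dropped_rt = 0

    def enqueue(self, size_bytes, realtime=False):
        if realtime:
            # Carriers typically drop real-time traffic that exceeds the
            # capacity configured for the class rather than queue it.
            if self.rt_tokens >= size_bytes:
                self.rt_tokens -= size_bytes
                self.rt_queue.append(size_bytes)
            else:
                self.dropped_rt += 1
        else:
            self.be_queue.append(size_bytes)

    def dequeue(self):
        # Strict priority: a waiting real-time packet always goes first.
        if self.rt_queue:
            return ("real-time", self.rt_queue.popleft())
        if self.be_queue:
            return ("best-effort", self.be_queue.popleft())
        return None
```

Under-provisioning the real-time class shows up immediately as drops (`dropped_rt` grows), which is the cost/quality trade-off the CoS profile discussion describes.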
The percentage of IT organizations that either have implemented QoS or plan to do so within 12 months is just over 83%. It is, of course, unlikely that all of those planning to implement actually will within the indicated timeframe, due to anything from a budget or headcount freeze to a change in priorities. Nonetheless, it seems likely that somewhere on the order of 70% of IT organizations will have deployed some form of QoS within the next year.

The majority of IT organizations have already deployed QoS, and at least 70% will have done so within the next 12 months.

Survey Respondents were also asked to indicate the primary reason why they had implemented QoS. Their responses are contained in Table 9.2.

Reason | Percentage
To support VoIP | 35.8%
To support latency sensitive applications other than VoIP | 14.0%
As part of a WAN services offering; e.g., MPLS | 14.0%
To support VoIP/other latency sensitive applications on an MPLS WAN | 17.7%
In response to performance issues we were experiencing | 15.4%
Don’t know | 0.9%
Other | 2.3%

Table 9.2: Reasons for Implementing QoS

The Medical Manager stated that his organization had deployed QoS on all of their WAN routers, and he pointed out that since they have not implemented VoIP, they have not deployed QoS on their LAN routers. The Medical Manager added that they deployed QoS primarily to support Citrix-based applications running over an MPLS network.

The Pharmaceutical Consultant stated that to date his organization has deployed QoS on all of their roughly 250 WAN routers. He stated that initially their deployment of QoS was driven by the need to support a variety of Citrix Presentation Server-based applications as well as their use of video over IP. He added, however, that they now also use QoS on a tactical basis; that is, they include the possibility of using QoS as part of their problem-solving portfolio when confronting an issue such as a poorly performing application or a congested link. They are also starting to use QoS to support their ongoing deployment of VoIP. Because of that, they are in the process of extending QoS functionality to their LAN routers.

While VoIP is a major driver of QoS deployment, so are other latency-sensitive applications such as Citrix Presentation Server.

The Mechanics of QoS Deployments

Establishing Priorities

Survey Respondents were asked to indicate how they either did or will decide which applications are given priority. Their responses are contained in Table 9.3.

Decision Process | Percentage
Network team made the decision | 67.4%
Discussed with the application group | 47.5%
Discussed with business managers | 43.9%
Recommendation from third party | 14.9%
Don’t know | 4.1%
Other | 6.8%

Table 9.3: Decision Process for Prioritization

Although the primary responsibility for setting QoS priorities rests with the network groups, in most IT organizations the decision is discussed with other affected groups.

The Survey Respondents were also asked to indicate the degree of granularity of their QoS policy. Their responses, which are depicted in Figure 9.1, show the percentage of respondents who have implemented the indicated number of QoS categories.

The Medical Manager stated that they decided which applications were given priority after discussions with the application group and business managers. His organization implemented four classes of service, and placed Citrix Presentation Server, SQL and some network management
traffic in the highest service class. They placed internal email in the next highest service class, followed by Web traffic in the next service class, and finally external email in the best effort service class.

[Figure 9.1: Granularity of QoS Implementations — a bar chart showing the percentage of respondents implementing 2 through 8 or more QoS categories.]

The Pharmaceutical Consultant indicated that in addition to discussing application prioritization with the application group, his organization also used recommendations from a third party as part of their deployment of QoS. In particular, he stated that they followed the Cisco guidelines for QoS deployment. Those guidelines specify eight classes of service and indicate which traffic types are assigned to each class. Some of the recommended traffic types are fairly obvious, such as voice, video and traffic that is both mission-critical and transactional. Some of the traffic types, however, are somewhat less obvious. This includes network management traffic, as well as routing information.

Most organizations use four or fewer classes of service.

Planning and On-Going Management

One of the key tasks that many IT organizations perform prior to implementing QoS is to baseline the network to determine the traffic characteristics. They use this information in order to intelligently assign QoS categories. Both The Medical Manager and The Pharmaceutical Consultant stated that they baselined their network prior to implementing QoS. The Medical Manager stated that this provided his organization with information that they used in the discussions they had with the application group and business managers relative to identifying which types of traffic should have priority. The Pharmaceutical Consultant stated that the fact that they baselined their network prior to implementing QoS enabled them to tweak their initial plans and have a more successful deployment.

While important, few IT organizations establish response time objectives for priority applications prior to deployment. The Medical Manager stated that his organization did not do this, and The Pharmaceutical Consultant stated that his organization did, but just for video.

Both The Medical Manager and The Pharmaceutical Consultant stated that their organizations perform ongoing monitoring of their QoS implementation. The Medical Manager stated that they regularly check router reports to make sure that their QoS policy is being enforced properly and that the Citrix Presentation Server traffic is receiving the network resources that it needs. The Pharmaceutical Consultant said that for selected applications they provide ongoing monitoring of the application’s response time. They do this in part to monitor their QoS deployment. They also do this to identify other issues, such as a poorly performing server, that can impact application performance.
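The baselining and ongoing response-time monitoring described above can be sketched simply: collect samples during a baseline period, then flag later samples that deviate markedly from that baseline. The class name, tolerance value and sample data below are illustrative assumptions, not values from the handbook.

```python
import statistics

class ResponseTimeBaseline:
    """Baseline an application's response time, then flag samples that
    deviate markedly from the baseline (a way to catch issues such as a
    poorly performing server before users complain)."""

    def __init__(self, tolerance=2.0):
        # Flag samples more than `tolerance` standard deviations above the mean.
        self.tolerance = tolerance
        self.samples = []

    def record_baseline(self, response_ms):
        self.samples.append(response_ms)

    def is_degraded(self, response_ms):
        mean = statistics.mean(self.samples)
        stdev = statistics.pstdev(self.samples)
        return response_ms > mean + self.tolerance * stdev
```

For example, after baselining samples of roughly 90–110 ms, a 400 ms sample would be flagged while a 102 ms sample would not. A production monitor would typically use rolling windows and percentiles rather than a single mean, but the structure is the same.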
Baselining of pre-QoS performance, and ongoing monitoring of QoS implementations, are important tools for guaranteeing success.

Next Generation WAN Firewall

Current Generation Firewalls

The first generation of firewalls was referred to as packet filters. These devices functioned by inspecting packets to see if the packet matched the packet filter’s set of rules. Packet filters acted on each individual packet (i.e., the 5-tuple consisting of the source and destination addresses, the protocol and the port numbers) and did not pay any attention to whether or not a packet was part of an existing stream or flow of traffic.

Today most firewalls are based on stateful inspection. According to Wikipedia26, “A stateful firewall is able to hold in memory significant attributes of each connection, from start to finish. These attributes, which are collectively known as the state of the connection, may include such details as the IP addresses and ports involved in the connection and the sequence numbers of the packets traversing the connection. The most CPU intensive checking is performed at the time of setup of the connection. All packets after that (for that session) are processed rapidly because it is simple and fast to determine whether it belongs to an existing, pre-screened session. Once the session has ended, its entry in the state-table is discarded.”

One reason that traditional firewalls focus on the packet header is that firewall platforms generally have limited processing capacity due to architectures that are based on software running on an industry standard CPU. A recent enhancement of the current generation firewall has been the addition of some limited forms of application level attack protection. For example, some current generation firewalls have been augmented with IPS/IDS functionality that uses deep packet inspection to screen suspicious-looking traffic for attack signatures or viruses. However, limitations in the processing power of current generation firewalls prevent deep packet inspection from being applied to more than a small minority of the packets traversing the device.

The Use of Well-Known Ports, Registered Ports, and Dynamic Ports

Chapter 7 pointed out that the ports numbered from 0 to 1023 are reserved for privileged system-level services and are designated as well-known ports. As a reminder, a well-known port serves as a contact point for a client to access a particular service over the network. For example, port 80 is the well-known port for HTTP data exchange, and port 443 is the well-known port for secure HTTP exchanges via HTTPS. Port numbers in the range 1024 to 49151 are reserved for Registered Ports that are statically assigned to user-level applications and processes. For example, SIP uses ports 5060-5061. A number of applications do not use static port assignments, but select a port dynamically as part of the session initiation process. Port numbers between 49152 and 65535 are reserved for Dynamic Ports, which are sometimes referred to as Private Ports.

One of the primary reasons that stateful inspection was added to traditional firewalls was to track the sessions of whitelist applications that use dynamic ports. The firewall observes the dynamically selected port number, opens the required port at the beginning of the session, and then closes the port at the end of the session.

Most current generation firewalls make two fundamental assumptions, both of which are flawed. The first assumption is that the information contained in the first packet in a connection is sufficient to identify the application and the functions being performed by the application. In many cases, it takes a number of packets to make this identification because the application end points can negotiate a change in port number or perform a range of functions over a single connection.

26 http://en.wikipedia.org/wiki/Stateful_firewall
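The stateful behavior described above — recording each connection by its 5-tuple and opening a “pinhole” for a dynamically negotiated port, then closing it when the session ends — can be sketched as follows. The class and method names are illustrative and are not taken from any firewall product.

```python
class StatefulFirewall:
    """Toy stateful inspection: a packet is allowed if it belongs to a
    session already in the state table, or if its destination port matches
    a static rule (in which case the session is recorded)."""

    def __init__(self, allowed_ports):
        self.allowed_ports = set(allowed_ports)  # e.g., well-known ports
        self.state = set()                       # sessions keyed by 5-tuple

    def _key(self, src, sport, dst, dport, proto):
        return (src, sport, dst, dport, proto)

    def allow(self, src, sport, dst, dport, proto="tcp"):
        key = self._key(src, sport, dst, dport, proto)
        if key in self.state:
            return True            # fast path: pre-screened session
        if dport in self.allowed_ports:
            self.state.add(key)    # connection setup: record the session
            return True
        return False

    def open_pinhole(self, src, sport, dst, dport, proto="tcp"):
        # Called when a whitelisted application negotiates a dynamic port.
        self.state.add(self._key(src, sport, dst, dport, proto))

    def close_session(self, src, sport, dst, dport, proto="tcp"):
        self.state.discard(self._key(src, sport, dst, dport, proto))
```

The sketch also makes the chapter’s criticism concrete: `allow` classifies purely on the port number in the first packet, so any application willing to use port 80 or a negotiated pinhole passes unexamined.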
The second assumption is that the TCP and UDP well-known and registered port numbers are always used as specified by IANA. Unfortunately, while that may well have been the case twenty years ago, it is often not the case today. As pointed out in Chapter 7, some applications have been designed with the ability to hop between ports. In a recent survey, 82% of the survey respondents indicated that they were concerned about the fact that traditional firewalls focus on well-known ports and hence are not able to distinguish the various types of traffic that transit port 80.

Another blind spot of current generation firewalls is HTTP traffic that is secured with SSL (HTTPS). HTTPS is normally assigned to well-known TCP port 443. Since the payload of these packets is encrypted with SSL, the traditional firewall cannot use deep packet inspection to determine if the traffic either poses a threat or violates enterprise policies for network usage. These two blind spots are growing in importance because they are being exploited with increasing frequency by application-based intrusions and policy violations.

Enterprise Requirements

The Global Architect stated that they found themselves in the situation where they had a security policy and no ability to enforce it, as they could not be sure of what applications were being used. In particular, they house their Web servers inside their DMZ so they can control the applications that run on those servers. As such, they were not concerned about managing the inbound connection to their Web servers. In contrast, they do not know what applications are running on their other servers, and so they do not know what outbound traffic is being generated. Part of the concern of The Global Architect is that if they were running programs such as BitTorrent, they were vulnerable to being charged with breaking copyright laws. In addition, if they were supporting recreational applications such as Internet radio, they were wasting a lot of WAN bandwidth. He summed up his feelings by saying, “You think that you are in a secure environment. However, at the end of the day a lot of applications that were declared as outlaws are still running on your network.”

Asked about the limitations of traditional firewalls, The Senior Director said that traditional firewalls do not provide any application layer filtering, so if you are attacked above Layer 3 “you are toast”. The Global Architect said that in theory an IT organization could mitigate the limitations of a traditional firewall by combining it with other security-related functionality such as an IPS. However, he stressed that this is only a theory, because the IT organization would never have enough knowledge of the applications to make this work. This point was picked up on by The Senior Director, who stated that his organization had been looking at adding other security functionality such as IDS, IPS and NAC. What he wanted, however, was to avoid the complexity of having a large number of security appliances. He preferred to have a “firewall on steroids” provide all this functionality.

A Next Generation Firewall

The comments of The Global Architect and The Senior Director serve to underscore some of the unnatural networking that has occurred over the last decade. In particular, firewalls are typically placed at a point where all WAN access for a given site coalesces. This is the logical place for a policy and security control point for the WAN. Unfortunately, due to the lack of a “firewall on steroids” that could provide the necessary security functionality, IT organizations have resorted to implementing myriad firewall helpers27.

It is understandable that IT organizations have deployed work-arounds to attempt to make up for the limitations of traditional firewalls. This approach, however, has serious limitations, including the fact that the firewall helpers often do not see all of the traffic, and the deployment of multiple

27 Now Might Be a Good Time to Fire Your Firewall, http://ziffdavisitlink.leveragesoftware.com/blog_post_view.aspx?BlogPostID=603398f2b87548ef9d51d35744dcdda4
security appliances significantly drives up operational costs and complexity.

In order for the firewall to avoid these limitations and reestablish itself as the logical policy and security control point for the WAN, what is needed is a next generation firewall with the following attributes:

Application Identification

The firewall must be able to use deep packet inspection to look beyond the IP header 5-tuple into the payload of the packet to find application identifiers. Since there is no standard way of identifying applications, there needs to be an extensive library of application signatures that includes identifiers for all commonly used enterprise applications, recreational applications, and Internet applications. The library needs to be easily extensible to include signatures of new applications and custom applications. Application identification will eliminate the port 80 blind spot and allow the tracking of port-hopping applications.

Extended Stateful Inspection

By tracking application sessions beyond the point where dynamic ports are selected, the firewall will have the ability to support the detection of application-level anomalies that signify intrusions or policy violations.

SSL Decryption/Re-encryption

The firewall will need the ability to decrypt SSL-encrypted payloads to look for application identifiers/signatures. Once this inspection is performed and policies applied, allowed traffic would be re-encrypted before being forwarded to its destination. SSL proxy functionality, together with application identification, will eliminate the port 443 blind spot.

Control

Traditional firewalls work on a simple deny/allow model. In this model, everyone can access an application that is deemed to be good. Analogously, nobody can access an application that is deemed to be bad. This model had more validity at a time when applications were monolithic in design and before the Internet made a wide variety of applications available. Today’s reality is that an application that might be bad for one organization might well be good for another. On an even more granular level, an application that might be bad for one part of an organization might be good for other parts of the organization. Also, given today’s complex applications, a component of an application might be bad for one part of an organization while that same component might well be good for other parts of the organization.

What is needed, then, is not a simple deny/allow model, but a model that allows IT organizations to set granular levels of control to allow the good aspects of an application to be accessed by the appropriate employees while blocking all access to the bad aspects of an application.

Multi-gigabit Throughput

In order to be deployed in-line as an internal firewall on the LAN or as an Internet firewall for high speed access lines, the next generation firewall will need to perform the above functions at multi-gigabit speeds. These high speeds will be needed to prevent early obsolescence as the LAN migrates to 10 GbE aggregation and core bandwidths, and as Internet access rates move to 1 Gbps and beyond via Metro Ethernet. Application identification and SSL processing at these speeds require a firewall architecture that is based on special-purpose programmable hardware rather than on industry standard general-purpose processors. Firewall programmability continues to grow in importance, with the number of new vulnerabilities cataloged by CERT hovering in the vicinity of 8,000 per year.

When asked about the attributes that he expects in a next generation firewall, The Global Architect said that the ability to learn about applications on the fly was a requirement, as was the need to run at multi-gigabit speed. Critical to The Global Architect is the ability to tie an event to a user. To exemplify that he said, “If somebody is communicating using BitTorrent, my ability to tie that application
to a user is critical. I can do that with a traditional firewall, but it is a management nightmare.”

The Senior Director agreed on the importance of application level visibility and high performance. He also stressed the importance of reporting and alerting when he said, “I need the ability to push security-related information from engineering to the help desk. The next generation firewall must not be so complicated that the average help desk analyst cannot input a rule set. It must also be simple enough for the people at the help desk to be able to use it to analyze what is going on.”

Conclusion

For the foreseeable future, the importance of application delivery is much more likely to increase than it is to decrease. Analogously, for the foreseeable future the impact of the factors that make application delivery difficult, many of which sections 4 and 7 discussed, is much more likely to increase than it is to decrease. To deal with these two forces, IT organizations need to develop a systematic approach to application delivery. Given the complexity associated with application delivery, this approach cannot focus on just one component of the task, such as network and application optimization. To be successful, IT organizations must implement an approach to application delivery that integrates the key components of planning, network and application optimization, management and control.

This handbook identified a number of conclusions that IT organizations can use when formulating their approaches to ensuring acceptable application delivery. Those conclusions are:

• The complexity associated with application delivery will increase over the next few years.
• If you work in IT, you either develop applications or you deliver applications.
• Senior IT management needs to ensure that their organization evolves to where it looks at application delivery holistically and not just as an increasing number of stove-piped functions.
• Successful application delivery requires the integration of tools and processes.
• Given the breadth and extent of the input from both IT organizations and leading edge vendors, this handbook represents a broad consensus on a framework that IT organizations can use to improve application delivery.
• In the vast majority of instances when a key business application is degrading, the end user, not the IT organization, first notices the degradation.
• In situations in which the end user is typically the first to notice application degradation, IT ends up looking like bumbling idiots.
• The current approach to managing application performance reduces the confidence that the company has in the IT organization.
• A goal of this handbook is to help IT organizations develop the ability to minimize the occurrence of application performance issues and to both identify and quickly resolve issues when they occur.
• Application delivery is more complex than just network and application acceleration.
• Application delivery needs to have a top-down approach, with a focus on application performance.
• Successful application delivery requires the integration of planning, network and application optimization, management and control.
• Companies that want to be successful with application delivery must understand their current and emerging application environments.
• In the majority of cases, there is at most a moderate emphasis during the design and development of an application on how well that application will run over a WAN.
• A relatively small increase in network delay can result in a very significant increase in application delay.
• Successful application delivery requires that IT organizations identify the applications running on the network and ensure the acceptable performance of the applications that are relevant to the business while controlling or eliminating irrelevant applications.
• Every component of an application-delivery solution has to be able to support the company’s traffic patterns, whether they are one-to-many, many-to-many or some-to-many.
• The webification of applications introduces chatty protocols into the network. In addition, some of these protocols (e.g., XML) tend to greatly increase the amount of data that transits the network and is processed by the servers.
• While server consolidation produces many benefits, it can also produce some significant performance issues.
• One effect of data-center consolidation and single hosting is additional WAN latency for remote users.
• In the vast majority of situations, when people access an application they are accessing it over the WAN.
• To be successful, application delivery solutions must function in a highly dynamic environment. This drives the need for both the dynamic setting of parameters and automation.
• Only 14% of IT organizations claim to have aligned application delivery with application development. Eight percent (8%) of IT organizations state they plan and holistically fund IT initiatives across all of the IT disciplines. Twelve percent (12%) of IT organizations state that troubleshooting IT operational issues occurs cooperatively across all IT disciplines.
• People use the CYA approach to application delivery to show it is not their fault that the application is performing badly. In contrast, the goal of the CIO approach is to identify and then fix the problem.
• Just as WAN performance impacts n-tier applications more than monolithic applications, WAN performance impacts Web services-based applications significantly more than it impacts n-tier applications.
• Many IT professionals view the phrase Web 2.0 as either just marketing hype that is devoid of any meaning, or they associate it exclusively with social networking sites such as MySpace.
• Emerging application architectures (SOA, RIA, Web 2.0) have already begun to impact IT organizations, and this impact will increase over the next year.
• In addition to a services focus, Web 2.0 characteristics include featuring content that is dynamic, rich and, in many cases, user created.
• The existing generation of network and application optimization solutions does not deal with a key requirement of Web 2.0 applications – the need to massively scale server performance.
• It is extremely difficult to make effective network and application-design decisions if the IT organization does not have well-understood and adhered-to targets for application performance.
• Hope is not a strategy. Successful application delivery requires careful planning coupled with extensive
measurements and effective proactive and reactive processes.
• The vast majority of IT organizations see significant value from a tool that can be used to test application performance throughout the application lifecycle.
• IT organizations will not be regarded as successful if they do not have the capability both to develop applications that run well over the WAN and to plan for changes such as data center consolidation and the deployment of VoIP.
• In the vast majority of cases, a tool that is unduly complex is of no use to an IT organization.
• The application-delivery function needs to be involved early in the applications development cycle.
• A primary way to balance the requirements and capabilities of the application development and the application-delivery functions is to create an effective architecture that integrates those two functions.
• IT organizations need to modify their baselining activities to focus directly on delay.
• Organizations should baseline their network by measuring 100% of the actual traffic from real users.
• To deploy the appropriate network and application optimization solution, IT organizations need to understand the problem they are trying to solve.
• In order to understand the performance gains of any network and application optimization solution, organizations must test that solution in an environment that closely reflects the environment in which it will be deployed.
• IT organizations often start with a tactical deployment of WOCs and expand this deployment over time.
• When choosing a network and application optimization solution, it is important to ensure that the solution can scale to provide additional functionality over what is initially required.
• Small amounts of packet loss can significantly reduce the maximum throughput of a single TCP session.
• With 1% packet loss and a round-trip time of 50 ms or greater, the maximum throughput is roughly 3 megabits per second no matter how large the WAN link is.
• The deployment of WAN Optimization Controllers will increase significantly.
• An AFE provides more sophisticated functionality than an SLB does.
• IT organizations will not be successful with application delivery as long as the end user, and not the IT organization, first notices application degradation.
• When an application experiences degradation, virtually any component of IT could be the source of the problem.
• To be successful with application delivery, IT organizations need tools and processes that can identify the root cause of application degradation and that are accepted as valid by the entire IT organization.
• Identifying the root cause of application degradation is significantly more difficult than identifying the root cause of a network outage.
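The packet-loss bullets above reflect the well-known Mathis et al. approximation for the steady-state throughput of a single TCP session: rate ≤ (MSS/RTT) × (C/√p). A minimal sketch of that arithmetic, assuming a typical 1460-byte MSS and the conventional constant C ≈ 1.22 (both illustrative, not stated in the handbook):

```python
import math

def tcp_max_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float,
                           c: float = 1.22) -> float:
    """Mathis approximation: upper bound on single-session TCP throughput
    in bits per second, independent of the link's raw capacity."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

# The handbook's scenario: 1% loss, 50 ms round-trip time.
rate = tcp_max_throughput_bps(mss_bytes=1460, rtt_s=0.050, loss_rate=0.01)
print(f"{rate / 1e6:.1f} Mbit/s")  # prints "2.8 Mbit/s"
```

The bound is roughly 3 Mbit/s regardless of link size, which is why adding WAN bandwidth alone does not help a lossy, long-RTT session.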
A Guide to Decision Making
85
• Organizational discord and ineffective processes are at least as much of an impediment to the successful management of application performance as are technology and tools.
• Lack of visibility into the traffic that transits port 80 is a major vulnerability for IT organizations.
• End-to-end visibility refers to the ability of the IT organization to examine every component of IT that impacts the communications, from the moment users hit ENTER or click the mouse until they receive responses from an application.
• To enable cross-functional collaboration, it must be possible to view all relevant management data from one place.
• Application management should focus directly on the application and not just on factors that have the potential to influence application performance.
• Most IT organizations ignore the majority of their performance alarms.
• Logical factors are almost as frequent a source of application performance and availability issues as are device-specific factors.
• In over a quarter of organizations, the NOC does not meet the organization’s current needs.
• In the majority of cases, the NOC tends to work on a reactive basis, identifying a problem only after it impacts end users.
• Just under half of NOCs are organized around functional silos.
• A majority of NOCs use many management tools that are not well integrated.
• NOC personnel spend the greatest amount of time on applications, and that is a relatively new phenomenon.
• NOC personnel spend an appreciable amount of their time supporting a broad range of IT functionality.
• The NOC is almost as likely to monitor performance as it is to monitor availability.
• There is a lot of interest in ITIL, but it is too soon to determine how impactful the use of ITIL will be.
• In the vast majority of instances, the assumption is that the network is the source of application degradation.
• In the majority of instances, the NOC gets involved in problem resolution.
• The top driver of change in the NOC is the requirement to place greater emphasis on ensuring acceptable performance for key applications.
• The lack of management vision and the NOC’s existing processes are almost as big a barrier to change as are the lack of personnel resources and funding.
• Given where NOC personnel spend their time, the NOC should be renamed the Applications Operations Center.
• IT organizations take a wide range of approaches to MTTR.
• Improving processes (including training and the development of cross-domain responsibility) is an important part of improving the MTTR of application performance.
• The network is usually not the source of application performance degradation, although that is still the default assumption within most IT organizations.
• Choosing and implementing the proper application performance management solution can greatly
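The baselining and alarm bullets above can be made concrete with a small sketch: alarm on deviation from a rolling baseline of measured response times (delay), rather than on a static threshold that operators learn to ignore. The class name, window size, and three-sigma rule here are illustrative assumptions, not any specific product's behavior.

```python
from collections import deque
from statistics import mean, stdev

class DelayBaseline:
    """Flag response times that deviate from a rolling baseline."""

    def __init__(self, window: int = 100, sigmas: float = 3.0):
        self.samples = deque(maxlen=window)  # recent response times (seconds)
        self.sigmas = sigmas                 # how far from normal is abnormal

    def observe(self, response_time: float) -> bool:
        """Record a measurement; return True if it breaches the baseline."""
        alarm = False
        if len(self.samples) >= 30:          # need some history before judging
            mu, sd = mean(self.samples), stdev(self.samples)
            alarm = response_time > mu + self.sigmas * max(sd, 1e-6)
        self.samples.append(response_time)
        return alarm

baseline = DelayBaseline()
for t in [0.20, 0.22, 0.21] * 20:            # normal WAN response times
    baseline.observe(t)
print(baseline.observe(0.90))                # a 900 ms outlier -> True
```

Because the threshold adapts to what users actually experience, the alarm tracks delay directly instead of a proxy such as link utilization.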
reduce MTTR and improve cooperation between different IT teams.
• Reducing MTTR requires both credible tools and an awareness of and attention to technical and non-technical factors. In many instances it can be as much a political process as a technological one.
• The vast majority of IT organizations must cost-justify an investment in performance management.
• No one approach to cost-justifying performance management works in all environments, nor all the time in a single environment.
• The cost of downtime varies widely between companies and can also vary widely within a given company.
• The focus of the organization’s traffic management processes must be the company’s applications, and not merely the megabytes of traffic traversing the network.
• The majority of IT organizations have already deployed QoS, and at least 70% will have done so within the next 12 months.
• While VoIP is a major driver of QoS deployments, so are other latency-sensitive applications such as Citrix Presentation Server.
• Although the primary responsibility for setting QoS priorities rests with the network groups, in most IT organizations the decision is discussed with the other affected groups.
• Most organizations use four or fewer classes of service.
• Baselining of pre-QoS performance and ongoing monitoring of QoS implementations are important tools for guaranteeing success.
Bibliography
Articles by Jim Metzler
Newsletters written for Network World
The impact that small amounts of packet loss has on successful transmission
http://www.networkworld.com/newsletters/frame/2007/1217wan2.html
Breaking the data replication bottleneck
http://www.networkworld.com/newsletters/frame/2007/1217wan1.html
MPLS vs. WAN optimization, Part 2
http://www.networkworld.com/newsletters/frame/2007/1105wan1.html
MPLS vs. WAN optimization, Part 1
http://www.networkworld.com/newsletters/frame/2007/1029wan2.html
Relationship between the network and apps development teams not bad but not great
http://www.networkworld.com/newsletters/frame/2007/1001wan2.html
The AHA’s IT group benefits from working collaboratively
http://www.networkworld.com/newsletters/frame/2007/0924wan1.html
How one network exec persuaded coworkers that the network is not always to blame
http://www.networkworld.com/newsletters/frame/2007/0917wan2.html
How do you scan for what’s on port 80?
http://www.networkworld.com/newsletters/frame/2007/0917wan1.html
Would combining network and security operations reduce the negative impact of silos?
http://www.networkworld.com/newsletters/frame/2007/0910wan2.html
The addition of new technologies means IT becomes increasingly siloed
http://www.networkworld.com/newsletters/frame/2007/09103wan1.html
Ignore the port 80 black hole at your peril
http://www.networkworld.com/newsletters/frame/2007/0903wan2.html
The schism between the applications team and the rest of IT
http://www.networkworld.com/newsletters/frame/2007/0827wan1.html
How the network team can show business value to the company
http://www.networkworld.com/newsletters/frame/2007/0820wan2.html
A real-life example of something gone wrong in senior management
http://www.networkworld.com/newsletters/frame/2007/0820wan1.html
Cisco, NetQoS move a step closer to integrated network optimization, management
http://www.networkworld.com/newsletters/frame/2007/0730wan2.html
How optimizing the business network could help you optimize your career http://www.networkworld.com/newsletters/frame/2007/0730wan1.html
Senior management: Don’t manage each IT component in isolation
http://www.networkworld.com/newsletters/frame/2007/0716wan1.html
How to fix application performance issues: Organize an IT pow-wow
http://www.networkworld.com/newsletters/frame/2007/0709wan2.html
Using ITIL for better WAN management
http://www.networkworld.com/newsletters/frame/2007/0326wan1.html
What’s the goal of route analytics in the WAN?
http://www.networkworld.com/newsletters/frame/2007/0305wan2.html
Logical sources of performance and availability issues
The WAN and the wiki generation, Part 3
When apps are slow, net managers are wrong until proven right http://www.networkworld.com/newsletters/frame/2007/0702wan1.html
Don’t assume application performance problems are always network-related
http://www.networkworld.com/newsletters/frame/2007/0625wan1.html
What makes application management so hard to do?
http://www.networkworld.com/newsletters/frame/2007/0618wan2.html
NOCs now in charge of application management
http://www.networkworld.com/newsletters/frame/2007/0618wan1.html
Zero chance of interoperability between optimization technologies
http://www.networkworld.com/newsletters/frame/2007/0528wan1.html
Are you in a state of denial over the need for network, application optimization?
http://www.networkworld.com/newsletters/frame/2007/0521wan1.html
The benefits of managed service for applications performance
http://www.networkworld.com/newsletters/frame/2007/0507wan1.html
Microsoft SharePoint could be a challenge for WAN optimization
http://www.networkworld.com/newsletters/frame/2007/0430wan2.html
Juniper’s take on network optimization
http://www.networkworld.com/newsletters/frame/2007/0430wan1.html
The three components of application delivery according to Cisco http://www.networkworld.com/newsletters/frame/2007/0423wan2.html
How does your company handle recreational use of Internet resources?
http://www.networkworld.com/newsletters/frame/2007/0305wan1.html http://www.networkworld.com/newsletters/frame/2007/0226wan1.html
The WAN and the wiki generation, Part 2
http://www.networkworld.com/newsletters/frame/2007/0219wan2.html
The WAN and the wiki generation, Part 1
http://www.networkworld.com/newsletters/frame/2007/0219wan1.html
Survey: Organizations’ application delivery processes are ineffective http://www.networkworld.com/newsletters/frame/2007/0212wan2.html
Users formalizing processes to manage application performance
http://www.networkworld.com/newsletters/frame/2007/0212wan1.html
Successful application delivery
http://www.networkworld.com/newsletters/frame/2007/0205wan1.html
Benchmarking for WAN-vicious apps, Part 2
http://www.networkworld.com/newsletters/frame/2007/0129wan1.html
Benchmarking for WAN-vicious apps, Part 1
http://www.networkworld.com/newsletters/frame/2007/0122wan1.html
When YouTube presents true business value
http://www.networkworld.com/newsletters/frame/2007/0115wan1.html
Cisco analyst conference mulls super fast pipes vs. highly functional nets http://www.networkworld.com/newsletters/frame/2007/0108wan2.html
Does your WAN benefit your business’ business?
http://www.networkworld.com/newsletters/frame/2007/0108wan1.html
Morphing tactical solutions to becoming strategic ones
http://www.networkworld.com/newsletters/frame/2006/1218wan1.html
The benefits of thinking strategic when deploying network optimization http://www.networkworld.com/newsletters/frame/2006/1211wan2.html
http://www.networkworld.com/newsletters/frame/2007/0423wan1.html
Identify and then test
http://www.networkworld.com/newsletters/frame/2006/1127wan2.html
Which is most important to organizations - service delivery or service support?
http://www.networkworld.com/newsletters/frame/2007/0326wan2.html
WAN optimization tips
http://www.networkworld.com/newsletters/frame/2006/1127wan1.html
Insight from the road
http://www.networkworld.com/newsletters/frame/2006/1120wan1.html
What is termed network misuse in one company may not be so in another http://www.networkworld.com/newsletters/frame/2006/1113wan1.html
Naïve users who hog (or bring down) the network
http://www.networkworld.com/newsletters/frame/2006/1106wan2.html
Network managers reveal extent of network misuse on their nets
http://www.networkworld.com/newsletters/frame/2006/1106wan1.html
Network managers plugged into the importance of application delivery http://www.networkworld.com/newsletters/frame/2006/1023wan1.html
Cisco vs. Microsoft: The battle over the branch office, unified communications, and collaboration http://www.networkworld.com/newsletters/frame/2006/0925wan2.html
Cisco gets serious about application delivery
http://www.networkworld.com/newsletters/frame/2006/0925wan1.html
Application Acceleration that Focuses on the Application, Part 1 http://www.networkworld.com/newsletters/frame/2006/0821wan1.html
Application Acceleration that Focuses on the Application, Part 2 http://www.networkworld.com/newsletters/frame/2006/0821wan2.html
CIOs don’t take enough notice of application delivery issues http://www.networkworld.com/newsletters/frame/2006/0807wan2.html
WAN-vicious apps are a net manager’s worst nightmare http://www.networkworld.com/newsletters/frame/2006/0807wan1.html
Who in your company first notices when apps performance starts to degrade? http://www.networkworld.com/newsletters/frame/2006/0731wan2.html
When applications perform badly, is the CYA approach good enough?
http://www.networkworld.com/newsletters/frame/2006/0529wan2.html
Making sure the apps your senior managers care about work well over the WAN
http://www.networkworld.com/newsletters/frame/2006/0403wan1.html
Where best to implement network and application acceleration, Part 1 http://www.networkworld.com/newsletters/frame/2006/0327wan1.html
Where best to implement network and application acceleration, Part 2 http://www.networkworld.com/newsletters/frame/2006/0327wan2.html
A new convergence form brings together security and application acceleration
http://www.networkworld.com/newsletters/frame/2006/0227wan2.html
What makes for a next-generation application performance product? http://www.networkworld.com/newsletters/frame/2006/0220wan1.html
The limitations of today’s app acceleration products
http://www.networkworld.com/newsletters/frame/2006/0213wan2.html
What slows down app performance over WANs?
http://www.networkworld.com/newsletters/frame/2006/0213wan1.html
Automating application acceleration
http://www.networkworld.com/newsletters/frame/2006/0130wan2.html
Advancing the move to WAN management automation
http://www.networkworld.com/newsletters/frame/2006/0130wan1.html
Application benchmarking helps you to determine how apps will perform
http://www.networkworld.com/newsletters/frame/2006/0403wan2.html
How do you feel about one-box solutions?
http://www.networkworld.com/newsletters/frame/2005/1212wan1.html
Users don’t want WAN optimization tools that are complex to manage http://www.networkworld.com/newsletters/frame/2005/1121wan1.html
QoS, visibility and reporting are hot optimization techniques, users say
http://www.networkworld.com/newsletters/frame/2005/1114wan2.html
Survey finds users are becoming proactive with WAN mgmt.
http://www.networkworld.com/newsletters/frame/2005/1114wan1.html
Microsoft attempts to address CIFS’ limitations in R2
http://www.networkworld.com/newsletters/frame/2005/1107wan2.html
WAFS could answer CIFS’ limitations
http://www.networkworld.com/newsletters/frame/2005/1107wan1.html
Disgruntled users and the centralized data center
http://www.networkworld.com/newsletters/frame/2005/1031wan2.html
WAFS attempts to soothe the problems of running popular apps over WANs http://www.networkworld.com/newsletters/frame/2005/1031wan1.html
WAN optimization helps speed up data replication for global benefits firm http://www.networkworld.com/newsletters/frame/2005/1017wan1.html
Application accelerators take on various problems
http://www.networkworld.com/newsletters/frame/2005/0829wan1.html
The gap between networks and applications lingers
http://www.networkworld.com/newsletters/frame/2005/0815wan2.html
Is Cisco AON the new-age message broker?
http://www.networkworld.com/newsletters/frame/2005/0718wan1.html
Controlling TCP congestion
http://www.networkworld.com/newsletters/frame/2005/0704wan1.html
How TCP ensures smooth end-to-end performance
http://www.networkworld.com/newsletters/frame/2005/0627wan2.html
Mechanisms that directly influence network throughput
http://www.networkworld.com/newsletters/frame/2005/0627wan1.html
Increase bandwidth by controlling network misuse
http://www.networkworld.com/newsletters/frame/2005/0620wan1.html
Cisco’s FineGround buy signals big change in the WAN optimization sector
http://www.networkworld.com/newsletters/frame/2005/0530wan1.html
The thorny problem of supporting delay-sensitive Web services
http://www.networkworld.com/newsletters/frame/2005/0516wan2.html
Uncovering the sources of WAN connectivity delays
http://www.networkworld.com/newsletters/frame/2005/0502wan2.html
Why adding bandwidth does nothing to improve application performance http://www.networkworld.com/newsletters/frame/2005/0502wan1.html
Organizations are deploying MPLS and queuing for QoS, survey finds http://www.networkworld.com/newsletters/frame/2005/0425wan1.html
The trick of assigning network priority to application suites http://www.networkworld.com/newsletters/frame/2005/0418wan2.html
TCP acceleration and spoofing acknowledgements
http://www.networkworld.com/newsletters/frame/2005/0321wan2.html
How TCP acceleration could be used for WAN optimization http://www.networkworld.com/newsletters/frame/2005/0321wan1.html
Antidote for ‘chatty’ protocols: WAFS
http://www.networkworld.com/newsletters/frame/2005/0314wan1.html
How are you optimizing your branch-office WAN?
http://www.networkworld.com/newsletters/frame/2005/0307wan1.html
What the next generation Web services mean to your WAN
http://www.networkworld.com/newsletters/frame/2005/0221wan2.html
Bandwidth vs. management: A careful balancing act
http://www.networkworld.com/newsletters/frame/2005/0214wan2.html
Kubernan Briefs
The Data Replication Bottleneck: Overcoming Out of Order and Lost Packets across the WAN
http://www.webtorials.com/main/resource/papers/kubernan/brief-1-4.htm
The Integration of Management, Planning and Network Optimization
http://www.webtorials.com/main/resource/papers/kubernan/brief-1-3.htm
The Road to Successful Application Delivery
http://www.webtorials.com/main/resource/papers/kubernan/brief-1-2.htm
The Performance Management Mandate
http://www.webtorials.com/main/resource/papers/kubernan/brief-1-1.htm
Route Analytics: Poised to Cross the Chasm
http://www.webtorials.com/main/resource/papers/packetdesign/paper10.htm
The Business Value of Effective Infrastructure Management
http://www.webtorials.com/main/resource/papers/flukenetworks/paper19.htm
Eliminating The Roadblocks to Effectively Managing Application Performance
http://www.webtorials.com/main/resource/papers/kubernan/paper7.htm
Closing the WAN Intelligence Gap
http://www.webtorials.com/main/resource/papers/kubernan/ref1.htm
Taking Control of Secure Application Delivery
http://www.webtorials.com/main/resource/papers/kubernan/ref2.htm
Proactive WAN Application Optimization – A Reality Check
Analyzing the Conventional Wisdom of Network Management Industry Trends
http://www.webtorials.com/main/resource/papers/kubernan/ref4.htm
IT Impact Briefs
The Value of Performance Management
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-10-07.htm
Demonstrating the Value of Performance Management
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-09-07.htm
The Port 80 Black Hole
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-08-07.htm
The Hows and the Whys of Quality of Service
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-07-07.htm
Does IT Provide Business Value?
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-06-07.htm
Rethinking MTTR
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-04-07.htm
Showing The Value of Network Management
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-03-07.htm
Management and Application Delivery
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-02-07.htm
WAN Vicious Applications
The Logical Causes of Application Degradation
The Movement to Implement ITIL
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-01-07.htm
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-11-06.htm
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-10-06.htm
Business Process Redesign
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-09-06.htm
Network Misuse Revisited
http://www.webtorials.com/main/gold/netscout/it-impact/briefs.htm
Supporting Server Consolidation Takes More than WAFS
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-07-06.htm
Moving Past Static Performance Alarms
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-06-06.htm
The Cost and Management Challenges of MPLS Services
http://www.webtorials.com/main/resource/papers/netscout/briefs/brief-05-06.htm
The Movement to Deploy MPLS
http://www.webtorials.com/main/resource/papers/netscout/briefs/04-06/NetScout_iib_Metzler_0406_Deploy_MPLS.pdf
Managing VoIP Deployments
http://www.webtorials.com/main/resource/papers/netscout/briefs/03-06/NetScout_iib_Metzler_0306_Managing_VoIP_Deployments.pdf
Network and Application Performance Alarms: What’s Really Going On?
http://www.webtorials.com/main/resource/papers/netscout/briefs/02-06/ NetScout_iib_Metzler_0206_Network_Application_Alarms.pdf
Netflow – Gaining Application Awareness
http://www.webtorials.com/main/resource/papers/netscout/briefs/01-06/ NetScout_iib_Metzler_0106_NetFlow_Application_Awareness.pdf
Management Issues in a Web Services Environment
http://www.webtorials.com/main/resource/papers/netscout/briefs/11-05/ Management_Web_Services.pdf
The Movement to Deploy Web Services
http://www.webtorials.com/main/resource/papers/netscout/briefs/10-05/NetScout_iib_Metzler_1005_Web_Services_Deployment_SOA.pdf
The Rapidly Evolving Data Center
http://www.webtorials.com/main/resource/papers/netscout/briefs/09-05/Data_Center.pdf
What’s Driving IT?
http://www.webtorials.com/main/resource/papers/netscout/briefs/08-05/Whats_Driving_IT.pdf
The Lack of Alignment in IT
http://www.webtorials.com/main/resource/papers/netscout/briefs/07-05/Lack_of_Alignment_in_IT.pdf
It’s the Application, Stupid
http://www.webtorials.com/main/resource/papers/netscout/briefs/0605/0605Application.pdf
Identifying Network Misuse
http://www.webtorials.com/main/resource/papers/netscout/briefs/05-05/0505_Network_Misuse.pdf
Why Performance Management Matters
http://www.webtorials.com/abstracts/Why%20Performance%20Management%20Matters.htm
Crafting SLAs for Private IP Services
http://www.webtorials.com/abstracts/Crafting%20SLAs%20for%20Private%20IP%20Services.htm
The Successful Deployment of VoIP
http://www.webtorials.com/abstracts/The%20Successful%20Deployment%20of%20VoIP.htm
The Challenges of Managing in a Web Services Environment
http://www.webtorials.com/abstracts/NetworkPhysics6.htm
Buyers Guide: Application Delivery Solutions
http://www.webtorials.com/abstracts/SilverPeak3.htm
Best Security Practices for a Converged Environment
http://www.webtorials.com/abstracts/Avaya27.htm
White Papers
Innovation in MPLS-Based Services
www.kubernan.com
The Mandate to Implement Unified Performance Management
www.kubernan.com
The Three Components of Optimizing WAN Bandwidth
www.kubernan.com
Branch Office Networking
http://www.webtorials.com/abstracts/ITBB-BON-2004.htm
Articles Contributed by the Sponsors
Cisco
Overview of Cisco Application Delivery Network solutions
www.cisco.com/go/ans
http://www.cisco.com/en/US/netsol/ns340/ns394/ns224/ns377/networking_solutions_package.html
Solutions Portals for Cisco Application Delivery Products
www.cisco.com/go/waas
www.cisco.com/go/ace
Cisco Data Center Assurance Program (DCAP) for Applications
www.cisco.com/go/optimizemyapp
www.cisco.com/go/oracle
www.cisco.com/go/microsoft
NetScout
Network Performance Management Buyers Guide
http://www.netscout.com/library/buyersguide/default.asp?cpid=metzler-hbk
Guide for VoIP Performance Management in Converged Networks
http://www.netscout.com/library/whitepapers/voip_best_practices.asp?cpid=metzler-hbk
Cutting through complexity of monitoring MPLS Networks
http://www.netscout.com/library/whitepapers/MPLS_multi_protocol_label_switching.asp?cpid=metzler-hbk
Streamlining Network Troubleshooting - Monitoring and Continuous Packet Capture Provides Top-Down Approach to Lower MTTR
http://www.netscout.com/redirect_pdf/appnote_network_troubleshooting.asp?cpid=metzler-hbk
A Unique Approach for Reducing MTTR and Network Troubleshooting Time
http://www.netscout.com/redirect_pdf/appnote_tech_reducing_mttr. asp?cpid=metzler-hbk
NetQoS
Network Performance Management News and Analysis
http://www.networkperformancedaily.com
VoIP Quality of Experience
http://www.netqos.com/resourceroom/whitepapers/forms/voip_quality_experience.asp
Seven Ways to Improve Network Troubleshooting
http://www.netqos.com/resourceroom/whitepapers/forms/retrospective_ network_analysis.asp
Best Practices for Monitoring Business Transactions
Packeteer Gaining Visibility into Application and Network Behavior http://www.packeteer.com/resources/prod-sol/VisibilityDrillDown.pdf
Controlling WAN Bandwidth and Application Traffic (QOS Technologies): http://www.packeteer.com/resources/prod-sol/ControlDrillDown.pdf
Acceleration Technologies Overview
http://www.packeteer.com/resources/prod-sol/iSharedArchitectureWP.pdf
Intelligent LifeCycle for Delivering High Performance Networked Applications Over the WAN
http://www.packeteer.com/resources/prod-sol/Intelligent_LifeCycle_ Introduction.pdf
Akamai
http://www.netqos.com/resourceroom/whitepapers/forms/monitoring_business_transactions.asp
Akamai Web Application Accelerator
Network Diagnostics Provide Requisite Visibility for Managing Network Performance
IP Application Accelerator
http://www.akamai.com/waa_brochure http://www.akamai.com/ipa_brochure
http://www.netqos.com/resourceroom/whitepapers/forms/network_diagnostics.asp
Application Acceleration: Merits of Managed Services
Managing the Performance of Converged VoIP and Data Applications
Application Delivery – An Enhanced Internet Based Solution
http://www.akamai.com/managed_services
http://www.netqos.com/resourceroom/whitepapers/forms/voip.asp
http://www.akamai.com/ipa_whitepaper
Performance-First: Performance-Based Network Management Keeps Organizations Functioning at Optimum Levels
Shunra
http://www.netqos.com/resourceroom/whitepapers/forms/performancefirst.asp
6 Reasons to Allocate Network Costs Based on Usage http://www.netqos.com/solutions/allocate/whitepaper/allocate.html
Best Practices for NetFlow/IPFIX Analysis and Reporting
Testing Secure Enterprise SOA Applications across the WAN without Leaving the Lab http://www.shunra.com//whitepapers.aspx?whitepaperId=21
Predicting the Impact of Data Center Moves on Application Performance
http://www.netqos.com/resourceroom/whitepapers/forms/netflownew.asp
http://www.shunra.com//whitepapers.aspx?whitepaperId=10
MPLS Network Performance Assurance: Validating Your Carrier’s Service Level Claims
How to Optimize Application QoE Before Rollout
http://www.netqos.com/resourceroom/whitepapers/forms/trustverify.asp
It’s not the Network! Oh yeah? Prove IT!
http://www.shunra.com//whitepapers.aspx?whitepaperId=1
The Virtual Enterprise: Eliminating the Risk of Delivering Distributed IT Services
http://www.netqos.com/resourceroom/whitepapers/forms/proveit.asp
http://www.shunra.com//whitepapers.aspx?whitepaperId=3
Improve Networked Application Performance Through SLAs
NetPriva
http://www.netqos.com/resourceroom/whitepapers/forms/improvedSLAs.asp
Solving Application Response Time Problems - Metrics that Matter http://www.netqos.com/resourceroom/whitepapers/forms/metrics.asp
WAN Control for Microsoft Windows
http://www.netpriva.com/content/view/256/437/
End Point QoS Protection for Citrix
http://www.netpriva.com/content/view/124/344/
End Point QoS for Cisco WAAS
http://www.netpriva.com/content/view/259/435/
Strangeloop
http://www.strangeloopnetworks.com/products/
ASP.NET Performance resources, including the Optimizing ASP.NET applications for Performance and Scalability whitepaper and The Coming Web 2.0 train wreck – Jim Metzler
http://www.strangeloopnetworks.com/products/resources/
AS1000 Appliance Datasheet
http://www.strangeloopnetworks.com/files/PDF/products/Strangeloop_AS1000_datasheet.pdf
Network World: Optimization - Specialist works to accelerate ASP.Net apps
http://www.networkworld.com/newsletters/accel/2007/1105netop1.html
Orange
What is Business acceleration?
http://www.orange-business.com/en/mnc/campaign/business_acceleration/index.html
Ensuring your communications infrastructure becomes efficient and effective
http://www.mnc.orange-business.com/content/pdf/OBS/library/white_papers/wp_ensuring_your_comms_infrastructure.pdf
Business Acceleration provides insights and tools to enhance your applications’
http://www.mnc.orange-business.com/content/pdf/OBS/library/brochure/broch_business_accmnc.pdf
A suite of services to improve visibility, management and performance of applications
http://www.orange-business.com/mnc/press/press_releases/2007/070307_busi_acc.html
The Guoman and Thistle hotel group - Orange Business Services case study
http://www.mnc.orange-business.com/content/pdf/OBS/library/case_studies/cs_Thistle_CST-015(2)PVA_Single.pdf
Riverbed
IDC applies their ROI methodology to our WDS solution in “Adding Business Value with Wide-Area Data Services.”
http://www.riverbed.com/lg/case_study_idc.php?FillCampaignId=70170000000HMDM&FillLeadSource=Banner+Ad&FillSourceDetail=TEMPLATE_IDC+ROI+Case+Study&mtcCampaign=3721&mtcPromotion=%3E%3E
What are the key benefits of deploying a mobile WAN optimization solution? Find out in this new Forrester study “Optimizing Users and Applications in a Mobile World.”
http://www.riverbed.com/lg/white_paper_forrester.php?FillCampaignId=70170000000HOvK&FillLeadSource=Web&FillSourceDetail=&mtcCampaign=3721&mtcPromotion=%3E%3E
Network World Names Riverbed Customer as Enterprise All-Star Award Winner
http://www.riverbed.com/news/press_releases/press_112607.php
After six months of headaches with Cisco’s WAAS solution, 360 Architecture turns to Riverbed for a successful deployment.
http://www.byteandswitch.com/document.asp?doc_id=133898&page_number=2
Why Riverbed?
http://www.riverbed.com/solutions/
Palo Alto Networks
PA-4000 Series Firewalls
http://www.paloaltonetworks.com/products/pa4000.html
App-ID Classification Technology
http://www.paloaltonetworks.com/technology/appid.html
Panorama Centralized Management
http://www.paloaltonetworks.com/products/panorama.html
Packet Design
IP Route Analytics: A New Foundation for Modern Network Operations
http://www.packetdesign.com/documents/IP%20Route%20Analytics%20White%20Paper.pdf
Network-Wide IP Routing and Traffic Analysis: An Introduction to Traffic Explorer
http://www.packetdesign.com/documents/Tex-WPv1.0.Pdf
Easing Data Center Migration with Traffic Explorer
http://www.packetdesign.com/documents/Easing_Data_Center_Migration_with_Traffic_Explorer.pdf
Regaining MPLS VPN WAN Visibility with Route Analytics
http://www.packetdesign.com/documents/Regaining%20MPLS%20WAN%20Visibility.pdf
Route Analysis for Converged Networks: Filling the Layer 3 Gap in VoIP Management
http://www.packetdesign.com/documents/Route-Analysis-VoIP-v1.0.pdf
Ipanema
Maximizing Application Performance
http://www.ipanematech.com/New/DocUpload/Docupload/mc-ipasolution_overview_en_070716_USFORMAT.pdf
Acceleration: Bottlenecks, pitfalls and tips
http://www.ipanematech.com/New/DocUpload/Docupload/mc_wp_Acceleration_en_070604.pdf
The need for application SLAs
http://www.ipanematech.com/New/DocUpload/Docupload/mc_wp_ApplicationSLAs_en_070328.pdf
Register to attend our free WEBCAST with GARTNER: “WAN Optimization: from a tactical to a strategic approach”
http://www.ipanematech.com/emailing/gartner/Gartner-Ipanema-webcastNovember-2007.htm
Citrix
How Citrix’s NetScaler accelerates enterprise applications
www.citrix.com/hqfastapps
Citrix’s NetScaler
www.citrix.com/netscaler
Citrix’s WANScaler
www.citrix.com/wanscaler
Interviewees
In order to gain additional insight into application delivery from the perspective of IT organizations, a number of IT professionals were interviewed. The following table lists each interviewee's job title, the industry they work in, and how they are referred to in this report.

Job Title | Industry | Reference
COO | Electronic Records Management Company | The Electronics COO
Chief Architect | Entertainment | The Motion Picture Architect
CIO | Diverse Industrial | The Industrial CIO
Network Engineer | Automotive | The Automotive Network Engineer
Global Network Architect | Consulting | The Consulting Architect
Enterprise Architect | Application Service Provider (ASP) | The ASP Architect
Team Leader, Network Architecture, Strategy and Performance | Energy | The Team Leader
CIO | Engineering | The Engineering CIO
CEO | Mobile Software | The Mobile Software CEO
CTO | Business Intelligence | The Business Intelligence CTO
Director of IT Services | Insurance | The IT Services Director
CIO | Government | The Government CIO
Manager | Telecommunications | The Telecommunications Manager
Network Engineer | Financial | The Financial Engineer
Senior MIS Specialist | Manufacturing | The Manufacturing Specialist
Network Architect | Global Semiconductor Company | The Global Architect
Senior Director of IT | Medical | The Senior Director
Network Management Systems Manager | Conglomerate | The Management Systems Manager
CIO | Medical Supplies | The Medical Supplies CIO
Network Analyst | Manufacturing | The Manufacturing Analyst
Manager of Network Management and Security | Non-Profit | The Management and Security Manager
Director of Common Services | Education | The Education Director
Business Consultant | Storage | The Consultant
Manager of Network Services | Hosting | The Manager
Senior Consultant | Pharmaceutical | The Pharmaceutical Consultant
Manager of Network Administration | Medical | The Medical Manager
Appendix - Advertorials
Enabling Application Delivery Networks
Using Cisco's Application Networking Services (ANS) and application-fluent foundation networks to better optimize and enhance business applications
Overview
The Application Networking Services (ANS) product portfolio, which consists of the Wide Area Application Services (WAAS), the Application Control Engine (ACE) and the ACE XML Gateway, is designed for enterprise, mid-market and service provider IT organizations that need to optimize and deliver business applications, such as ERP, CRM, websites/portals, and web services, across the organization. The Cisco ANS portfolio, in conjunction with the application fluency embedded in Cisco's switching and routing foundation products, creates a true end-to-end Application Delivery Network: a set of network-wide, integrated solutions that provide the availability, security, acceleration and visibility needed to ensure applications are successfully delivered. Application delivery networks from Cisco help IT departments accomplish the following objectives:
• Deliver business and web applications to any user, anywhere, with high performance
• Centralize branch server and storage resources without compromising performance
• Better utilize existing resources, including WAN bandwidth and server processing cycles
• Rapidly deploy new applications over an integrated infrastructure
Attributes of Cisco Application Delivery Networks
Cisco has built the ANS portfolio to be deployed as part of the data center and branch infrastructure. This gives Cisco the unique ability to deliver on the following attributes, which are critical to an application delivery network's success:
Adaptability
Cisco has integrated application delivery functionality directly into the core switching and routing infrastructure, allowing customers to rapidly adapt to changing application types, traffic patterns and volume across the network. While taking advantage of the foundation elements and integration benefits, the Cisco ANS solution incrementally delivers:
• Virtualization – an often used but misunderstood term. Cisco has created secure, virtual partitions in its switching, routing, and application delivery controller products. In the Application Control Engine (ACE), for example, an IT manager can very rapidly partition off a new application on the same device; the partition remains completely separate (on the data, control and management planes) and, more importantly, secure from other partitions.
• License-based performance – many Cisco ANS products can grow in performance simply through the purchase of a software license that can easily and rapidly be installed to support additional demand, without the need to upgrade or deploy new devices.
• Scalable solutions – for the highest possible scalability, the ANS products support multi-device, tested and documented solutions. Examples include the ability to add up to four active ACE modules within the same Catalyst chassis, or to scale a WAAS design through its integration with ACE.
• Transparent solutions – network services, such as security, can be added (or removed) without major overhauls or reconfiguration of the current network. This is crucially important as services such as voice are added to the existing data network.
Extensibility
The network system underlying the application must be designed to easily allow the addition of new capabilities into the overall system. Tying application delivery functionality into various places in the network ensures the most optimal application delivery and the best user experience. Let's take an example:
• Application recognition – the first step toward optimizing the delivery of an application is to recognize it. Using both IP and application recognition mechanisms integrated into the Catalyst 6500 or the Cisco routers, the network can classify, then optimize or reprioritize the application throughout the network.
• Server scaling – in order to better scale the ability of the application to deliver email, the Application Control Engine can be used not only to provide server load balancing, but also "last-line-of-defense" security. Client sessions can now be scaled across multiple servers. Additionally, by offloading functions such as SSL, URL filtering and parsing, and TCP into the network, the server can be better utilized, ensuring rapid delivery of email traffic.
• WAN optimization – the user at the branch office will be limited by the amount of bandwidth available on the WAN. Using the Wide Area Application Services (WAAS) solution, the network can optimize the delivery of that traffic over the WAN using innovative TCP optimization, compression and data redundancy elimination. This provides a solution for a complete range of TCP-based applications with the simple addition of policies.
These demonstrate how broad the solution needs to be, but what about how "deep"? Cisco's solutions contain customizable elements that allow the network to respond to new threats, functionalities and capabilities as they become available. The Cisco ACE XML Gateway, for example, supports programmable plug-ins, so that new functionality can be developed and integrated into the network. For example, Cisco offers plug-ins to exchange authentication messages with identity management systems or detect dangerous packets through the RegEx capabilities in ACE.

Integration
These capabilities are all critical to the delivery of applications from the data center to users; however, the mechanism by which they're delivered is important as well. The creation of an overlay network – for server load balancing or WAN optimization, for instance – adds complexity, increases management burdens, and adds to the already overcrowded data center. Additionally, an overlay network does not provide any hooks into the existing network infrastructure that can make application delivery more effective. These include:
• Integration of application intelligence into the foundation, such as with Network Based Application Recognition in the Catalyst 6500 Supervisor 32 engine.
• Integration of application switching and firewalling into the Catalyst 6500 through the Application Control Engine (ACE) and Firewall Services Module (FWSM).
• Integration of the WAAS solution, which, while it can be purchased in an appliance form factor, is also integrated into the Cisco Integrated Services Routers (ISR) as well as into Cisco IOS software.
• Using ACE virtualization, in conjunction with virtual LANs (VLANs) on the Catalyst 6500, a secure, virtually partitioned network can be created for a new application on a common platform, without having to purchase a new device.
• The Catalyst 6500 can detect local physical connectivity failure and can inform the ACE module to perform a failover to a secondary device.
• ACE can map a virtual partition to a virtual routing and forwarding (VRF) instance, extending a "private" network through to the data center.
• WAAS can be inserted transparently into the existing network, meaning that no changes to quality-of-service policies or security access control lists are needed.
• The WAAS transparency elements allow the network to deliver valuable statistics through NetFlow to network management and monitoring tools.
How Cisco’s ANS Portfolio Enables the Application Delivery Network By integrating into the Cisco data center, branch and campus foundation, the Cisco ANS portfolio delivers against the key challenges for application owners, IT management and end users. These challenges include security, availability, acceleration and visibility.
Security
Cisco delivers the strongest end-to-end security in order to protect hosts, applications and network elements. The Cisco network can adapt to all attack types, from network-level Layer 2 attacks to application-level attacks, including XML attacks. Cisco delivers integrated security, including embedded firewall and intrusion protection services on the Cisco ISR routers as well as a high-speed firewall in the Catalyst 6500. Security features in the ACE module provide the last line of defense prior to reaching the physical server itself, as well as application-level security both for client-to-application access and for server-to-server communication in multi-tier designs. Finally, the ACE XML Gateway (AXG) can be leveraged to protect XML-based applications from newer attacks.
Availability
Availability is table stakes in application delivery networking. Cisco has designed its application delivery solution to adapt to all types of failures, recovering from and/or working around the failure, whether it's a networking, server or application component failure. For example, the ACE Global Site Selector ensures users can be re-routed to a secondary data center should the primary become congested or fail. Within the data center, the ACE module works with the Catalyst 6500 to ensure route health in the network and reroute around failures, while also supporting active-active stateful resiliency between ACE modules to ensure application sessions remain intact.
Acceleration
Cisco delivers the most complete set of application acceleration features in order to provide performance improvements for all clients and applications. The network adapts to all types of applications, such as generic TCP-based applications, with ACE for load balancing and server offload, and WAAS for WAN acceleration, which can yield up to 100x end-user performance improvements. In conjunction with specific application optimization engines for protocols like CIFS, WAAS can add further benefits specific to each application. Conversely, Cisco can optimize specific HTTP-based sessions with ACE web acceleration features. Additionally, Cisco is working with application vendors, such as Microsoft, to better optimize specific applications, such as Exchange, over the WAN and data center networks.
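The data redundancy elimination idea behind WAN optimizers of this kind can be illustrated with a generic content-chunking sketch. This is a minimal, hypothetical illustration of the technique, not Cisco's actual WAAS algorithm: both ends of the link keep a cache of previously seen chunks, and repeated chunks are replaced by short fingerprints on the wire.

```python
import hashlib

def deduplicate(data: bytes, cache: dict, chunk_size: int = 64):
    """Replace chunks the peer has already seen with short references.

    Returns the token stream to send and the count of payload bytes saved.
    """
    tokens, saved = [], 0
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        sig = hashlib.sha1(chunk).digest()
        if sig in cache:
            tokens.append(("ref", sig))      # send a 20-byte reference instead of the chunk
            saved += len(chunk) - len(sig)
        else:
            cache[sig] = chunk               # both ends learn the new chunk
            tokens.append(("raw", chunk))
    return tokens, saved

cache = {}
payload = b"GET /report.pdf " * 8             # highly repetitive traffic
first, saved1 = deduplicate(payload, cache)   # first transfer: repeats within the payload dedup
second, saved2 = deduplicate(payload, cache)  # repeat transfer: every chunk is already cached
print(saved1, saved2)                         # → 44 88
```

Real implementations use variable-size, content-defined chunks and disk-backed caches, but the payoff is the same: the second transfer of the same content costs only a stream of references.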
Visibility Visibility is an often-overlooked element in the deployment of an application delivery network, yet it is often the most critical as it is how network and application managers monitor performance and end user productivity. Through integration of application delivery functions into the network, Cisco can provide the most complete visibility solution today. Starting with application recognition in the Catalyst 6500 and ISR routers, network managers can understand (and limit) the traffic coming into the network. NetFlow statistics can be reported and monitored at all places in the network. Transparency throughout the network assures a uniform measurement, including across the WAN (where many WAN optimization solutions change IP header information). Finally, through partners such as NetQoS, Cisco can deliver end-to-end performance visibility to truly gauge how applications are operating over the application delivery network, and what end users are experiencing.
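The NetFlow-style visibility described above reduces, at its core, to aggregating packets into flow records keyed by the classic five-tuple. A minimal sketch of that aggregation (hypothetical packet data, not Cisco's implementation):

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Aggregate packets into NetFlow-style records keyed by the 5-tuple."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["size"]
    return dict(flows)

# Hypothetical packet stream: two packets of one HTTP session and one DNS query
packets = [
    {"src": "10.0.0.5", "dst": "10.1.0.9", "sport": 43211, "dport": 80, "proto": "tcp", "size": 1500},
    {"src": "10.0.0.5", "dst": "10.1.0.9", "sport": 43211, "dport": 80, "proto": "tcp", "size": 400},
    {"src": "10.0.0.5", "dst": "10.0.0.2", "sport": 5353, "dport": 53, "proto": "udp", "size": 72},
]
flows = aggregate_flows(packets)
for key, stats in sorted(flows.items()):
    print(key, stats)
```

A flow exporter on a router does exactly this in hardware or firmware, then periodically exports the records to a collector; the management tools mentioned in this section consume those exports.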
Conclusion The network has played, and will continue to play, an ever-increasing role in delivering IT services to users, enabling a business to increase efficiency and deliver results. By combining the application-fluent network foundation – Cisco’s industry-leading switches and routers – and the Application Networking Services capabilities, Cisco provides the most complete, most manageable, and most integrated application delivery network solution available today.
www.cisco.com
Take the Lead on
WAN Application Delivery by Putting Performance First
IT organizations have spent billions of dollars implementing fault management tools and processes to maximize network availability. While availability management is critical, infrastructure reliability has improved to a point where 99.9% availability is not uncommon. At the same time, application performance issues are growing dramatically due to trends such as data center consolidation, the rise of voice and multimedia traffic, and growing numbers of remote users. Relying solely on infrastructure availability and utilization is no longer enough to address these challenges, especially as network professionals are becoming increasingly responsible for application delivery across the network infrastructure. According to Aberdeen Group’s 2007 research report, The Real Value of Network Visibility, “Network performance projects are no longer cost centers; they are becoming the major components of enterprise strategies for better customer service, profitability, and revenue growth.”
The performance-first paradigm inverts the traditional device monitoring approach and begins with top-down visibility into application performance. This is driven by the fundamental purpose of the network—to transport data from one end of the system to the other as rapidly as possible. The more efficiently data flows at the transport layer, the better the application performance. Application response time is the best measurement to use when deciding how to optimize the network, plan new infrastructure rollouts and upgrades, and identify the severity and pervasiveness of problems. A performance-first approach starts with measuring end-to-end response times to get an overall view, and expands to other key performance metrics as needed, including VoIP quality of experience, traffic flow analysis, device performance, and long-term packet capture.
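One way this kind of end-to-end response time measurement is commonly decomposed is to split total transaction time into server, transfer, and network components using packet timestamps. A minimal sketch with hypothetical numbers (not NetQoS's actual method):

```python
from dataclasses import dataclass

@dataclass
class TransactionTiming:
    """Timestamps (seconds) observed for one request/response, e.g. from a packet capture."""
    request_sent: float   # client sends last byte of the request
    first_byte: float     # client sees first byte of the response
    last_byte: float      # client sees last byte of the response
    round_trip: float     # network round-trip time, e.g. from the TCP handshake
    response_bytes: int
    link_bps: float       # access-link bandwidth

def decompose(t: TransactionTiming) -> dict:
    """Split end-to-end response time into server, transfer, and network components."""
    total = t.last_byte - t.request_sent
    server = (t.first_byte - t.request_sent) - t.round_trip  # time spent processing at the server
    transfer = t.response_bytes * 8 / t.link_bps             # ideal serialization time for the payload
    network = total - server - transfer                      # latency, queuing, retransmissions
    return {"total": total, "server": server, "transfer": transfer, "network": max(network, 0.0)}

# Hypothetical transaction: a 1 MB response over a 10 Mbit/s link with 80 ms RTT
timing = TransactionTiming(0.0, 0.25, 1.20, 0.08, 1_000_000, 10_000_000)
parts = decompose(timing)
print({k: round(v, 3) for k, v in parts.items()})
```

The point of the decomposition is triage: a large server component argues for application or server tuning, while a large network component argues for WAN optimization or capacity changes.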
The Performance-first Imperative
Network engineers and managers must take the lead on application delivery with a performance-first approach to managing their complex networks. By shifting the focus from fault management—which is largely under control—to performance-based management, network professionals can concentrate on how the network is affecting service delivery and make themselves more relevant to the business units they serve. "Performance is the thousand shades of gray between those red and green lights indicating whether devices are up or down. IT professionals must switch from an up/down model of network management to a performance-first model to quantify end user experience," said Joel Trammell, CEO of Austin, Texas-based NetQoS, Inc., a provider of network performance management software and services to Global 4000 companies, service providers and government organizations.
Performance-first management starts with measuring application response times and drilling into other metrics as needed, including VoIP quality, traffic flow analysis, device performance, and long-term packet capture.
Application Response Time Monitoring:
Understanding application response time baselines is the essential starting point for making strategic decisions about network performance and application delivery. The deconstruction of total end-user transaction time allows network professionals to tie end-user performance back to the IT infrastructure and analyze the behavior of networks, servers, and applications. For instance, application response time monitoring enables network professionals to decide where WAN optimization and application acceleration technologies are most needed and to measure the before-and-after impact.

VoIP Quality of Experience:
With the introduction of VoIP applications on enterprise networks, a performance-first approach must include monitoring call quality and the impact of convergence on all application performance. Network-based call setup and call quality monitoring enable the network team to track the user quality of experience and isolate performance issues to speed troubleshooting.

Network Traffic Analysis:
With end-to-end performance metrics captured and the source of latency isolated, further analysis is much more focused. Traffic analysis enables network engineers to understand the composition of traffic on specific links where latency is higher than normal or expected. This yields the information needed to redirect or reprioritize application traffic and do capacity planning. In addition, visibility into new or anomalous traffic patterns pinpoints performance problems and identifies security risks.

Device Performance Management:
Managing network infrastructure, devices, and services is also a critical component of the performance-first approach, for both short-term troubleshooting and long-term planning. For instance, if end-to-end performance monitoring shows the source of latency is isolated to an infrastructure component—a busy router or a server memory leak, for example—network professionals need device performance management capabilities to poll the device in question and find the root cause so that corrective action can be taken.

Long-term Packet Capture and Analysis:
When problems do occur, engineers need to view and analyze detailed packet-level information before, during, and after the problem. Once the packets are stored, the data can be analyzed for actionable information to solve problems quickly, without having to recreate the problem.

The Dashboard for Performance-first Management
Recognizing the absence of tools to support the performance-first management paradigm, NetQoS set out to fill the void. With top-down performance analysis spanning enterprise views of the end-user experience down to deep-packet inspection, the NetQoS Performance Center provides global visibility into the core metrics needed to sustain and optimize application delivery. The NetQoS Performance Center offers best-in-class modules that scale to support the world's largest and most complex networks and leverage industry-standard instrumentation without the use of desktop or server agents.

Advantages of Performance-first Management
Survey results from Aberdeen indicate that organizations taking a performance-centric approach achieve superior application and network performance and more cost-efficient operations than their peers. In fact, those organizations reported a 92 percent success rate in resolving issues with application performance before end users are impacted, compared to a 40 percent success rate for all the others. Other important benefits of performance-first network management include:

Deliver consistent application performance and measure it: Without real-time visibility into end user response times, traffic flows, and infrastructure health, it's impossible to manage application performance proactively. Too often IT managers have no way of knowing how well their organization or service provider is meeting its performance targets.
This was the case for NetQoS customer National Instruments, a leading provider of computer-based measurement, automation and embedded applications. National Instruments delivers applications to more than 4,500 employees in 40 countries via two main data centers and its global MPLS network. Prior to implementing NetQoS products, National Instruments had followed the traditional network management approach of focusing on availability of network devices. "Without NetQoS, we were flying blind," said Sohail Bhamani, senior network engineer at National Instruments. "When we would deploy a new application or make some sweeping change to the network, we would have no idea what impact it had except for availability." Faced with implementing a major Oracle upgrade worldwide, the network team knew that successful application delivery depended on their ability to measure network and application performance, not just availability. "NetQoS Performance Center gives us the insight we need into application performance across our network and combines end-to-end response time and traffic flow reports that are easy to create and use," said Bhamani.

Make more informed infrastructure investments: When infrastructure managers make uninformed upgrade decisions, the cost can be high. Often the anticipated results don't materialize, ROI is poor, and performance problems persist. London-based ICI Group, one of the world's largest producers of specialty products and paints, deployed the NetQoS Performance Center to gain visibility into network traffic from 320 locations in 46 countries. "NetQoS Performance Center management reports give us centralized visibility into traffic patterns across our global locations," said Hitesh Parmer, ICI Group's global network technology manager.
“This data enables us to justify bandwidth upgrades, reduce network costs by identifying underutilized sites, troubleshoot issues faster, and analyze the impact of new application rollouts to prioritize traffic appropriately.”
Work collaboratively and more effectively to reduce MTTR: Network managers need tools that give them real-time global visibility and historical information to optimize the network infrastructure for application performance and work with peer groups to plan for changes. Also, knowing the source of problems means the right technicians can be assigned immediately to address problems quickly. "The correlation of the end-to-end response time and traffic analysis data in a consolidated dashboard via the NetQoS Performance Center allows us to more effectively monitor issues and quickly resolve them," said Philip Potloff, Executive Director of Infrastructure Operations for Edmunds.com, the premier online resource for automotive information, based in Santa Monica, CA. "We are building custom dashboards for each of our IT groups, including network engineering, application operations, systems administration, and database administration, to have their own views into the NetQoS Performance Center data."

Columbus, Ohio-based Belron US, a multi-faceted automotive glass and claims management service organization, deployed NetQoS to deliver optimal network and application performance to more than 7,000 employees and more than two million customers across 50 states. "We saw immediate value when we deployed NetQoS products about two years ago, especially in identifying the source of problems for faster and more accurate troubleshooting," said Gary Lewis, Belron's manager of data networking and security. "We continue to rely on the products to give us the data we need for troubleshooting, capacity planning, and maintaining application service levels."
Conclusion IT organizations can no longer manage networks in isolation from the applications they support, requiring a shift from a device-centric to a performance-centric focus. By measuring how networked applications perform under normal circumstances, understanding how performance is impacted by infrastructure and application changes, and isolating the sources of above-normal latency, IT organizations can ensure problems are resolved quickly, mitigate risk, and take measured steps to optimize application performance.
A Guide to Decision Making
102
Application Delivery Handbook | february 2008
NetQoS is the only company that delivers a comprehensive network performance management suite for applying the performance-first approach. NetQoS products are used to manage large enterprise, service provider and government networks, including those of a majority of the world’s 100 largest companies. Representative customers include: AIG; Avnet; Barclays Global Investors; Bed, Bath & Beyond; Chevron; Deutsche Telekom; NASA; Schlumberger; Turner Broadcasting Systems; and Verizon.
www.netqos.com/
Reduce MTTR with a Problem Lifecycle Management Plan
Introduction
The increasing size and scope of today’s mission-critical enterprise networks mean that users are more dependent than ever on uptime and availability. When these networks operate at less-than-peak performance, lost productivity costs enterprises a fortune. The size and complexity of networked applications make identifying and resolving performance issues that much more difficult. IT organizations need both best-practices processes and comprehensive network and application performance management solutions to quickly find and fix problems.
Discovering problems earlier helps reduce MTTR
Network and application problems experience a rather predictable lifecycle: the problem originates and is minimally invasive to users and business processes. However, in many cases, the problem continues to grow into more damaging stages where it impacts end users and customers. Traditionally, once users call their help desks and report the problem, the IT organization initiates processes to identify the root cause, implement corrective actions, and verify their effectiveness. Ideally, identifying problems and diagnosing the cause in the earliest stage will reduce MTTR and minimize business interruption. NetScout’s Sniffer and nGenius solutions are designed to aid in early detection and rapid diagnosis of problems impacting networked applications, thus improving IT productivity and application delivery to users.

Problem Lifecycle Evolution
Staying ahead of such problems means developing an aggressive strategy to identify, pinpoint the root cause of, and remediate user-affecting issues. Since problems that degrade the performance of networked applications follow a typical lifecycle, troubleshooting them can also follow a standardized four-stage process:
Detect – Identify a problem as soon as possible.
Diagnose – Evaluate the problem and determine its likely causes. This evaluation is the basis for formulating and evaluating a course of corrective actions to resolve the problem.
Verify – Assess network and application performance to validate that the actions taken have resolved the initial problem.
Monitor – Ongoing performance management gains long-term visibility into networked application traffic, providing proactive detection of new problems.
The Best Practices Approach to Problem Lifecycle Management
Each step of an effective lifecycle management process for detecting and diagnosing problems and verifying corrective actions can be addressed in more detail:
Step 1: Detection
Answers the question, “Is there a problem in our network?” Often, small problems that go unnoticed or unchecked grow in size and impact over time until multiple users are affected, business is impacted, revenue is lost and customers vanish. The addition of latency-intolerant applications (e.g., VoIP and IP video) and other time-sensitive applications (e.g., multicast updates for stock prices) makes it essential that network and application problems be discovered in the earliest stages.
An emerging problem with networked applications has often first been identified by the end users experiencing it, who then call their help desk or IT staff to report it. In fact, when Ashton, Metzler & Associates and NetScout Systems surveyed more than 230 networking professionals in the spring of 2007, more than 75% of the respondents indicated that it was end-user complaints that alerted them to new problems in their network. (Chart: Who Finds Performance Problems. Note: Respondents may have chosen more than one answer. Source: Ashton, Metzler & Associates with NetScout Systems, Spring 2007 survey of 232 networking professionals.)
Ideally, however, IT organizations want a more proactive means of identifying emerging troubles. Continuous monitoring of networked application traffic flows for trending, alarming and analytical evidence will produce key performance indicators (KPIs) such as:
• Immediate notification of link or application utilization increases, to pinpoint a growing congestion problem before it impacts other traffic or end users
• Identification of new applications added to the network, to evaluate whether the capacity in key parts of the network can handle the additional load without degrading other business applications or converged services such as VoIP
• Metrics related to the quality of the user experience, such as abnormal increases in average application response times or changes in VoIP quality indicators like unacceptable jitter, packet loss or MOS scores
These KPIs help IT organizations quickly recognize an emerging problem and rapidly locate the source of network degradations before users notice quality issues. Having the actual packet flow data from your network available in real time, as well as trended in historical reports, helps to quickly ascertain whether a problem is emerging. It also directly enables all appropriate personnel to collaborate on the next phases: diagnosis and alternative corrective actions.
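The KPI-driven detection described above can be approximated with a simple statistical baseline: flag a sample that deviates from its trailing history by more than a few standard deviations. This is a generic illustration of the idea, not NetScout's actual alarming logic; the threshold multiplier and window size are assumptions.

```python
import statistics

def is_anomalous(history, current, k=3.0):
    """Flag `current` if it exceeds the trailing mean by k standard deviations.

    history -- recent samples of a KPI (e.g. application response time in ms)
    current -- the newest sample to test
    """
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    return current > mean + k * stdev

# Response times hovering around 100 ms:
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(baseline, 150))  # a 50% jump trips the alarm
print(is_anomalous(baseline, 103))  # normal variation does not
```

In practice the window would slide and the threshold would be tuned per metric, but the structure is the same: compare live samples against a learned baseline rather than a fixed number.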
Step 2: Diagnosis
Answers the question, “What is the root cause of the problem?” IT organizations start their diagnostic analysis by asking, “What are the symptoms of the network or application problem?” Following a top-down approach to troubleshooting, and leveraging the best packet-flow data available, IT staff can begin with early warning alerts, then analyze the applications contributing to those alerts, the conversations between clients and data center application servers, the volume and utilization of those applications and conversations, and the other resources vying for the same bandwidth. As the complexity of converged networks and virtualized application services makes troubleshooting an order of magnitude more challenging, contextual drill-downs to the actual packets at the center of the problem are essential. Using playback analysis to reconstruct the problem and the network conditions at the time of the problem, together with expert analysis, makes diagnosing even the most elusive of problems a more successful proposition. Analysis in real time or over time should reveal:
• The networked applications traversing the enterprise network, to uncover dangerous or unapproved traffic (e.g., viruses or streaming Internet radio) that might degrade the performance of business-critical applications
• All applications, both inbound and outbound, within assigned QoS classes, to discover possible misconfigurations that might be causing delivery issues
• Upward or downward trends in traffic flows, which allow the IT team to “right-size” bandwidth more accurately and in a more timely fashion
• Sophisticated bounce charts, decode filters, and playback analysis based on a continuous stream of recorded packets, to troubleshoot the most persistent, complex problems affecting business applications
Asking IT to run a converged network without this information is like trying to navigate a foreign city without a GPS system: there will be many guesses and wrong turns. Valuable time is lost going down the wrong path guessing at the problem and possible actions to rectify it. One enterprise feared that its mission-critical application server backups, which used multicast, were impacting voice quality and turned the backups off for a period of time, only to discover that the problem did not disappear. Proper diagnosis depends on rich application and conversation flow visibility and, when necessary, the actual packet evidence for in-depth forensic analysis.
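The drill-down from an alert to the responsible applications and conversations amounts to aggregating flow records along different keys and ranking the totals. A minimal sketch of that aggregation (the record fields here are illustrative assumptions, not a NetScout data format):

```python
from collections import defaultdict

def top_talkers(flows, key, n=3):
    """Sum bytes per value of `key` ('app' or 'conversation') and return the top n."""
    totals = defaultdict(int)
    for f in flows:
        if key == "conversation":
            k = (f["src"], f["dst"])   # client/server pair
        else:
            k = f[key]                 # e.g. application name
        totals[k] += f["bytes"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

flows = [
    {"src": "10.1.1.5", "dst": "10.2.0.9", "app": "CIFS", "bytes": 9_000_000},
    {"src": "10.1.1.7", "dst": "10.2.0.9", "app": "SAP",  "bytes": 1_200_000},
    {"src": "10.1.1.5", "dst": "10.2.0.9", "app": "CIFS", "bytes": 4_000_000},
    {"src": "10.1.1.9", "dst": "10.3.0.2", "app": "VoIP", "bytes":   300_000},
]
print(top_talkers(flows, "app"))  # CIFS dominates the link
```

Re-running the same aggregation with `key="conversation"` points at the specific client/server pair behind the volume, which is the pivot a troubleshooter makes before pulling packets.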
Step 3: Verification
Answers the question, “Have we resolved the issue?” Based on the proper diagnosis, a corrective action plan needs to be determined. This step may take some time: first to determine the corrective action, and then to implement and verify it. Alternatives may include adding bandwidth, correcting misconfigurations, or working with a third-party application developer to improve the architecture of a poorly written, chatty application. The performance management system will confirm whether the actions taken have satisfactorily restored service levels, for example by:
• Returning response times for key business applications to acceptable levels
• Improving VoIP quality of experience, as shown by metrics such as jitter, packet loss and MOS scores
• Restoring application and conversation levels to within previous baselines
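The jitter, loss and MOS metrics above are related: MOS is commonly estimated from measured delay, jitter and packet loss via the ITU-T G.107 E-model. The sketch below uses a widely cited simplification of that model; the delay and loss penalties are approximations for illustration, not the formula any particular product implements.

```python
def estimate_mos(latency_ms, jitter_ms, loss_pct):
    """Rough E-model MOS estimate from one-way delay, jitter and packet loss."""
    # Effective latency: delay plus a jitter-buffer penalty (common simplification).
    d = latency_ms + 2 * jitter_ms + 10
    r = 93.2 - (d / 40 if d < 160 else (d - 120) / 10)
    r -= 2.5 * loss_pct          # impairment from packet loss
    r = max(0.0, min(100.0, r))
    # Standard G.107 mapping from R-factor to MOS.
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

print(round(estimate_mos(20, 5, 0.0), 2))   # healthy LAN call: ~4.4
print(round(estimate_mos(150, 40, 5.0), 2)) # congested WAN call: noticeably worse
```

The useful point for verification is the shape of the curve: small increases in delay barely move MOS, while loss and jitter past the buffer's reach degrade it quickly.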
This phase should confirm that the problem is in fact corrected and that the application services are not suffering any further issues.
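Confirming that service levels are restored reduces to comparing post-fix samples against the pre-problem baseline. A sketch of that check, with the 10% tolerance chosen arbitrarily for illustration:

```python
import statistics

def change_resolved(baseline, post_fix, tolerance=1.10):
    """True if mean response time after the fix is back within 10% of baseline."""
    return statistics.fmean(post_fix) <= statistics.fmean(baseline) * tolerance

baseline = [110, 105, 115, 108]   # response times (ms) before the problem
during   = [400, 380, 450]        # during the incident
after    = [112, 109, 118]        # after the corrective action

print(change_resolved(baseline, during))  # still broken
print(change_resolved(baseline, after))   # service level restored
```

A production check would also compare high percentiles, since a healthy mean can hide a degraded tail.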
Step 4: Ongoing Management
Answers the question, “How is our network growing and changing over time?” The correction of a problem, like the rollout of any new initiative (consolidating data centers, upgrading business applications, migrating to an MPLS WAN, or introducing VoIP services to remote offices), marks the start of ongoing management. Ongoing management continues the detection and diagnostic activities already discussed to maintain performance and user quality of experience across these projects and the associated networked services. Broad elements of ongoing management include:
• Proactive alarms and warnings of potential problems before they have the chance to have a noticeable effect on user experience
• Access to real-time data for troubleshooting detected issues as they emerge
• Analysis of trended data in historical reports for planning and traffic engineering
Many IT organizations will find themselves serving as a sort of “referee” between business use and recreational use of the network, or between bandwidth-hungry services like video teleconferencing and time-sensitive existing business applications. However, with in-depth application visibility from packet flow analysis, they will be able to recognize degradations early, rectify bandwidth contention issues quickly, and initiate traffic engineering changes or schedule disruptive traffic to traverse the network in off hours. Reducing user-affecting problems, or at least improving the mean time to resolve the problems that do occur, is the bottom line of ongoing management.
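Trended historical data supports the "right-sizing" of bandwidth: fit a line to weekly utilization samples and project when a link will cross its planning threshold. A deliberately naive least-squares sketch (real capacity planning would use longer windows and seasonality):

```python
def weeks_to_threshold(weekly_util, threshold=80.0):
    """Fit utilization (%) per week with least squares; weeks until threshold, or None."""
    n = len(weekly_util)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(weekly_util) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, weekly_util)) / \
            sum((x - x_mean) ** 2 for x in xs)
    if slope <= 0:
        return None  # flat or shrinking: no upgrade pressure
    intercept = y_mean - slope * x_mean
    crossing = (threshold - intercept) / slope   # x where the fitted line hits threshold
    return max(0.0, crossing - (n - 1))          # weeks from the latest sample

print(weeks_to_threshold([50, 55, 60, 65]))  # growing 5 points/week: 3 more weeks
```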
Having a real effect on reducing MTTR
Companies that have implemented strong performance management solutions are already realizing significant time savings in reducing the mean time to restore quality application delivery. When some manufacturing organizations report that one hour of downtime represents $12 million in lost production, or a stock exchange processes hundreds of thousands of transactions per hour, it is easy to accept the need to reduce the time it takes to resolve network and application degradations. By following the process outlined in this discussion and leveraging the most in-depth packet flow analysis and evidence from their networks, other organizations have already seen demonstrable improvement in their troubleshooting time. A recent survey shows a dramatic reduction in MTTR. (Source: March 2007 Ashton, Metzler and Associates survey of 138 participants.)
Summary
Good network and application performance depends on a structured approach to performance management throughout a problem’s lifecycle. Using NetScout’s nGenius and Sniffer network and application performance management solutions, which leverage analysis of rich packet flow data, you too can detect, diagnose, verify, and manage the projects and problems that may arise in the future. That is an essential step in reducing MTTR and maintaining high-quality service delivery throughout your enterprise community.
http://netscout.com/
Delivering the Network That Thinks
Enterprises continue to invest heavily in IT systems, networks and application software: hundreds of billions every year. Yet despite these investments, the assurance of application performance continues to elude IT managers. When you look at your organization, how are your infrastructure investments faring? Are they delivering the assurance your business requires?
Even as IT performance challenges persist, emerging trends promise to open up new worlds of opportunity and efficiency.
• Unified Communications enables the seamless integration of presence, voice, communications, and data.
• Service Oriented Architectures (SOA) create a more flexible framework for leveraging and distributing information between systems and people.
• Virtualization promises to free logic and resources from physical constraints.
• Wireless technologies melt away the barriers of location.
Yet as these emerging architectures are embraced by business, they create additional strain on infrastructure and open up new requirements for effective management. Are you ready for these new opportunities? Can you effectively utilize and manage them while assuring the delivery of the services your business demands? It is no trivial task. Recent examples like branch office server consolidation show how good intentions quickly turn into performance nightmares. When a global 100 company consolidated branch office servers into data centers to get better control over data, cut costs and simplify system management, it found that users accessing files over WAN links experienced severe degradation in response, not only for file access but for ERP applications that shared the same link.

The Challenge: How to Align Business Objectives with the Entire Infrastructure
Too often, decisions optimize for a narrow view. In the example above, a decision to optimize servers cost the whole organization. Whether it is an underpowered server farm, a network out of capacity, or a poorly written application, when ERP doesn’t work, it’s the business that suffers.
• Application performance can slow the business. When applications slow down, the business slows down. Value creation is lost and IT is forced to react simply to restore normal operations. From voice to video conferencing, ERP to financial transactions, the applications, and the potential for issues, are diverse.
• Silos within the IT organization diffuse responsibility. Applications require systems, networks
and application logic to work together, but many IT organizations aren’t built that way. Communication barriers, narrow views and self-interest plague our ability to resolve problems and plan effectively.
• Fragmented tool base mirrors the organization and creates islands of optimization. Too often we deploy point tools to solve one issue, unaware of related challenges that need to be managed in other areas. Why deploy network probes to diagnose a specific issue before considering a more complete solution, one that integrates better visibility AND the ability to resolve issues, and that benefits the entire organization?
• An application-centric and network-delivered world. 90% of workers reside outside of HQ and the data center. As you deliver business globally, performance issues almost always involve the distributed network, but it’s the applications that run the business. The challenge is to bridge the gap between the network and applications to assure the effective delivery of services.

Intelligent Service Assurance – Delivering The Network That Thinks
Packeteer delivers the network that thinks about applications, systems, users, and the business objectives you’re trying to achieve. Our integrated system provides unique intelligence to identify issues, the technologies to assure performance, and the ability to communicate and integrate across the organization.
• An application-fluent perspective of the network. Automatically discover every application, measure utilization and provide 100+ statistics to deliver the information needed to intelligently find, fix and proactively manage issues.
• A unified view of end-user performance. Provide a real-time, end-to-end view of key applications across the entire system. For example, track the total delay of specific application transactions, such as an SAP Order Entry operation, and isolate the network and server issues that emerge; or monitor the Mean Opinion Score (MOS) of your IP Telephony across your entire system, in real time.
• Adaptive response & optimization for changing conditions. Beyond the brute basics of acceleration, caching, and compression, we automatically respond at the application level, dynamically allocating resources to the most critical needs, like automatically constricting recreational traffic to assure your next voice call receives the service level it requires, to achieve your business outcomes.
• Centralized, intelligent system to assure performance. Delivering the Network That Thinks means providing a powerful, centralized view into your network via IntelligenceCenter. With role-based portals, you can create custom views for your voice group, NOC group, or ERP applications group. Our extensible framework interconnects with leading framework vendors to feed our real-time application view into a heterogeneous environment.
The Solution: Start with Intelligence
Today, Packeteer unifies all the tools required to find, optimize and assure the performance of all applications in the network as the foundation for Intelligent Service Assurance. Whether your immediate infrastructure challenge is supporting high-quality voice, video, file, ERP, CRM, transaction and other key business applications, or you are planning ahead for virtualization, consolidation, unified communications and SOA strategies, Packeteer’s vision for Intelligent Service Assurance means we will continue to provide the strategic solution to deliver your desired business outcome.
A Best Practices Approach to Performance Issues
Helping you sort through hundreds of applications, focus on the key issues, and determine what tools to employ to fix problems requires a strategic partner for your success. Packeteer’s Intelligent LifeCycle exemplifies our commitment by providing a guide to help you navigate the complexity of application delivery and sustain the level of service performance your business demands. The approach is a simple series of four steps that begins with intelligence and leads to high performance application delivery.
Assess: Identify what applications are running on the network and what approaches to take to resolve issues, and continuously monitor performance.
Provision: Create policies to align network resources with the business – protect key applications, contain problematic traffic.
Accelerate: Overcome latency & protocol issues to enhance performance and capacity.
Extend: Create an intelligent overlay that adapts current infrastructure to new and emerging issues and integrates into key service management frameworks.
The Results: Applying Packeteer Technologies to Achieve Performance Results
Packeteer delivers a full spectrum of integrated solutions to optimize application performance, rather than a narrow set of point performance tools. Whether facing strategic changes in your infrastructure or looking to immediately resolve a pain point, these examples attest to the gains you can achieve with Packeteer solutions:*

IP Telephony & Video
• Results: Monitor MOS scores in real time; reduce jitter by 60% [1]; reduce video conference session setup by 70% [2]
• Primary technologies employed: IPT/VoIP quality monitoring; application classification; per-call Application QoS

ERP/CRM & Line of Business Applications
• Results: Track end-user response times; accelerate performance by 75-95% [1]; reduce WAN bandwidth by 70-98%
• Primary technologies employed: Response time monitoring; Application QoS; compression; TCP acceleration for higher-latency links

File Access (CIFS)
• Results: Accelerate file access by 98%; reduce bandwidth by up to 97%; reduce storage by 2 TB [3]
• Primary technologies employed: Wide Area File Services (WAFS); CIFS acceleration; document caching

SharePoint & SMS Traffic
• Results: Accelerate SharePoint access times by 90%; reduce bandwidth by 50-95%+
• Primary technologies employed: Protocol acceleration; bulk caching; local SMS delivery

Server Consolidation Services
• Results: Reduce WAN bandwidth by 80-100%
• Primary technologies employed: Local delivery of services: print, SMS, DNS/DHCP

Recreational Traffic
• Results: Contain to 2-10% during peak usage; allow usage as excess bandwidth is available
• Primary technologies employed: Traffic AutoDiscovery & application classification; Application QoS

Malicious Traffic
• Results: Identify infected hosts; contain propagation traffic; maintain WAN availability
• Primary technologies employed: Application classification; connection diagnostics; Application QoS

[1] Inergy Automotive Case Study  [2] Logitech Case Study  [3] Nortel Case Study
* Performance results may differ based on a mix of variables such as applications, users, usage patterns and WAN link speeds.
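Containing recreational traffic to a small share of the link while still letting it use excess bandwidth is classically done with a token bucket: tokens accrue at the contained rate and bursts are bounded by the bucket depth. This is a generic sketch of the mechanism, not Packeteer's implementation.

```python
class TokenBucket:
    """Admit traffic at `rate` units/second with bursts up to `burst` units."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = 0.0

    def allow(self, size, now):
        # Refill tokens for the elapsed time, capped at the burst depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return True
        return False  # over the cap: drop or defer this packet

# Cap recreational traffic at 10 packets/second with a 10-packet burst.
bucket = TokenBucket(rate=10, burst=10)
admitted = sum(bucket.allow(1, 0.0) for _ in range(15))
print(admitted)  # only the burst allowance gets through at once
```

Scheduling such traffic behind business applications (rather than dropping it) is the same bucket feeding a lower-priority queue instead of returning False.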
For more information, contact us at:
Email: [email protected]
Web: www.packeteer.com
Phone: 800-440-5035 (North America), 408-873-4400 (International)
Transforming the Internet into a Business-Ready Application Delivery Platform
Ensuring application performance supports your business goals. As organizations expand globally, they need to make a variety of business-critical applications, including extranet portals, sales order processing, supply chain management, product lifecycle management, customer relationship management, VPNs, and voice over IP, available to employees, business partners and customers throughout the world. These organizations must also be sensitive to the economic pressures driving IT consolidation and centralization initiatives. Though global delivery of enterprise applications provides remote users with essential business capabilities, poor application performance can quickly turn the user experience into a costly, productivity-sapping exercise. Business applications must perform quickly, securely, and reliably every time. If they don’t, application use and adoption will suffer, threatening not just the benefits linked to the applications but the overall success of the business itself.
Key Challenges in Delivering Applications
When delivering applications via the Internet to their global user communities, businesses face significant challenges: poor performance due to high latency, spotty application availability caused by high packet loss, and inadequate application scalability to deal with growing user bases and spiky peak usage. Each of these problems severely undermines the application’s effectiveness and the company’s return on investment. The performance issues associated with the Internet are not new. They are, however, having more of an
impact because of business trends such as globalization. Because of its lower cost, quick time to deploy, and expansive reach, IT organizations are increasingly turning to the Internet to support their globalization efforts. At the same time, chatty protocols and formats like HTTP and XML are introducing additional performance issues.
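The cost of a chatty protocol is dominated by the number of round trips multiplied by the RTT, which is why an application that feels fine on a LAN collapses over the Internet. The numbers below are illustrative only:

```python
def protocol_delay_s(round_trips, rtt_ms):
    """Time spent purely waiting on the network for a chatty exchange."""
    return round_trips * rtt_ms / 1000.0

# The same 100-round-trip transaction on a LAN vs. a long-haul Internet path:
print(protocol_delay_s(100, 1))   # 0.1 s at 1 ms RTT
print(protocol_delay_s(100, 80))  # 8.0 s at 80 ms RTT
```

No amount of extra bandwidth fixes this term; only fewer round trips (protocol optimization) or a shorter path (route optimization) do.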
Akamai’s Application Performance Solutions
Today, more than 2,500 businesses trust Akamai to distribute and accelerate their content, applications, and business processes. Akamai’s Application Performance Solutions (APS) are a portfolio of fully managed services designed to accelerate performance and improve the reliability of any application delivered over the Internet, with no significant IT infrastructure investment. Akamai leverages a highly distributed global footprint of 28,000 servers, ensuring that users are always in close proximity to the Akamai network. Application performance improvements are gained through several Akamai technologies, including SureRoute route optimization, which ensures that traffic is always sent over the fastest path; the Akamai Protocol, a high-performance transport protocol that reduces the number of round trips over the optimized path; caching and compression techniques; and packet loss reduction to maintain high reliability and availability.
APS comprises two solutions: Web Application Accelerator and IP Application Accelerator. Web Application Accelerator securely accelerates dynamic, highly interactive Web applications, resulting in greater adoption through improved performance, higher availability, and an enhanced user experience. It ensures consistent application performance, regardless of where users are located, and delivers capacity on demand, where and when it’s needed.
Akamai’s distributed EdgePlatform for application delivery
IP Application Accelerator, like Web Application Accelerator, is built on an optimized architecture for delivering all classes of applications to the extended enterprise, ensuring increased application performance and availability for remote wireline and wireless users. Applications delivered by any protocol running over IP, such as SSL, IPSec, UDP and Citrix ICA, will benefit from IP Application Accelerator. Examples of applications delivered by APS include Web-based enterprise applications, Software as a Service (SaaS), Web services, client/server or virtualized versions of enterprise business processes, Voice over Internet Protocol, live chat, productivity, and administration functions such as secure file transfers. Akamai APS also addresses performance problems associated with the delivery of applications to wireless handheld devices, such as PDAs and smart phones.
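Route optimization of the SureRoute kind can be pictured as choosing, per request, the fastest of several candidate paths: the direct Internet route or a relay through an intermediate edge server. The sketch below simply picks the minimum of measured latencies; it is a toy model of the idea, not Akamai's algorithm, and the relay names are invented.

```python
def fastest_path(direct_ms, relay_ms):
    """Pick the direct path or the best two-hop relay, whichever is faster.

    relay_ms maps relay name -> (client->relay, relay->origin) latencies in ms.
    """
    best = ("direct", direct_ms)
    for relay, (leg1, leg2) in relay_ms.items():
        total = leg1 + leg2
        if total < best[1]:
            best = (f"via {relay}", total)
    return best

# A congested direct route vs. two candidate overlay relays:
print(fastest_path(180, {"edge-fra": (40, 90), "edge-lon": (35, 160)}))
```

The counterintuitive part the overlay exploits is that two short, well-peered hops are often faster and less lossy than one "direct" BGP path.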
Customer Benefits
Akamai’s Application Performance Solutions offer a number of performance and business benefits:
• Superior Application Performance – Akamai provides unsurpassed application performance by accelerating cacheable and dynamic content.
• Superior Application Availability – Akamai provides unique protections to ensure that Internet unreliability never gets in the way of end-user access to your application. Users are dynamically mapped to edge servers, in real time, based on considerations including Internet conditions, server proximity, content availability and server load, to ensure that users are served successfully and with minimal latency.
• Rigorous Application Security – Akamai adheres to stringent security requirements for network, software, and procedures.
• Complete Visibility and Flexible Control – Akamai’s EdgeControl Management Center provides clear and effective tools that allow IT to manage and optimize their extended application infrastructure. In addition to sophisticated historical reporting and real-time monitoring functionality, Akamai provides alert capabilities that notify IT when origin site problems are detected or user performance may have degraded. A sophisticated, secure Network Operations Command Center also continually monitors Akamai’s globally distributed network.
About Akamai
Akamai is the leading global service provider for accelerating content and business processes online. Thousands of organizations have formed trusted relationships with Akamai, improving their revenue and reducing costs by maximizing the performance of their online businesses. Leveraging the Akamai EdgePlatform, these organizations gain business advantage today and have the foundation for the emerging Internet solutions of tomorrow.
http://www.akamai.com
Akamai’s EdgeControl Management Center provides real-time reporting
Next Generation Web Application Delivery
Accelerating, securing and ensuring the availability of critical business applications
The ubiquity of the Web simplifies many aspects of delivering application services. However, inherent performance and security inefficiencies of networking protocols negatively impact the user experience. Application users need quick response times, improved availability, and application-layer security for today’s mission-critical Web applications. Web application delivery appliances provide advanced application-level acceleration, availability and security functionality to address these issues while lowering the cost of delivering these applications. To deliver true business value, an application delivery appliance must:
• make the applications perform faster
• enhance the applications’ security
• improve the applications’ availability
An application delivery appliance needs to understand the behavior of the applications themselves. Bridging the gap between the business’s applications and the underlying network/infrastructure, in essence directing network behavior based upon an application’s behavior, is at the core of an application delivery appliance’s responsibility. This is, of course, impossible if the application’s behavior is opaque to the application delivery appliance. The proliferation of new, advanced Web application development techniques and associated new protocols and formats (e.g. Web 2.0, Ajax, XML, RSS, wikis, etc.) makes understanding applications even more important. Next generation application delivery appliances need to be application-aware.
Comprehensive Application Delivery Functionality
Citrix Systems, the global leader in application delivery infrastructure, provides application delivery appliances that combine application intelligence with a high degree of networking savvy. Citrix NetScaler’s success is based on its ability to integrate multiple acceleration, availability and security functions – at both the networking and the
application layers – into a single, integrated appliance. Only Citrix NetScaler provides all of the following into a single, integrated appliance:
Accelerated Application Performance
Citrix NetScaler can increase application performance by 5X or more. Citrix® AppCompress™ improves end-user performance and reduces bandwidth consumption by compressing Web application data, regardless of whether it is encrypted or unencrypted. Citrix® AppCache® speeds content delivery to users by providing fast, in-memory caching of both static and dynamically generated HTTP application content. In addition, Citrix NetScaler delivers multiple TCP optimizations to improve the performance of the network and server infrastructure.
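The bandwidth savings available from compressing Web application data are easy to demonstrate with standard gzip, since HTML markup is highly repetitive. This uses Python's standard library to illustrate the principle; it is not Citrix's AppCompress implementation.

```python
import gzip

# A typical table-heavy application page: the same markup repeated many times.
page = b"<tr><td class='cell'>row data</td><td class='cell'>more</td></tr>\n" * 500
compressed = gzip.compress(page)

print(len(page), "->", len(compressed), "bytes")
print(f"{len(compressed) / len(page):.1%} of original size")
```

The win is largest exactly where it matters: text-heavy, templated responses traveling over low-bandwidth last-mile links.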
Intelligent Load Balancing and Content Switching
NetScaler delivers fine-grained direction of client requests to ensure optimal distribution of traffic. In addition to layer 4 information (protocol and port number), traffic management policies for TCP applications can be based upon any application-layer content. Administrators can granularly segment application traffic based upon information contained within an HTTP request body or TCP payload, as well as L4-7 header information such as URL, application data type or cookie. Numerous load balancing algorithms and extensive server health checks provide greater application availability by ensuring client requests are directed only to correctly behaving servers.
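Layer-7 content switching combined with health checks boils down to two decisions: select a server pool from the request's content, then balance only across pool members that pass their checks. A simplified sketch (the pool names, URL rule and least-connections policy are illustrative choices, not NetScaler configuration):

```python
def choose_server(url, pools):
    """Route by URL prefix, then least-connections among healthy servers."""
    pool = pools["static"] if url.startswith("/images/") else pools["app"]
    healthy = [s for s in pool if s["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy servers in pool")
    return min(healthy, key=lambda s: s["connections"])["name"]

pools = {
    "static": [{"name": "cache1", "healthy": True, "connections": 12}],
    "app": [
        {"name": "app1", "healthy": True,  "connections": 40},
        {"name": "app2", "healthy": False, "connections": 0},  # failed health check
        {"name": "app3", "healthy": True,  "connections": 25},
    ],
}
print(choose_server("/checkout", pools))        # least-loaded healthy app server
print(choose_server("/images/logo.png", pools)) # static content pool
```

A real appliance evaluates this per request at line rate and updates the health and connection state continuously.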
Comprehensive Application Security
Citrix NetScaler appliances integrate comprehensive Web application firewall inspections that protect Web applications from application-layer attacks such as SQL injection, cross-site scripting, forceful browsing and cookie poisoning. By inspecting both requests and responses at the application layer, Citrix NetScaler blocks attacks that
are not even detected by traditional network security products. Application-layer security prevents theft and leakage of valuable corporate and customer data, and aids in complying with regulatory mandates such as the Payment Card Industry Data Security Standard (PCI-DSS). In addition, Citrix NetScaler appliances include high-performance, built-in defenses against denial-of-service (DoS) attacks. Content inspection capabilities enable Citrix NetScaler to identify and block application-based attacks such as GET floods and site-scraping attacks. However, not all increases in traffic are DoS attacks. Legitimate surges in application traffic that would otherwise overwhelm application servers are automatically handled with configurable Surge Protection and Priority Queuing features.
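Application-layer request inspection of this sort can be illustrated with a deliberately naive signature check. Real Web application firewalls use far richer (often positive/allow-list) models; the regexes below are simplistic examples only.

```python
import re

# Toy attack signatures: (pattern, label). Real WAF rule sets are far larger
# and combine signatures with positive security models.
SIGNATURES = [
    (re.compile(r"(?i)\bunion\b.+\bselect\b"), "SQL injection"),
    (re.compile(r"(?i)<script\b"),             "cross-site scripting"),
    (re.compile(r"\.\./"),                     "forceful browsing / path traversal"),
]

def inspect_request(path, query, body=""):
    """Return the label of the first matching attack signature, or None."""
    for text in (path, query, body):
        for pattern, label in SIGNATURES:
            if pattern.search(text):
                return label
    return None

print(inspect_request("/item", "id=1 UNION SELECT password FROM users"))
print(inspect_request("/search", "q=<script>alert(1)</script>"))
print(inspect_request("/home", "page=2"))   # prints None: clean request passes
```

Inspecting responses as well as requests (as the text describes) would apply the same idea to data leaving the server, e.g. to spot leaked card numbers.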
End-user Experience Visibility
Citrix NetScaler integrates Citrix EdgeSight™ for NetScaler end-user experience monitoring, providing page-level visibility of Web application performance. EdgeSight for NetScaler transparently instruments HTML pages, monitoring Web page response time from the application users' perspective. Response time measurements are combined with detailed statistics on the trip durations of requests and responses across the Web site infrastructure, providing granular visibility into how Web applications are behaving from the end user's perspective.
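Transparent page instrumentation can be sketched as injecting a small timing script into HTML in flight, so the browser itself reports how long the page took. The snippet and beacon URL below are invented for illustration and are not Citrix's actual instrumentation.

```python
# Hypothetical timing snippet: records a start timestamp, then reports the
# elapsed time to a beacon URL once the page has fully loaded.
TIMING_SNIPPET = (
    "<script>var t0=Date.now();"
    "window.addEventListener('load',function(){"
    "new Image().src='/beacon?ms='+(Date.now()-t0);});</script>"
)

def instrument(html: str) -> str:
    """Inject the timing snippet just after <head> (or prepend if absent)."""
    lower = html.lower()
    idx = lower.find("<head>")
    if idx == -1:
        return TIMING_SNIPPET + html
    insert_at = idx + len("<head>")
    return html[:insert_at] + TIMING_SNIPPET + html[insert_at:]

page = "<html><head><title>t</title></head><body>hi</body></html>"
out = instrument(page)
print(("<head>" + TIMING_SNIPPET) in out)  # prints True: snippet follows <head>
```

Because the injection happens in the delivery path, no change to the application or the browser is needed.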
SSL Acceleration
NetScaler integrates hardware-based SSL acceleration to offload the compute-intensive processes of SSL connection set-up and bulk encryption from Web servers. SSL acceleration reduces CPU utilization on servers, freeing server resources for other tasks. Citrix NetScaler is also available in a FIPS-compliant model that provides secure key generation and storage.
Reduced Deployment and Operating Costs
Citrix NetScaler cuts application delivery costs by reducing the number of required servers and by optimizing usage of available network bandwidth. The intuitive AppExpert Visual Policy Builder enables application delivery policies to be created without the need for coding complex programs or scripts. In addition, NetScaler reduces ongoing operational expenses by consolidating multiple capabilities such as content compression, content caching, application security and SSL offload into a single integrated solution. For managing multiple NetScaler appliances, the separately available Citrix Command Center provides centralized administration, enabling more efficient system configuration, event management, performance management and SSL certificate administration.
Application Tested
Citrix NetScaler has demonstrated improvements that not only address traditional server availability concerns, but also accelerate application performance. Testing done in conjunction with Microsoft, SAP, Oracle, ESRI and others has shown tangible benefits. For example, Microsoft® SharePoint™ experienced up to an 82% reduction in latency for various workflows. SAP's CRM software suite showed a response time improvement of roughly 80%, with CPU utilization dropping nearly 60%. Hyperion demonstrated the performance, interoperability and ease of deployment of its enterprise applications with Citrix NetScaler appliances; application acceleration of up to 30X has been achieved. ESRI lab test results further demonstrate the performance advantages of Citrix NetScaler, with throughput improved by 67% and application response times reduced by 40%.
Summary
Citrix NetScaler appliances enable the network to bring direct business value to the business's application portfolio. Citrix NetScaler appliances are purpose-built to speed Web application performance by 5X or more. NetScaler tightly integrates proven protection for Web applications against today's most dangerous security threats, protecting against the theft and leakage of valuable corporate and customer information and aiding in compliance with security regulations such as PCI-DSS. NetScaler enables IT organizations to improve resource efficiencies and simplify management while consolidating data center infrastructure. The Citrix NetScaler family of Web application delivery systems is a comprehensive approach to optimizing the delivery of business resources in a fully integrated solution.
www.citrix.com
WAN Optimization: from a tactical to a strategic approach
Executive Summary
The WAN is critical to the business of modern enterprises. Despite technological progress such as MPLS and xDSL, WAN bandwidth is a constrained resource and network delay is bound by physical constraints. The need for WAN Optimization Controllers (WOCs) has emerged over the past few years as a way to address application performance hurdles in selected sections of the network. Enterprises today tactically deploy WAN Optimization for networked business applications at sites that show poor end-user experience. Such an opportunistic approach has great advantages, as in a significant number of cases it delivers immediate benefits to end-users at the sites where the technology is deployed. But not all networks are compatible with such a tactical approach to application performance. Most enterprises, especially large ones, require WAN Optimization benefits to be delivered globally, to consistently serve the quality-of-experience requirements of their whole distributed workforce. Yet very few large enterprises have generalized the deployment of WOCs in their networks. There are four key reasons for that:
• Scalability - Many vendors today are able to enhance application performance on 10 or 20 sites. Only a select few are able to scale WOC benefits to hundreds or thousands of sites.
• Management costs - WOCs are high-tech devices that need to be individually configured. Each device must be configured consistently with the others, yet all must reflect local requirements.
• Efficiency - Putting WOCs in selected sites of a large network often does not improve performance. Modern networks have meshed topologies that cannot be properly handled by traditional WOCs.
• Investment costs - Even as the technology becomes more affordable, WOCs still cost many times more than a branch router.
All of Ipanema's customers deploy WAN Optimization globally!
The Ipanema Business Network Optimization solution has been specifically designed for strategic deployments of WAN Optimization. The Ipanema Business Network Optimization solution is scalable: Ipanema customers typically deploy the solution to cover the needs of tens, hundreds or thousands of sites.
The Ipanema Business Network Optimization solution shows remarkably low management costs. Ipanema's objective-based approach automates the configuration of devices, leading to management costs that are nearly independent of the number of deployed devices.
[Figure: deployment topologies - one-to-any (single data center), some-to-any (multiple data centers) and any-to-any (multiple data centers with inter-site traffic between branch offices). Physical devices at equipped sites cooperate in real time; tele-managed sites remain unequipped.]
Ipanema Technologies provides a Business Network Optimization solution that automatically manages and maximizes WAN application performance through the combined use of Visibility, Optimization, Acceleration and Network Application Governance features.

The Ipanema Business Network Optimization solution bridges the gap between the enterprise's business priorities and the WAN infrastructure. It is the only automated and scalable solution that adapts to any network condition to deliver the three key components of WAN application performance management: the ability to control network and application behavior, to guarantee the performance of critical applications under all circumstances, and to accelerate business applications everywhere.

The Ipanema Business Network Optimization solution is efficient on the most complex networks. The Ipanema solution's devices communicate with one another to synchronize the actions taken on WAN traffic. This leads to total control of network performance. All traffic flows are managed to handle competition both within a site, as in a hub-and-spoke topology, and between sites, as in modern any-to-any MPLS networks.

The Ipanema Business Network Optimization solution enables innovative and cost-effective deployment options. The Ipanema solution supports progressive deployments of devices and features. For example, companies can obtain visibility and guarantee flow performance over their whole network by placing devices only in data centers. A first level of acceleration can even be obtained using Ipanema's patented asymmetrical TCP acceleration, which does not require any device at the branch.

The Ipanema solution integrates all of these features to address application performance over the entire network.

1. Visibility
The Ipanema solution's Visibility features enable full control over application behavior on the network.

Visibility features allow the end-user to:
• Automatically discover applications over the entire network using Layer 3 to 7 classification,
• Accurately measure the performance of all application flows in real time,
• Report on usage and performance throughout the organization,
• Combine proactive and reactive helpdesk functions via bird's-eye views of performance (Maps), alarming and drill-down.

2. Optimization
The Optimization features guarantee the performance of critical applications under all circumstances.

Optimization features allow the end-user to:
• Define Application Performance Objectives per user and enforce them globally over the WAN,
• Guarantee the performance of critical applications under the toughest conditions,
• Globally manage meshed flows with Cooperative Tele-Optimization,
• Dynamically protect interactive applications and enable voice/video/data convergence with Smart Packet Forwarding,
• Automatically select the best access link with Objective-Based Routing.

[Figure: exhaustive, per-flow measurement of real traffic - application flow quality metrics across sites and applications (SAP, Citrix (CRM), VoIP, Oracle, Citrix (MS Office), FTP, e-mail and others) ranked by business criticality. Critical-application user satisfaction rose from 53% before Optimization to 100% after; overall user satisfaction rose from 57% to 78%.]

3. Acceleration
Acceleration features reduce the response time of applications over the WAN.

Acceleration features allow the end-user to:
• Accelerate while protecting critical application performance,
• Implement both a strategic and a tactical approach to Acceleration,
• Unleash TCP acceleration without branch devices using Tele-Acceleration,
• Locally cache and compress data using Multi-Level Redundancy Elimination,
• Transparently accelerate legacy applications using Intelligent Protocol Transformation.

4. Network Application Governance
Network Application Governance functions are unique to Ipanema. Because of its design, Ipanema's solution allows network managers to concentrate, for the first time, on the high-level activities that make it easy to deliver on the network's perennial promise to be a strategic business asset.

Network Application Governance features allow the end-user to:
• Enable the shift to Application Service Level Agreements,
• Rightsize the bandwidth according to the desired service levels,
• Simplify change management, accelerate operations and minimize TCO,
• Allocate responsibilities between the WAN and IT domains,
• Encourage good practices through cost allocation based on usage and delivered performance.

[Figure: application performance SLAs versus link performance SLAs - application quality indicators from exhaustive per-flow measurement of real traffic (e.g. AQS > 9 during 99% of the time, MOS > 4 during 99% of the time) contrasted with link quality metrics from active per-link testing such as PING and SAA (e.g. link delay < 50 ms during 99% of the time, link loss < 1% during 99% of the time).]
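An objective such as "AQS > 9 during 99% of the time" can be checked mechanically from periodic quality samples, as this short sketch shows. The AQS semantics and sample data are assumed here for illustration; they are not Ipanema's exact definitions.

```python
def sla_compliance(samples, threshold):
    """Fraction of samples strictly above the quality threshold."""
    ok = sum(1 for s in samples if s > threshold)
    return ok / len(samples)

def meets_objective(samples, threshold, required_fraction):
    """True if the quality objective held for the required share of samples."""
    return sla_compliance(samples, threshold) >= required_fraction

# 1000 hypothetical one-minute AQS samples; 5 fall below the bar.
aqs = [9.5] * 995 + [8.0] * 5
print(sla_compliance(aqs, 9.0))          # 0.995
print(meets_objective(aqs, 9.0, 0.99))   # True: AQS > 9 for at least 99% of samples
```

The same computation applies unchanged to MOS, link delay or loss objectives; only the threshold and the direction of the comparison change.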
About Ipanema Technologies
Ipanema Technologies is a provider of advanced application traffic management solutions that align the network with business goals. Network integrators market Ipanema's Business Network Optimization solutions to enterprises, while telecom service providers and network managed service providers offer them as a service. The Business Network Optimization solution is simple, automated and scalable, and allows enterprises to easily control, guarantee and accelerate the performance of their critical applications regardless of network conditions. It relies on Ipanema's Autonomic Networking System to provide full Visibility of application flows over the network, global and dynamic Optimization of network resources, transparent application Acceleration, Network Application Governance functions and Scalable Service Delivery capabilities. Ipanema's Business Network Optimization solutions are deployed in more than 75 countries.
For more information, visit www.ipanematech.com.
Software at the Endpoint - When Every Second Counts
Guaranteeing application performance through visibility and control at the endpoint
Congestion on wide area networks (WANs) that serve users in remote and branch offices is a growing problem, as ever more bandwidth-hungry applications, for example voice and video, compete for the available bandwidth. Until now, the problem of identifying and managing the causes of congestion has been tackled using specialized network appliances, also known as "WAN traffic optimization" (WTO) appliances. The issue with these is that they are not economical or easily manageable enough to suit the majority of small remote office and branch office locations. NetPriva has developed a totally new software-based approach to the problem of WAN congestion that involves no appliance equipment and is economical for even the smallest of offices. The NetPriva solution suits enterprises for their enterprise WAN, network service providers for their customers, and Internet portals and managed service providers (MSPs) that serve business and other users who depend on network responsiveness.
Competition for bandwidth
Applications that the business depends on, such as SAP, Citrix, PeopleSoft and others, are time-sensitive: they must be delivered to users with consistent response times for users to rely upon them. Other time-sensitive examples include voice and video traffic, where even a slight disruption results in an unacceptable user experience and defeats the purpose of using the wide area network to streamline operations or reduce costs. Non-time-sensitive business traffic includes e-mail, file transfers, downloads and synchronization. Without rules, both time-sensitive and non-time-sensitive traffic compete for the available bandwidth on a "first come, first served" basis, and the experience is that users of time-sensitive applications lose productivity, complain of poor response, and add to help desk queues.
Current solution approaches
Typical solution approaches to network performance degradation have focused on network appliances, also known as WAN traffic optimization (WTO) appliances, including:
• Appliances that monitor and report network traffic statistics. This is a reactive approach.
• Appliances that provide real-time network traffic monitoring and Quality of Service (QoS) management, installed at points in the network where major traffic congestion occurs. This is a proactive approach, but not economical for smaller remote and branch offices.
• Appliances that accelerate particular types of traffic. While these can be said to be proactive, they only work for some types of traffic. They may not assist interactive business transaction traffic (there is little if any data to compress and little if any duplication in such data). Furthermore, at times of congestion, if there is no effective QoS control in place, the targeted types of data may not be accelerated in actual practice. The expense of these appliances also means they tend to be limited to use at data centre locations or on high-speed links. Users in smaller remote offices miss out.
• Multi-function appliances. Vendors are developing appliances, also known as the "branch office box" (BOB), that provide a range of functions, some of which may be surplus; and it is still an appliance, with the attendant cost issues.
• Software "clients" for user PCs that reduce the data sent over slow network links. These are aimed at individual users "on the road" and are limited "point solutions" for some wide area network users. These solutions are, again, not effective at times of network congestion.
New Software Solution Platform
NetPriva has developed a new-generation software platform for cost-effective wide area network performance management for branch offices of all sizes in an enterprise.
• It provides monitoring/visibility and proactive, policy-based network traffic control (QoS).
• It automatically classifies all types of traffic, including custom applications and even encrypted data.
• It has been designed to be extensible, to integrate with or complement traffic acceleration methods.
• An appliance is not required, as the software utilizes an existing MS Routing, ICS or VPN server.
• For additional ease of use and visibility, the software can also use the capacity of user PCs in remote or branch office locations.
Full visibility, management and QoS control can be had by simply installing NetPriva software on an existing "gateway" device, such as a Microsoft ICS server or a VPN server, and managing it through a web browser or PC-based console. It can be remotely installed and managed as part of a standard operating environment (SOE) on user PCs anywhere, 24x7. Multiple locations, users and traffic policies are managed via the web portal. Policy design and deployment is quick and easy, making for a cost-effective solution for even the smallest remote or branch offices. The NetPriva software platform provides a comprehensive range of filtering, monitoring and shaping capabilities that provide visibility and control to match or exceed that of many of the WAN traffic optimization appliances.
NetPriva's software solution achieves its highest level of visibility and control through unique peer-to-peer network management. It also eliminates the "man-in-the-middle" issues that appliances typically suffer from because they cannot positively or economically identify compressed or encrypted traffic.
NetPriva software functionality

Network layer 3 and 4 visibility and shaping
When deployed at the edge of the network, where the LAN links to the WAN, the NetPriva software filters and provides monitoring and shaping control by IP address, port number, protocol, URL, and Citrix ICA tag.
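The shaping side of such control can be illustrated with a classic token-bucket rate limiter: each traffic class gets a refill rate and a burst allowance, and a packet is sent only if enough tokens remain. This is a generic sketch, not NetPriva's implementation, and the rate and burst figures are arbitrary.

```python
class TokenBucket:
    """Decide whether a packet of a given size may be sent now.
    Real gateway software shapes live packet queues; this only models
    the admission decision, using an explicit clock for determinism."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # refill rate in bytes per second
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, size_bytes, now):
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False

# An 80 kbit/s class (10 kB/s) with a 1500-byte burst allowance.
bucket = TokenBucket(rate_bps=80_000, burst_bytes=1500)
print(bucket.allow(1500, now=0.0))   # True: burst credit covers the first packet
print(bucket.allow(1500, now=0.0))   # False: bucket drained, must wait
print(bucket.allow(1500, now=0.15))  # True: 0.15 s refills 1500 bytes
```

Per-class buckets with different rates are how priorities become enforceable bandwidth guarantees rather than hints.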
Network layer 7++ visibility and shaping
When deployed at network end points (servers or user PCs), the NetPriva software provides filtering to monitor and shape traffic at the application layer, i.e. Layer 7, in addition to Layers 3 and 4 as described above. NetPriva terms this capability "Layer 7++" on the basis of the additional facility to monitor and shape by application executable name and user login, as well as the capability to classify any traffic, including custom applications and encrypted traffic. The basis of NetPriva's Layer 7++ functionality lies in the fact that the NetPriva software does its work at the very end point of the network, where network traffic originates and terminates: inside the user's PC. There, the NetPriva software can absolutely determine the identity of any application and user, without the limitations and resource-usage issues of matching data patterns, as is the case with deep packet inspection techniques. In addition, NetPriva's Layer 7++ functionality includes the identification of encrypted traffic, a growing share of network traffic. Its ability to identify any traffic also covers the growing amount and variety of peer-to-peer traffic, much of which may be unsanctioned from a business point of view. Peer-to-peer traffic, like any traffic, can be identified even if it is masquerading as some form of legitimate traffic.
DSCP Packet marking / colouring
The NetPriva agent-based software is able to mark or colour traffic by DiffServ code point according to the policy for each application. This can be used to "groom" particular traffic from the user's desktop for routing purposes, such as for MPLS Class of Service. This has the potential to extend the MPLS Class of Service model right to the end points of the network.
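Marking traffic with a DSCP value from the endpoint can be sketched with standard socket options: the DSCP occupies the upper six bits of the IP TOS byte, which is settable via `IP_TOS`. The application-to-class policy table below is a made-up example, not NetPriva's policy format.

```python
import socket

DSCP_EF = 46    # Expedited Forwarding, typically used for voice
DSCP_AF31 = 26  # an Assured Forwarding class for interactive business traffic

# Hypothetical policy: map application executable names to DSCP values.
POLICY = {"softphone.exe": DSCP_EF, "sap_gui.exe": DSCP_AF31}

def mark_socket(sock, app_name, default_dscp=0):
    """Apply the DSCP from the policy table to an open socket.
    The DSCP is shifted left 2 bits because the low bits of TOS carry ECN."""
    dscp = POLICY.get(app_name, default_dscp)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
    return dscp

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
mark_socket(s, "softphone.exe")
tos = s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print(tos >> 2)   # 46: packets from this socket carry the EF code point
s.close()
```

Downstream routers (or an MPLS provider's edge) can then queue by code point, which is what extends Class of Service out to the desktop.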
Analysis and reporting
NetPriva holds granular network statistics, down to per-second application flow details, in an SQL database format. Data may be retained online according to resource limitations or retention policies. Data schema details are provided for SQL queries, database extracts and reporting.
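Because the statistics live in an SQL database, reporting reduces to ordinary queries. The table layout and data below are hypothetical, for illustration only; NetPriva documents its actual schema separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flow_stats (
        ts        INTEGER,   -- epoch second of the sample
        app       TEXT,      -- application executable name
        user      TEXT,      -- login of the user owning the flow
        bytes_out INTEGER    -- bytes sent during this second
    )""")
rows = [
    (1, "outlook.exe", "alice", 1200),
    (1, "sap_gui.exe", "bob",   300),
    (2, "outlook.exe", "alice", 900),
    (2, "p2p.exe",     "carol", 50000),
]
conn.executemany("INSERT INTO flow_stats VALUES (?,?,?,?)", rows)

# Top talkers by application over the captured interval.
top = conn.execute("""
    SELECT app, SUM(bytes_out) AS total
    FROM flow_stats GROUP BY app ORDER BY total DESC
""").fetchall()
print(top[0])   # ('p2p.exe', 50000): one unsanctioned app dominates the link
```

The same per-second granularity supports capacity planning (aggregate by hour) and troubleshooting (zoom to the second an incident occurred).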
Products End Point Direct (“EPDirect”): Client side software for true endpoint control.
Edge Virtual Gateway (“EdgeVG”): Branch office server software for full visibility, management and control (QoS) through a branch office “gateway” server.
http://www.netpriva.com
[Product comparison table (flattened in extraction; approximate reconstruction). EdgeVG is application software (Windows) for smaller-office network edge points, running on a shared "gateway" server such as a Windows ICS/RAS or VPN host acting as a Layer 3 routing host, connected to the LAN/WAN over Ethernet. EPDirect is endpoint software deployed on end-point user Windows PCs on the LAN. Both provide traffic monitoring/shaping, real-time visibility and troubleshooting, and statistics retention, analysis and capacity planning; WAN optimization is available as TCP flow optimization (V4.4b) and as a future capability. Components (QoS engine, collector, management console) run on a local "always-on" PC or server, on each end-point user PC, or on a local or remote PC/server. Policy settings cover priority, minimum/maximum bandwidth, Layer 3/4 and Layer 7++ automation (EdgeVG gains Layer 7++ with EPDirect clients), Citrix ICA, DiffServ packet marking, drop-packet conditions, shape-or-monitor-only modes, statistics retention, and application/user identification. For classification, EdgeVG uses Layers 3 and 4 (IP address, port), URL or URL substring via DPI, and the Citrix ICA priority tag; EPDirect adds user identity and automatic Layer 7++ traffic identification for all applications.]
be competitive
we help you perform at peak performance

To enhance the efficiency levels required for multinational enterprises to successfully face competitive challenges, Orange Business Services offers Business Acceleration - a full suite of services that brings focus, control and greater speed to those applications that are central to consistent delivery of quality services, wherever your business operates around the globe. You gain world-class management of your end-to-end communications and application environment, along with a service level agreement that ensures the performance of your business.

we can get your business moving
Business Acceleration provides insights and tools to enhance your applications' visibility, management and performance. Our approach aligns your business and IT so everything is working optimally. We help you:
• analyze: gain visibility into your communications infrastructure and business-critical applications
• manage: ensure the efficient and consistent operation of your infrastructure and application environment
• optimize: get the most out of your infrastructure and improve the performance of your business

analyze
Business Acceleration begins by helping you gain end-to-end visibility into your communications infrastructure and business-critical applications. First, we show you what's running on your network. We conduct assessments of your underlying infrastructure, applications and processes. Then our consultants help you understand the impact of your usage patterns on the performance levels of your applications.

[Figure: our Business Acceleration methodology]

We understand information and communications technology. We know what drives change in IT organizations: mergers and divestments, new product releases, a variety of end-user profiles, cost-reduction initiatives, changing application requirements, new technologies and evolving business strategies. Working with you, we define a business case, quick wins and a transition plan to meet your service assurance expectations, network optimization, application performance and strategic business objectives. Our recommendations and guidance promote a better end-user experience as well as a measurable return on investment.
analyze • benchmark and align IT transactions to business priorities • model your end-to-end environment in order to study various change scenarios • validate application and infrastructure performance individually and holistically
manage
With Business Acceleration your infrastructure and application environment runs efficiently and harmoniously. You gain control when you have the ability to proactively manage performance and network resources.

We allocate bandwidth according to your business priorities and implement policies governing your applications globally. Ongoing monitoring and alerts from our three major service centers ensure that quality of service is maintained and that end-user expectations are met or exceeded.

consistent quality
Our service management improves the operational efficiency of your application environment through specialized personal support, and delivers global operational monitoring, detailed analysis and monthly reporting. To guarantee the best possible service, we create an application-based service level agreement so you get the performance you need. Covering all critical applications, the service level agreement directly addresses your performance requirements on response times and availability.

manage
• dashboards for ongoing performance improvements through QoS and web traffic prioritization
• service level management including monitoring, reporting and real-time fault management for improved control of application SLAs

optimize
The third phase of Business Acceleration approaches optimization from two angles - application and infrastructure. Leveraging techniques such as caching, compression and acceleration, we make your applications perform as efficiently as possible. This means lower response times and increased availability so your business can operate at full speed. Through consolidation, we simplify your infrastructure to reduce your operational costs and ease ongoing management. You gain more control of your environment while achieving consistently high service levels. We also protect your infrastructure from attacks and threats to ensure resiliency and availability. With server, storage and application management services, you can leave the day-to-day administration to us and focus your resources on growing your business.

[Figure: IT infrastructure optimization]

the result
Business Acceleration gives you an overall improvement in application experience backed by an application-level service level agreement. It maximizes your existing investment in applications and infrastructure, ensuring that employees are more productive and business goals are met.

why Orange Business Services?
As a global integrated operator, we can bring your business to a new level of performance. We can improve the visibility, management and performance of your applications with single-source convenience, no matter where your business takes you.

optimize
• compression to maximize bandwidth use and throughput
• caching at appropriate regional locations to minimize application latency
• consolidation of servers, applications and network equipment
http://www.orange-business.com/
Network-Wide Routing and Traffic Analysis

Packet Design Solutions: Packet Design's IP routing and traffic analysis solutions empower network management best practices in the world's largest and most critical enterprise, service provider and government networks running OSPF, IS-IS, BGP, EIGRP and RFC 2547bis MPLS VPNs, enabling network managers to maximize network assets, streamline network operations, and increase application and service uptime.
Route Explorer: Industry-Leading Route Analytics Solution
Optimize IP Networks with Route Explorer
• Gain visibility into the root cause of a significant percentage of application performance problems
• Prevent costly misconfigurations
• Ensure network resiliency
• Increase IT's accuracy, confidence and responsiveness
• Speed troubleshooting of the hardest IP problems
• Empower routing operations best practices
• Complement change control processes with real-time validation of routing behavior
Deployed in the world's largest IP networks: 250+ of the world's largest enterprises, service providers, government and military agencies, and educational institutions use Packet Design's route analytics technology to optimize their IP networks.
Overview of Route Explorer
Route Explorer works by passively monitoring the routing protocol exchanges (e.g. OSPF, EIGRP, IS-IS, BGP, RFC 2547bis MPLS VPNs) between routers on the network, then computing a real-time, network-wide topology that can be visualized, analyzed, and serve as the basis for actionable alerts and reports. This approach provides the most accurate, real-time view of how the network is directing traffic. Unstable routes and other anomalies – undetectable by SNMP-based management tools because they are not device-specific problems – are immediately visible.

As the network-wide topology is monitored and updated, Route Explorer records every routing event in a local data store. An animated historical playback feature lets the operator diagnose inconsistent and hard-to-detect problems by "rewinding" the network to a previous point in time. Histograms displaying past routing activity allow the network engineer to quickly go back to the time when a specific problem occurred, while letting them step through individual routing events to discover the root cause of the problem. Engineers can model failure scenarios and routing metric changes on the as-running network topology. Traps and alerts allow integration with existing network management solutions.

Route Explorer appears to the network simply as another router, though it forwards no traffic and is neither a bottleneck nor a failure point. Since it works by monitoring the routing control plane, it does not poll any devices and adds no overhead to the network. A single appliance can support any size IP network, no matter how large or how highly subdivided into separate areas.
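The core computation behind route analytics of this kind is shortest-path-first over the passively learned topology: given link-state costs, the analyzer can reproduce each router's forwarding view. The sketch below runs Dijkstra's algorithm over an invented four-router OSPF-style topology; it is illustrative only, not Packet Design's implementation.

```python
import heapq

# Invented topology: router -> {neighbor: link cost}, as if rebuilt from
# passively observed OSPF link-state advertisements.
LINKS = {
    "R1": {"R2": 10, "R3": 5},
    "R2": {"R1": 10, "R4": 1},
    "R3": {"R1": 5, "R4": 10},
    "R4": {"R2": 1, "R3": 10},
}

def spf(source):
    """Dijkstra shortest-path-first: cost from source to every router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in LINKS[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

print(spf("R1")["R4"])  # 11: R1 reaches R4 via R2, not the direct-looking R3 path
```

Rerunning SPF against a recorded topology snapshot is also what makes historical "rewind" and what-if metric modeling possible.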
[Figure: Route Explorer passively listens to routing protocol exchanges (IGP and BGP routing adjacencies, including a BGP route reflector) across AS 1 (OSPF), AS 2 (IS-IS), AS 3 (OSPF) and AS 4 (EIGRP) to provide visibility into network-wide routing across Autonomous Systems, areas, and protocols.]
Traffic Explorer: Network-Wide, Integrated Traffic and Route Analysis and Modeling Solution
Optimize IP Networks with Traffic Explorer
• Monitor critical traffic dynamics across all IP network links
• Operational planning and modeling based on real-time, network-wide routing and traffic intelligence
• IGP- and BGP-aware peering and transit analysis
• Visualize the impact of routing failures/changes on traffic
• Departmental traffic usage and accounting
• Network-wide capacity planning
• Enhance change control processes with real-time validation of routing and traffic behavior
Traffic Explorer Architecture
Traffic Explorer consists of three components:
• Flow Recorders: collect Netflow information gathered from key traffic source points and summarize traffic flows based on routable network addresses received from Route Explorer
• Flow Analyzer: aggregates summarized flow information from the Flow Recorders and calculates traffic distribution and link utilization across all routes and links on the network; stores a replayable traffic history
• Modeling Engine: provides a full suite of monitoring, alerting, analysis, and modeling capabilities
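The Flow Analyzer's core job, combining flow records with routing knowledge to compute per-link utilization, can be illustrated with a toy calculation. The prefixes, routers, and routed paths below are hypothetical; a real system would derive the paths from Route Explorer's live topology rather than a static table:

```python
def link_loads(flows, route_of):
    """Combine flow measurements with routing knowledge.
    flows: list of (src_prefix, dst_prefix, bits_per_sec)
    route_of: function (src, dst) -> ordered list of links [(a, b), ...]
    Returns bits/sec carried on each directed link."""
    loads = {}
    for src, dst, bps in flows:
        for link in route_of(src, dst):
            loads[link] = loads.get(link, 0) + bps
    return loads

# Hypothetical routed paths for two prefixes through a three-router core.
routes = {
    ("10.0.0.0/24", "10.1.0.0/24"): [("r1", "r2"), ("r2", "r3")],
    ("10.2.0.0/24", "10.1.0.0/24"): [("r2", "r3")],
}
flows = [("10.0.0.0/24", "10.1.0.0/24", 4_000_000),
         ("10.2.0.0/24", "10.1.0.0/24", 1_000_000)]

loads = link_loads(flows, lambda s, d: routes[(s, d)])
assert loads[("r2", "r3")] == 5_000_000   # both flows share this link
assert loads[("r1", "r2")] == 4_000_000
```

The point of the exercise is the one the text makes: a flow collector alone sees 4 Mbit/s and 1 Mbit/s, but only routing awareness reveals that both loads converge on the same core link.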
Traffic Explorer Applications
Data Center Migration Simulation and Analysis: Traffic Explorer ensures application performance by increasing the accuracy of network planning when moving server clusters between data centers, simulating and analyzing precisely how traffic patterns will change across the entire network and identifying the resulting congestion hot spots.
Disaster Recovery Planning: Traffic Explorer can simulate link failure scenarios and analyze the continuity of secondary routes and the utilization of secondary and network-wide links.
Forensic Troubleshooting: Traffic Explorer improves application performance by speeding troubleshooting with a complete routing and traffic forensic history.
Overview of Traffic Explorer
Traffic Explorer is the first solution to combine real-time, integrated routing and traffic monitoring and analysis with "what-if" modeling capabilities. Unlike previous traffic analysis tools that provide only localized, link-by-link traffic visibility, Traffic Explorer's knowledge of IP routing enables visibility into network-wide routing and traffic behavior. Powerful "what-if" modeling capabilities give network managers new options for optimizing network service delivery.
Traffic Explorer delivers the industry's only integrated analysis of network-wide routing and traffic dynamics. Standard reports and threshold-based alerts help engineers track significant routing and utilization changes in the network. An interactive topology map and deep, drill-down tabular views allow engineers to quickly perform root cause analysis of important network changes, including the routed path for any flow, the network-wide traffic impact of any routing change or failure, and the number of flows and hops affected. This information helps operators prioritize their response to the situations with the greatest impact on services.
Traffic Explorer also provides extensive "what-if" planning features to enhance ongoing network operations best practices. It lets engineers model changes on the "as running" network, using the actual routed topology and traffic loads. Engineers can simulate a broad range of changes, such as adding or failing routers, interfaces, and peerings; moving or changing prefixes; and adjusting IGP metrics, BGP policy configurations, link capacities, or traffic loads. Simulating the effect of these changes on the actual network results in faster, more accurate network operations and optimal use of existing assets, leading to reduced capital and operational costs and enhanced service delivery.
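The "what-if" modeling described here rests on recomputing routed paths over a modified copy of the topology. As a rough sketch (not the product's algorithm), the fragment below models an IGP as shortest-path routing via Dijkstra, fails a link, and observes how the path changes; the router names and metrics are invented:

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over {node: {neighbor: metric}}; returns the list of hops."""
    dist, prev, seen = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return list(reversed(path))

def fail_link(graph, a, b):
    """Return a copy of the topology with link a-b removed (both directions)."""
    g = {u: dict(nbrs) for u, nbrs in graph.items()}
    g[a].pop(b, None)
    g[b].pop(a, None)
    return g

topo = {"r1": {"r2": 1, "r3": 5},
        "r2": {"r1": 1, "r3": 1},
        "r3": {"r1": 5, "r2": 1}}

assert shortest_path(topo, "r1", "r3") == ["r1", "r2", "r3"]
# What-if: fail r2-r3 and see where the traffic re-routes.
assert shortest_path(fail_link(topo, "r2", "r3"), "r1", "r3") == ["r1", "r3"]
```

Combining this path recomputation with measured flow volumes is what lets a planner see not just that traffic re-routes, but which links absorb the displaced load.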
http://www.packetdesign.com Email:
[email protected]
[Figure: Traffic Explorer deployment. Flow Recorders collect Netflow data exported from routers at key traffic sources (e.g. data centers, Internet gateways, WAN links); Traffic Explorer computes traffic flows across the network topology using routing data from Route Explorer, and displays, reports, and enables modeling based on actual network-wide routing and traffic data.]
Next Generation Firewalls Providing visibility and control of users and applications
Introduction
IT administrators today face an application landscape that has evolved dramatically. End-user applications being installed on the network are designed specifically to act evasively, avoiding network detection and the associated security controls. Even well-meaning corporate applications use similar tactics to accelerate deployment, facilitate widespread access, and minimize disruption. IT administrators know that there are applications on their networks that their network security infrastructure cannot identify, and without application visibility it is not possible to effectively control traffic on the network.
The ramifications of this inability to identify and control the applications traversing the network range from benign to serious. Lost end-user productivity, bandwidth consumption, and PC performance degradation due to non-work-related processing are a few of the relatively benign ramifications administrators face. The more threatening ones include regulatory compliance failures, information leakage, and hackers seeking financial gain through the theft of personal information, passwords, and corporate data. IT administrators are managing as best they can with a patchwork of existing technologies. What's needed is a fresh approach to the firewall, one that takes an application-centric approach to traffic classification and brings policy-based application control back to the network security team.
A Fresh Approach to Network Security In order to keep pace with the evolving application landscape, administrators are coming to the stark realization that only a fresh approach will enable them to accurately identify and therefore control all application traffic flowing
in and out of the network. Palo Alto Networks is taking a new approach to build a solution for today's network security needs:
• The solution starts with network traffic classification that identifies the actual application irrespective of port, protocol, or evasive tactic. All traffic on all ports is classified in this way, providing application identification as a comprehensive visibility foundation for all security functions to leverage.
• Policy-based decryption, identification, and control of SSL traffic provides visibility into one of the largest blind spots on the network today. The policy controls enable gradual introduction of SSL decryption as well as granular enforcement of corporate policy.
• Graphical visualization and policy control of application usage. Simple, intuitive visualization tools provide visibility into the traffic currently on the network, helping to set appropriate application use policy. Application policy control includes allowing, blocking, controlling file transfers, marking for QoS, and inspecting traffic for viruses, spyware, and vulnerability exploits.
• Real-time protection from threats embedded in applications allows network-based threat prevention without impact to the user experience. Rather than using multiple threat prevention devices that often proxy file transfers to look for viruses and spyware, Palo Alto Networks uses a single, hardware-accelerated prevention engine supported by a common signature format to detect a wide range of malware and threats.
• Rounding out this fresh approach is a purpose-built, high-speed platform that makes it possible to provide visibility and control for all applications on all ports. The platform applies different processing technologies to specific functions, complemented by large amounts of RAM, to maintain multi-gigabit throughput and low latency even under load with all functions turned on.
Palo Alto Networks PA-4000 Series
Palo Alto Networks is taking a fresh approach to deliver a next-generation firewall that classifies traffic from an application-centric perspective, thereby enabling organizations to accurately identify and control applications flowing in and out of the network. The Palo Alto Networks PA-4000 Series brings new levels of application visibility, control, and protection to the enterprise firewall market. Based upon a new traffic classification technology called App-ID™, the PA-4000 Series can accurately identify which applications are flowing across the network, irrespective of protocol, port, SSL encryption, or evasive tactic employed. Armed with this in-depth knowledge, security administrators can regain control of their networks at the gateway to achieve the following business benefits:
• Mitigate risk through policy-based application usage control and threat detection
• Enable growth by embracing web-based applications in a controlled and secure manner
• Facilitate efficiency by minimizing the manpower associated with monitoring desktops and removing unwanted applications
The PA-4000 Series is a purpose-built, high-performance platform with dedicated processing for management, traffic classification, and threat mitigation, allowing it to meet the performance demands of protecting a high-speed network. The result is a solution that can help mitigate today's emerging security risks through tighter control of the application traffic traversing the network.
Application Identification
At the heart of the PA-4000 Series is an application-centric classification technology called App-ID. Unlike traditional security approaches that rely solely on protocol and port, App-ID is an industry first, using up to four traffic classification techniques to analyze the actual session data and identify the application, even applications that use random ports, tunnel inside or emulate other applications, or use SSL encryption. With the resulting visibility into the actual identity of the application, customers can deploy policy-based application usage control for both inbound and outbound network traffic. The four traffic classification mechanisms in App-ID are:
• Application signatures: application context-aware pattern matching designed to look for the unique properties and information exchanges of applications in order to correctly identify them, regardless of the protocol and port being used.
• Application decoding: a powerful engine that continuously decodes application traffic to identify the more evasive applications as well as create the foundation for accurate threat prevention.
• SSL decryption: decrypts outbound SSL traffic using a forward SSL proxy to identify and control the traffic inside before re-encrypting it to its destination.
• Protocol/port: helps narrow the application identification process, but is primarily used to control which ports applications are allowed to use.
Figure 1: App-ID uses four traffic classification techniques to accurately identify the application, irrespective of port or protocol.
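App-ID's actual mechanics are proprietary, but the layered idea the text describes, payload signatures first with port used only as a fallback hint, can be sketched with a toy classifier. The signature patterns and port table below are hypothetical simplifications, not real App-ID content:

```python
# Hypothetical signature table: byte patterns that identify an application
# regardless of the port the session runs on.
SIGNATURES = {
    b"BitTorrent protocol": "bittorrent",
    b"SSH-2.0": "ssh",
    b"GET ": "http",
}

WELL_KNOWN_PORTS = {22: "ssh", 80: "http"}

def classify(payload: bytes, port: int) -> str:
    """Layered classification in the spirit of the text: try payload
    signatures first, fall back to the port table only as a hint."""
    for pattern, app in SIGNATURES.items():
        if pattern in payload[:64]:
            return app
    return WELL_KNOWN_PORTS.get(port, "unknown")

# An HTTP request on a non-standard port is still identified as HTTP...
assert classify(b"GET /index.html HTTP/1.1", 8081) == "http"
# ...while a BitTorrent handshake on port 80 is not mistaken for web traffic.
assert classify(b"\x13BitTorrent protocol", 80) == "bittorrent"
```

The reversal of precedence is the key design choice: a port-based firewall would classify both examples above by port and get both wrong.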
The application-centric nature of App-ID means that it can not only identify and control traditional applications such as HTTP, FTP, and SNMP, but also accurately delineate specific instances of IM (AIM, Yahoo! IM, Meebo, etc.), webmail (Yahoo! Mail, Gmail, Hotmail, etc.), peer-to-peer (BitTorrent, eMule, Neonet, etc.), and other applications commonly found on enterprise networks. Once the application is identified and decoded using App-ID, the traffic can be more tightly controlled through security policies.
User Visibility
Through transparent integration with Microsoft Active Directory (AD), both ACC and App-Scope display who is using each application based on identity from Active Directory as well as IP address. Positive identification of which actual user is using specific applications is key to providing visibility into application usage on the network, and subsequently to creating an appropriate security policy based on actual users and user groups. In addition to being displayed in ACC and App-Scope, user identity is also accessible in the policy editor, logging, and reporting, giving administrators a consistent view of network activity.
Policy and Configuration Control
With increased visibility comes the ability to deploy policies for more granular control over traffic traversing the network. ACC allows a security team to analyze the collected data and make informed security policy decisions, which can then be implemented using the intuitive management interface. From the familiar rule-base editor, an application usage control policy can be created, reviewed, and deployed. Administrators can pick and choose from over 500 applications, dynamically updated and listed by their commonly used names. Alternatively, application control can be implemented based on the 16 different application categories, which are dynamically updated as new applications are added to the Palo Alto Networks update service.
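A rule base of the kind described above is read top to bottom until the first match. The sketch below is not PAN-OS policy syntax, just a generic first-match rule evaluator with hypothetical groups and applications, to show how user identity and application identity combine in a single policy decision:

```python
from dataclasses import dataclass

@dataclass
class Rule:
    # One row in a hypothetical application-usage rule base.
    user_group: str     # "*" matches any group
    application: str    # application or category name, "*" for any
    action: str         # "allow", "block", "allow-scan", ...

def evaluate(rules, user_group, application):
    """First-match evaluation, the way a firewall rule base is read
    top to bottom; an implicit default-deny applies if nothing matches."""
    for r in rules:
        if r.user_group in ("*", user_group) and r.application in ("*", application):
            return r.action
    return "block"  # implicit default deny

rulebase = [
    Rule("engineering", "ssh", "allow"),
    Rule("*", "webmail", "block"),
    Rule("*", "http", "allow-scan"),   # allow but inspect for threats
    Rule("*", "*", "block"),
]

assert evaluate(rulebase, "engineering", "ssh") == "allow"
assert evaluate(rulebase, "sales", "webmail") == "block"
assert evaluate(rulebase, "sales", "http") == "allow-scan"
```

Note how rule order carries meaning: the engineering SSH exception must sit above the catch-all block, which is why rule-base editors present rules as an ordered list.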
Conclusion
The Palo Alto Networks PA-4000 Series brings welcome relief to security teams struggling to gain control of, and protect the network from, new threats borne by the next generation of applications, both personal and business, that are specifically designed to evade today's port-based security offerings. With its fresh, from-the-ground-up approach, the Palo Alto Networks PA-4000 Series accurately identifies applications irrespective of the protocol or port they may use for communications. Once applications are accurately identified, appropriate security policies can be implemented to enforce application usage rights, and any traffic that is ultimately allowed onto the network can be inspected more completely for all manner of malware.
Palo Alto Networks 2130 Gold Street Alviso, CA 95002 408.786.0001 www.paloaltonetworks.com Copyright 2007, Palo Alto Networks, Inc. All rights reserved. Palo Alto Networks, the Palo Alto Networks Logo, PAN-OS, FlashMatch, App-ID and Panorama are trademarks of Palo Alto Networks, Inc. in the United States. All specifications are subject to change without notice. Palo Alto Networks assumes no responsibility for any inaccuracies in this document or for any obligation to update information in this document. Palo Alto Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
The CIO’s new guide to design of global IT infrastructure:
Five Principles Driving Radical Redesign
Technology has enabled businesses to become highly distributed. Headquarters isn't where all of the action is. With about two-thirds of the workforce operating in locations other than HQ, and an estimated 450 million mobile workers around the world [1], businesses now operate everywhere, all the time. The ability to take advantage of business opportunities, people, and resources in previously distant markets has created a vast new set of challenges that, taken together, present a difficult dilemma to a CEO or CIO: continue to deliver acceptable IT services by throwing money, bandwidth, and infrastructure at the problem? Save money by consolidating at the expense of end users? Or use IT to drive new business initiatives? Is it possible to do all of these? How does a business in today's global marketplace bring the world closer?
This paper explores the business imperatives driving enterprise IT design today, then presents five key principles CIOs are using to redesign business infrastructure at companies of all sizes. Finally, it discusses the importance of wide-area data services (WDS) solutions and explains how they can help to cohesively tie together distributed and highly mobile organizations.
The Business Imperative
Before considering an information technology strategy for a globalized world, it is valuable to understand the fundamental trends that are pushing businesses to redesign their operations around a small set of broad-based imperatives.
1. Flexibility. Businesses that operate across traditional borders must be able to respond to opportunities and challenges faster than ever before. To compete, a business has to be faster to deliver a product or service as good as, or better than, that of potentially any other company in the world.
2. Simplicity. Less has always been more for enterprises, because more technology has typically meant more complexity. While the per-unit cost of technology keeps decreasing, in aggregate companies see costs increase. Smart CIOs are investing in technologies like continuous data protection, virtualization, and wireless connectivity to help IT slim down its footprint while increasing their business's competitive advantages. The IT team is thus typically in a difficult position, assessing where to cut costs while still moving forward with a plan to continually enhance IT services to the business.
3. Security. With the growing importance of digital applications and data, the sources of threats to enterprise data have multiplied dramatically. While businesses do everything they can to stop threats in the first place, they must still be prepared to recover from them as quickly as possible.
4. Continuity. As businesses have expanded, anytime, anywhere application access has become a requirement. At the same time, "follow the sun" (global 24/7) operations have shrinking maintenance windows and need applications to be running at all times. Delay or loss of data for any reason, whether system failure or natural disaster, has a domino-like effect across the entire organization, at any time of the day or night.
[1] IDC, Worldwide Mobile Worker Population 2005-2009 Forecast and Analysis
Redefining the Enterprise Workplace
Historically, decision-making power was concentrated at headquarters. As a result, IT infrastructure development
mostly focused on that location. Data centers were routinely housed as close to headquarters as possible, where most employees worked, and "remote" workers were often relegated to small, disconnected islands of branch offices. Typically these were sales representatives who only needed to receive information from headquarters. The major decisions, including the tools and data to make them, were essentially in one place. Mobile workers were almost an anomaly: those traveling among offices were simply out of touch, with no ability to access applications and data, and few decision-making requirements while on the road.
Today's enterprise looks significantly different. Location, whether headquarters, branch office, home office, or no office, simply doesn't matter anymore. While data lives in a data center, it can be used by anyone, anywhere. Decision-making has become significantly more decentralized, with mobile workers and branch office employees making critical decisions on a regular basis. The cross-functional nature of the distributed workforce significantly changes how a business needs to resource and support both the branch office and the mobile worker. Many organizations have made a 180-degree change in how they view employees who work outside of headquarters: before, decisions were handed down to them; now, remote workers' decisions help define the path of the organization, making it impossible for the CIO to develop an IT strategy without accounting for a distributed workforce, and a mobile one as well. In fact, a recent Forrester survey notes that 80% of businesses are trying to set a strategy and policies for mobility in their organizations this year. With such a pressing need to redesign IT strategies around global, follow-the-sun business practices, how does the CIO begin to sketch out the path forward?
The Principles of the Global Workforce There are five key principles that CIOs must take into account when thinking about how business is changing today.
1. Distance doesn’t matter. Employees now expect to be able to collaborate in real-time with any coworker. They expect to have access to whatever data or services the company offers no matter where they happen to be. IT must be an enabler for the way business needs to operate. Waiting 20 minutes for a file to be sent between workers – even if they are across the world from each other – is no longer acceptable for the employee or for the customer project that they are working on. 2. Business never stops. With a globalized workforce – and a rapidly globalizing customer base – businesses cannot afford their operations to be stopped for even a few minutes. Issues like hurricanes or a flu pandemic might force workers to operate from home for an unspecified period of time. Compromised data centers may require enterprise to rapidly switch operations to secondary locations with no loss in information. 3. Applications and data must be available everywhere but be all in one place. Consolidating data makes it easier to track, protect, and restore. CIOs are demanding that data be brought back from remote offices. At the same time, businesses recognize that the data and applications were “out there” for a reason – that’s where they needed to be accessed. So while consolidation is an important strategy for data protection and cost control, it can negatively impact business operations unless LANlike performance can be maintained everywhere. 4. Knowledge must be harnessed – and data must be managed. Consolidation goes a long way toward eliminating the ‘islands’ of storage and data that evolve over time. But with organizations being required to react quickly in the face of change, or move in order to take advantage of an opportunity, flexibility in moving data and applications is essential. CIOs must be able to quickly move massive amounts of data, and potentially set up application infrastructure in remote locations overnight. New offices and merged/acquired businesses must
quickly be absorbed into the fabric of the existing organization by providing them immediate access to new systems and data. 5. There are no second-class enterprise citizens. The days of the “important” people working at corporate HQ are rapidly fading. Employees everywhere are now empowered to make important decisions. CIOs and IT managers may no longer prioritize workers based on their geographic location. Every member of the enterprise needs to have access to the same applications, at the same level of application performance.
Wide-area data services: bringing the distributed enterprise closer
No single technology can allow a CIO to accomplish these large goals for an enterprise, regardless of size. Applications, storage, networking, and security will all play into the mix. But no matter which of these technologies, and which vendors, are chosen, one thing remains certain: a wide-area data services (WDS) solution is the fabric that can tie all of these elements together. WDS solutions combine the benefits of WAN optimization and application acceleration to allow workers anywhere to feel as if the files, applications, and data they need are always local.
The impact of WDS is tangible across a range of different IT projects. WDS solutions allow individuals to collaborate more easily; IT to complete tasks like backup and consolidation more quickly and effectively; and virtualized infrastructures to live anywhere and migrate at faster speeds than ever before. WDS solutions are architected, and should be evaluated, with three characteristics in mind: speed, scale, and simplicity. Moreover, once IT implements a WDS fabric, the way services are implemented can be dramatically simplified. What if distance were no longer an issue? How would that change the way document management systems, ERP systems, and backup systems are architected? The possibilities are endless. But to take full advantage of WDS in the enterprise, CIOs must choose a solution that can reach
out to the three key areas of the business: the branch office, the mobile worker, and the data center.
1. Branch Office
WDS solutions have typically been known for accelerating the branch office, and branch office acceleration forms the basis of a WDS fabric. With an effective branch WDS solution, CIOs can engage in meaningful consolidation projects that are de-risked by the fact that application performance will still be maintained. Employees who work in branch offices can more effectively share data with colleagues across the organization, without significant investment in bandwidth or infrastructure.
2. Mobile Worker
CIOs today have a strong focus on the mobile worker. It is the executives, engineers, and sales representatives on the move who are often responsible for bringing in new revenue and dealing with the customer in times of crisis. As such, it is essential that these employees have fast access to any and all of the corporate resources available to employees at the office. WDS solutions have a primary role in ensuring that users everywhere can access applications with LAN-like performance, even when accessing data over low-bandwidth Wi-Fi connections. The introduction of a mobile-user use case adds a number of requirements for any proposed WDS solution:
• Does the mobile solution provide the same level of acceleration to mobile workers as to branch offices?
• Is the WDS solution architected so that the mobile accelerator connects directly to the existing appliance solution?
• Can the appliances effectively support potentially thousands of mobile workers?
• Does the mobile software use the same code base and functionality as the appliance solution?
IT-empowered mobile workers can also enable new and innovative work arrangements within an organization. For example, businesses that hope to expand to a new region often want to hire professionals in that region. Once in place, those workers can "source" work from other offices, collaborating
in real-time with colleagues on projects in other parts of the world.
3. Data Center
Application acceleration has a special place in the data center. Of course it must tie in to what is happening in the branch office, but a different set of challenges awaits in the sometimes massive amounts of data that must be managed among data centers. Massive backup and replication jobs are now a regular occurrence. Data center migration for storage and applications, moving virtual images of servers, and snapshots are becoming essential. As a result, many companies are regularly trying to move terabytes each day, in a window that is continually shrinking to support 24x7 operations. These requirements call for a WDS solution that can scale up to handle massive data transfers and can be clustered to handle the simultaneous load of inter-data-center transfers as well as data-center-to-branch transfers. Large-scale solutions between data centers must also handle varied conditions: high-bandwidth connections and high latency between DR centers. These conditions, plus those encountered in a branch office environment and those of a mobile user, comprise a wide set of conditions that require an intelligent, adaptable WDS solution.
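One core mechanism behind the WAN optimization half of WDS, reference-based data reduction, can be sketched in a few lines. This is a deliberate simplification (real appliances use variable-size segmentation, persistent disk-based stores, and protocol optimizations); here a shared in-memory cache stands in for the synchronized segment stores on the two ends of the WAN link:

```python
import hashlib

def dedup_send(data: bytes, cache: set, chunk: int = 64):
    """Split a byte stream into fixed-size segments; send a short reference
    for any segment the far side has already seen, the raw bytes otherwise.
    Returns (wire_messages, bytes_on_wire)."""
    wire, on_wire = [], 0
    for i in range(0, len(data), chunk):
        seg = data[i:i + chunk]
        digest = hashlib.sha256(seg).digest()
        if digest in cache:
            wire.append(("ref", digest))
            on_wire += len(digest)          # 32-byte reference
        else:
            cache.add(digest)
            wire.append(("raw", seg))
            on_wire += len(seg)
    return wire, on_wire

cache = set()
# Ten distinct 64-byte segments, 640 bytes in all.
doc = b"".join(bytes([i]) * 64 for i in range(10))
_, first = dedup_send(doc, cache)    # first transfer: every segment is raw
_, second = dedup_send(doc, cache)   # repeat transfer: references only
assert first == 640
assert second == 320                 # 10 segments x 32-byte references
```

The effect is why repeated backup and replication jobs shrink so dramatically once a WDS fabric is in place: only the segments that changed since the last transfer cross the WAN at full size.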
Conclusion The way that businesses operate is always changing. CIOs must be prepared to adapt their IT infrastructure in a way that supports distributed employees, anytime anywhere collaboration, and the need for business continuity in times of change or disaster. Using WDS solutions, CIOs now have a technology that can tie together their distributed enterprises. Mobile and branch office workers can have the same level of application performance as users at headquarters. Data centers are more protected, using WDS to respond faster in the event of disaster. Infrastructure can be consolidated without performance loss to far-off locations, yet the flexibility to move data and applications can be retained, often providing faster response than ever before. With WDS, CIOs now have a way to bring their distributed enterprise closer together.
www.riverbed.com
Shunra
Virtual Enterprise WAN emulation solutions for the entire application development lifecycle.
Shunra Virtual Enterprise Desktop
Shunra Virtual Enterprise Suite
Now you can quickly test your applications to ensure production network readiness. Simulate any wide-area network link to test applications under a variety of current and potential network conditions, directly from the desktop. A Windows-based product, Shunra Virtual Enterprise (VE) Desktop is a must-have tool for application developers and QA teams.
Shunra Virtual Enterprise (VE) is a highly robust, comprehensive network emulation solution that creates a virtual network environment in your performance and pre-deployment network lab. It delivers a powerful and accurate way to model complex network topologies and test the performance of your applications and network equipment under a wide variety of network impairments, exactly as if they were running in a real production environment. With this tool you can uncover and resolve production-related problems before rollout. Use production network behavior recorded with Shunra VE Network Catcher to capture real-world testing conditions in the lab.
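One reason WAN emulation matters is that TCP throughput collapses under latency and loss in ways LAN testing never reveals. The well-known Mathis et al. approximation makes the point numerically; this is a coarse back-of-the-envelope model, not Shunra's emulation engine, and the example link parameters are arbitrary:

```python
def mathis_throughput(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state TCP throughput (bits/sec) from the Mathis
    et al. model: rate ~ (MSS / RTT) * (C / sqrt(p)), with C ~ 1.22.
    Useful only to show how latency and loss dominate achievable rate."""
    C = 1.22
    return (mss_bytes * 8 / rtt_s) * (C / loss_rate ** 0.5)

lan = mathis_throughput(1460, 0.001, 0.0001)   # 1 ms RTT, 0.01% loss
wan = mathis_throughput(1460, 0.100, 0.01)     # 100 ms RTT, 1% loss

# The same application sees orders of magnitude less TCP throughput
# across the WAN than on the LAN where it was developed and tested.
assert wan < lan / 100
```

Emulating those RTT and loss figures in the lab, rather than discovering them at rollout, is precisely the gap products in this category aim to close.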
Learn more at www.shunra.com
Go to www.shunra.com for a free 30-day trial
Virtual Enterprise Desktop Professional v4.0 Now it’s easier than ever for multi-user enterprise network engineers and developers to test applications under a variety of current and potential network conditions— directly from the desktop. Shunra Virtual Enterprise (VE) Desktop Professional offers a methodology that integrates test plans, including business processes and network scenario files. The enhanced VE Desktop Server enables the administrator to plan, prepare, manage and distribute tests and reports. Additionally, it allows end-users to conduct tests based on these plans, experience the impact of impairments on transaction times, and upload test results to the VE Desktop Server.
Enhanced Functionality for Administrators
• Silent Install: allows enterprise deployment and installation by network administrators without end-user interaction
• Vista Support: all end-user desktops running Windows Vista are fully supported
Redesigned User Interface
• Redesigned Basic Mode GUI: enhanced, more user-friendly options deliver a much easier way of defining testing parameters. Parameters include traffic source and destination (by city), connection quality, and connection type.
Increased Testing Value
• Agent Support: provides an Application Programming Interface (API) that enables clients to be seamlessly integrated with automation and load tools
• Transaction Timer: lists the contents of business processes to guide users in conducting tests; enables timing of transactions and designation of pass or fail
• Packet List: enables the capture of packets during tests; saves data in .enc format for use by VE Analyzer or other packet analysis software
• VE Analyzer Integration: tests conducted using a Packet List to capture traffic can be analyzed using Shunra VE Analyzer or other packet analysis software
• Business Process Editor: enables the creation of business processes and the subsequent transactions that take place for each process
• Test Results Repository: a central repository for test results is available in the VE Desktop Server; test results including HTML reports, packet capture files (.enc), and Shunra .vet test result files can be uploaded for viewing and distribution
• Enhanced Professional Mode: integrates the use of Test Plans that include Business Processes (lists of transactions to be conducted) and Network Scenario Files (impairment parameters) to guide users in conducting tests
• VE Reporter: VE Desktop Client is installed with a basic version of VE Reporter that displays runtime and offline data in easy-to-read reports; reports can also be saved in HTML and MS Word format, and test results can be exported to Comma Separated Values (.csv)
• Test Data Upload to VE Desktop Server: test results can be uploaded from the VE Desktop Client to the VE Desktop Server central repository
Turn-key Managed Services Deliver Immediate Results. Now you choose how Shunra Virtual Enterprise is delivered! Let's face it: the need for our solutions doesn't always correspond neatly with a budget cycle. The flexibility to receive the value of our technology the way you require it is now more essential than ever.
The Dilemma Oftentimes our customers need to quickly add one or more VE appliances or software solutions to manage surges in their demand for testing projects. Other customers test applications prior to deployment using ad-hoc teams assembled temporarily to quickly certify an application’s readiness for production. Finally, many organizations that are looking at our technology for the first time are between budget cycles and don’t have the funds for a capital purchase, or the time to process the request through internal channels.
The Solution Shunra has responded by creating a Services team dedicated to delivering the benefits of our solutions when you require them. If you need the power of Virtual Enterprise to emulate and test your applications and networks on a temporary basis, we will deliver both the technology
and a skilled technician to work with your staff. The expenditure includes the onsite expertise as well as the short-term use of the Virtual Enterprise suite.
The Benefit When the project is completed the Shunra team exits, technology in hand, until future needs arise. No capital purchase, no learning curve, no more failed application rollouts.
Delivering
• Detailed, Actionable Reports and Analysis
• Best Practices and Documented Methodology
• Repository of Project Artifacts
• Expert Knowledge Transfer, Mentoring, and Training
• WAN Optimization Vendor Selection (with our ROI Metric)
• Application Performance Profiling/Production Readiness
• Package or Custom Application Roll-outs
• Data Center Moves/Server Consolidations
CONTACT SHUNRA VIRTUAL ENTERPRISE TODAY TO LEARN MORE! North America +1 877 474 8672 Europe and Africa +44 208 382 3757 Israel, APAC, Mediterranean +972 9 763 4228 www.shunra.com
High Performance Web 2.0 Applications
How enterprises can have their performance and new features too by automatically optimizing Microsoft ASP.NET applications in the network

A New Performance Challenge
IT staff supporting today's rich, highly interactive and dynamic web applications face new and difficult performance challenges. Web 1.0 was characterized by web sites designed to disperse information: one-way communication vehicles with largely static HTML pages and files (such as images and video) that, although large, changed rarely and were displayed exactly the same way each time a visitor requested the page. Today's Web 2.0 sites are full-blown applications, characterized by dynamic pages stitched together from database lookups and web services, whose content is regenerated every time a visitor reloads the page. Additional tools, such as AJAX and Microsoft's Silverlight, let developers deliver a more complex and "desktop-like" user experience.
Communication is two-way
Content generation has shifted from the majority of the content being built by the website owner to the bulk of the content being contributed by users. The dynamic web has transformed into a medium for sharing and collaboration, embracing a new approach to generating and distributing Web content. And as these new sites (really, web applications) add APIs, the communication extends to application-to-application communication.
Pages are dynamic
With this change from static web sites to dynamic web applications come new and difficult performance challenges. Traffic volumes and patterns become unpredictable and can spike without warning. Applications designed and deployed with infrastructure sized for an average load can experience order-of-magnitude traffic increases overnight.

Traditional solutions aren't enough
Ensuring performance and scalability for these new web applications requires new tools that can support dynamic content and adapt to changing usage patterns. Traditional solutions (adding servers or more network capacity and deploying traditional application acceleration devices) provide some gain, but dynamic Web 2.0 applications have inherent bottlenecks that cannot be removed by increasing server or network capacity. Traditional acceleration devices offload the server for static applications, but they lack the dynamic features needed to fully exploit browser caching, and the visibility into the application and its data lookups needed to offload work for dynamic applications.

Automatically optimize Microsoft ASP.NET
If you run web applications in a Microsoft environment, chances are they were developed using Microsoft's ASP.NET framework, which includes many features designed to support application performance and scalability, ranging from caching to session state. While these features can be used to improve application performance and scalability, taking advantage of them through coding can be complex, time-consuming and expensive, forcing your development organization or software vendor to trade off new features for improved performance. Delivering high performance Web 2.0 applications demands a new solution that takes advantage of these ASP.NET performance features, but does it automatically, optimizing the ASP.NET application in real time, in the network.
Strangeloop Accelerates Web 2.0 Applications The Strangeloop AS1000 Application Scaling appliance takes a new approach to application acceleration, automatically optimizing ASP.NET applications by implementing a series of performance optimizations in real time, in the network. The Strangeloop AS1000 takes over the difficult task of performance and scaling optimization, removing that burden from the developers. It gives network managers a way to meet performance challenges without adding more servers or handing the problem back to development.
Strangeloop AS1000 Appliance The Strangeloop AS1000 is a network appliance, easily deployed in line with the web servers, behind the load balancer. In this position, the AS1000 sees every request/response pair moving between the web servers and client browsers. By monitoring these pairs, the AS1000 can identify opportunities to optimize the ASP.NET application and implement specific optimization treatments as appropriate. The Strangeloop AS1000 relies on three key architectural components: the Choreography Engine, the Choreography Agents, and a series of Application Acceleration Treatments.
Choreography Engine
The Choreography Engine is the AS1000 appliance's software platform and provides a number of features necessary to monitor ASP.NET traffic, act on appropriate data, and maintain the system, including:
• A high-performance network cache for storing data relevant to the request/response cycle of the ASP.NET application
• Monitoring features for tracking request/response pairs, to measure performance and provide the data for optimization discovery
• Application-based cache expiry features that ensure that caches remain current
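Application-based cache expiry can be pictured as a cache whose entries are tied to the application data they depend on, so a data change expires exactly the affected pages rather than waiting for a timeout. The class below is a conceptual sketch under that assumption, not Strangeloop's implementation:

```python
class AppExpiryCache:
    """Cache whose entries are keyed to application data dependencies,
    so changing one piece of data expires only the pages built from it
    (a sketch of the idea, not Strangeloop's actual engine)."""
    def __init__(self):
        self._pages = {}   # url -> cached output
        self._deps = {}    # dependency key -> set of dependent urls

    def put(self, url, output, depends_on):
        self._pages[url] = output
        for key in depends_on:
            self._deps.setdefault(key, set()).add(url)

    def get(self, url):
        return self._pages.get(url)

    def invalidate(self, key):
        """Called when the application changes the data behind `key`."""
        for url in self._deps.pop(key, set()):
            self._pages.pop(url, None)

cache = AppExpiryCache()
cache.put("/catalog", "<html>...</html>", depends_on=["products"])
cache.invalidate("products")       # the product table changed
print(cache.get("/catalog"))       # → None (stale page was expired)
```

The point of the design is that unrelated cached pages survive an update, so the cache stays both current and warm.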
Choreography Agents Choreography Agents, in the form of ASP.NET Providers, communicate with the AS1000 to provide visibility into the application and provide additional feedback for monitoring and improving performance.
Application Acceleration Treatments
Application acceleration treatments are the heart of the Strangeloop AS1000 system. Treatments are implementations of coding optimization techniques. When a treatment is enabled on the AS1000 appliance and applied to a web application, it "treats" ASP.NET traffic to provide the real-time code optimization that yields performance and scalability improvements. The treatments create ASP.NET web pages similar in logical structure to hand-coded, more time-consuming manual implementations of optimized ASP.NET web pages, yet these optimizations can be added and removed on demand, in real time. The Strangeloop AS1000 appliance provides a series of treatments that deliver real-time optimization to improve the ASP.NET application's performance and scalability, including:
• Automatic Output Cache: caches server output on the AS1000 to reduce server workload and accelerate page views
• Automatic Web Services Cache: caches web services server output on the AS1000 to reduce server workload and accelerate page views
• Dynamic Browser Cache: makes caching resources such as images and JavaScript files on the browser a viable option by automatically learning when those files have changed and dynamically renaming them. Browser caching can reduce the number of round trips required to render a page by as much as 10 times, and also significantly reduces the amount of bandwidth used
• Automated ViewState Removal: removes ViewState from the outgoing response and re-inserts it into the returning requests, reducing bandwidth requirements and accelerating page views
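The dynamic renaming behind the browser-cache treatment is commonly done by embedding a content hash in each asset's filename: the URL changes whenever the bytes change, so browsers can cache aggressively without ever serving a stale file. A minimal sketch of that general technique (the function names here are invented for illustration, not Strangeloop's code):

```python
import hashlib

def fingerprint(filename, content):
    """Embed a hash of the file's bytes in its name, so the URL changes
    exactly when the content changes (safe long-lived browser caching)."""
    digest = hashlib.md5(content).hexdigest()[:8]
    name, dot, ext = filename.rpartition(".")
    return f"{name}.{digest}.{ext}" if dot else f"{filename}.{digest}"

def rewrite_page(html, assets):
    """Rewrite asset references in a page to their fingerprinted names,
    as an in-line appliance could do to each outgoing response."""
    for original, content in assets.items():
        html = html.replace(original, fingerprint(original, content))
    return html

page = '<img src="logo.png"><script src="app.js"></script>'
out = rewrite_page(page, {"logo.png": b"PNGBYTES", "app.js": b"alert(1)"})
print(out)
```

Because an unchanged file always hashes to the same name, repeat visitors fetch it from their local cache, which is the round-trip and bandwidth saving the treatment description claims.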
Deployment and Performance Management The Strangeloop AS1000 appliance is designed for simple, rapid installation and configuration. The Strangeloop Manager provides an easy-to-use GUI that walks users through setting up the appliance and configuring the application treatments to accelerate any ASP.NET application. Treatments can be applied on a per-application or per-URL basis.
Strangeloop Manager
The Strangeloop Manager also provides access to Strangeloop Analytics reporting features.

Strangeloop Analytics: Browser Load Time

"With ASP.NET and Strangeloop, organizations can quickly deliver richer user experiences featuring increased performance and improved bandwidth utilization." - Keith Smith, Developer Tools Team, Microsoft Corp

Conclusion
ASP.NET sites that serve more than a few dozen simultaneous users or perform complex tasks inevitably face performance challenges. Using the Strangeloop AS1000 appliance, you can avoid the tradeoff between new features and performance. The Strangeloop AS1000 appliance takes over the difficult task of performance and scaling optimization, removing that burden from the developers. It gives network managers a way to meet performance challenges without adding more gear or handing the problem back to development.

Development can focus on new features, and can still use the time- and effort-saving features of the ASP.NET Framework and third-party controls that make it easier to bring new features and applications to market faster. The Strangeloop AS1000 appliance reduces the need for time-consuming performance optimization and reduces the ongoing cost of delivering and supporting ASP.NET applications. And the AS1000's deep understanding of ASP.NET behavior allows for dynamic tuning as the application evolves. With the AS1000 in your network, you can keep up with the dynamic nature of today's Web 2.0 applications in a way that's simply not possible when you rely on traditional application acceleration appliances or hand-coding for performance.
http://www.strangeloopnetworks.com/