ANTHONY PETTA MARY HUNT MICHAEL PATRICK DANIELS
SAP IN THE CLOUD
SECURITY ESSENTIALS

Best practices to maintain a secure environment for your SAP® solutions in the cloud
insiderBOOKS is a line of interactive eBooks from the publisher of SAPinsider. With insiderBOOKS, you can evolve your learning and gain access to the very latest SAP educational content.

The team at insiderBOOKS has created a brand-new solution for the digital age of learning. The insiderBOOKS platform offers affordable (and, in many cases, free!) eBooks that are updated and released chapter by chapter to ensure that you’re always reading the freshest content about the latest SAP technologies. Each eBook provides more than what you can get from the typical textbook—including high-res images, cloud-based material (access your content anywhere!), videos, media, and much more. Plus, thanks to our generous sponsors, you get the opportunity to qualify for free eBooks.

insiderBOOKS is Reading Redefined. Please visit our website for additional information: www.insider-books.com
Anthony Petta, Mary Hunt, and Michael Patrick Daniels
SAP in the Cloud: Security Essentials
From the publisher of SAPinsider
SAP in the Cloud: Security Essentials
by Anthony Petta, Mary Hunt, and Michael Patrick Daniels

Copyright © 2016 Wellesley Information Services. All rights reserved. No portion of this book may be reproduced without written consent. Printed in the United States of America. First edition: December 2016. ISBN: 978-0-9974291-3-8
Published by Wellesley Information Services, LLC (WIS), 20 Carematrix Drive, Dedham, MA, USA 02026.

Publisher: Melanie A. Obeid
Product Director: Jon Kent
Managing Editor: John Palmer
Acquisitions Editor: Jawahara Saidullah
Editor: Andrea Haynes
Copyeditor: Nicole D'Angelo
Art Director, Cover Designer: Jill Myers
Production Artist: Kelly Eary
Production Director: Randi Swartz
Although Wellesley Information Services uses reasonable care to produce insiderBOOKS, we cannot assume any liability for this book’s contents. In no event will Wellesley Information Services be liable for costs or damages, direct or indirect, including special, remote, or consequential damages, even if Wellesley Information Services has been advised of the possibility of such damages, from any cause, whether in contract, tort, or strict liability.
SAP, SAP ERP, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names mentioned are the trademarks of their respective holders. Wellesley Information Services is neither owned nor controlled by SAP SE or any of the SAP SE group of companies. insiderBOOKS are published independently of SAP SE.
About the Authors

Tony Petta is the ISO Certification Program Manager and Compliance Team Offering Lead for IBM’s Cloud Managed Services. Over his 19 years of experience in IT, he has held positions as a Security Consultant and Internal Auditor, and has also sold security services and worked in sales support roles.

Mary Hunt has been a Cloud Technical Business Development Executive with IBM since 2012, with a focus on SAP and SoftLayer. In this role, she has supported IBM Sales with her SAP technical knowledge, working as both a salesperson and a sales engineer, and has executed deals of up to $6 million in revenue for IBM Cloud for SAP Applications. Before moving to the IBM Cloud Sales organization, she led technical teams for IBM on SAP implementations for financial and industrial customers. She is an IBM-certified Technical Seller in SoftLayer and ERP in the cloud, and is also a certified SAP Technical Consultant.

Michael P. Daniels is an Executive Security Architect with IBM's Cloud Managed Services. Mr. Daniels is a services-focused security professional with an extensive background in the security and compliance design, architecture, and operations of IBM cloud platforms and services since 2007. During his 17 years at IBM, he has held security positions throughout the company, including in IBM Global Technology Services, IBM Software Group, and the IBM Cloud Advanced Innovation Labs.
FOREWORD
About This Book

This book, “SAP in the Cloud: Security Essentials,” focuses on security topics associated with cloud computing in the SAP space, with an emphasis on IBM offerings. It covers areas such as security standards, network security, firewalls, infrastructure security, VLANs, access controls, virtual machine security, hypervisors, equipment redundancy, encryption, ID management, single sign-on, risk assessments, reporting, backups and restores, and the destruction of media and hardware.
Table of Contents

Chapter 1
On Premise vs. The Cloud........................... 13
Types of Clouds ................................... 16
    Public Clouds ................................. 17
    Private Clouds ................................ 18
    Hybrid Clouds ................................. 19
Cloud Computing Layers ............................ 19
Moving Your SAP Solution to the Cloud ............. 20
Security Challenges in the Cloud .................. 22
Conclusion ........................................ 24
Chapter 2
Risk Management and Security Standards ............ 25
A Risk Management Approach ........................ 26
Identifying Risks ................................. 27
Security Standards ................................ 32
    SOC 1 ......................................... 32
    SOC 2 ......................................... 33
    SOC 3 ......................................... 34
    ISO ........................................... 34
Regulatory Frameworks ............................. 35
    PCI DSS ....................................... 36
    HIPAA and HITECH .............................. 36
    European Union (EU) Data Protection Directives  37
    Other Governmental Regulations and Standards .. 38
Conclusion ........................................ 38
Chapter 3
Physical Security........................................ 39
The OSI Network Model ............................. 39
Physical Threats .................................. 41
Disaster Protection ............................... 42
    Paint ......................................... 42
    Pipe .......................................... 45
    Power ......................................... 45
Business Continuity ............................... 46
Building Access ................................... 48
    Perimeter Defenses ............................ 48
    Getting Inside ................................ 49
Human Resources ................................... 50
    Hiring and Onboarding ......................... 51
    Training ...................................... 51
    Exit Procedures ............................... 52
Conclusion ........................................ 53
Chapter 4
Network Security .................................. 55
Threats and Vulnerabilities ....................... 56
Considering Network Risks ......................... 57
Going on Defense .................................. 58
Creating a Defensive Infrastructure ............... 60
Firewalls ......................................... 61
Network Segmentation .............................. 62
Security Zoning ................................... 64
Identifying Threats ............................... 66
Your Defensive Responsibilities ................... 67
How Defenses Prevent Threats ...................... 68
    Hackers and Cyber Attacks ..................... 68
    Malware ....................................... 69
    Denial of Service Attacks ..................... 70
    Misconfigurations ............................. 71
    Injection ..................................... 71
Conclusion ........................................ 72
Chapter 5
Hypervisor Security....................................73
Types of Virtualization ........................... 74
How Hypervisors Work .............................. 76
Threats ........................................... 78
Isolation ......................................... 79
Introspection ..................................... 81
Controlling Resource Usage ........................ 82
    SAP Tools for Resource Management ............. 84
How Hypervisors Prevent Threats ................... 85
    VM Escape ..................................... 86
    Breaking Isolation ............................ 87
    Resource Starvation ........................... 88
    Accessing Hypervisor Interfaces ............... 88
    Device Emulation .............................. 89
Conclusion ........................................ 90
Chapter 6
Encryption ........................................ 91
What Is Encryption? ............................... 91
Encrypting Data at Rest ........................... 94
    Advanced Encryption Standard (AES) ............ 95
    RSA ........................................... 96
Encrypting Data in Motion ......................... 97
    SSL/TLS ....................................... 98
    IPsec ......................................... 100
Key Management .................................... 101
Encryption Within SAP ............................. 103
Implementing Additional Encryption ................ 106
Conclusion ........................................ 108
Chapter 7
User Access Controls................................109
Users and Roles in SAP ............................ 110
    Privileges .................................... 110
    Roles ......................................... 110
    Users ......................................... 114
User Interfaces to SAP ............................ 115
    SAP GUI ....................................... 115
    SAP Business Client for Desktop ............... 116
    SAP Fiori Launchpad ........................... 116
    SAP Enterprise Portal ......................... 117
Authentication .................................... 118
    User and Password ............................. 119
    Single Sign On ................................ 122
    Two-Factor Authentication ..................... 126
Auditing User Actions ............................. 127
    Audit Policy .................................. 128
    Using Audit Trails ............................ 129
Governance, Risk, and Compliance (GRC) ............ 130
Conclusion ........................................ 131
Chapter 8
Software Updates .....................................133
Patches ........................................... 134
Identify .......................................... 135
Test .............................................. 136
Implement ......................................... 140
Conclusion ........................................ 142
Chapter 9
Data Destruction .................................. 143
Overwriting ....................................... 144
Degaussing ........................................ 146
Destroying Equipment .............................. 147
Backup Media ...................................... 150
Conclusion ........................................ 150
Chapter 10
Information Security Management ................... 151
Information Security Management Systems ........... 151
Physical and Environmental Security ............... 153
Access Control .................................... 154
Change Management ................................. 155
Incident Management ............................... 157
Vulnerability Management .......................... 158
Communications or Network Security ................ 159
Encryption ........................................ 160
Asset Management .................................. 161
System Maintenance and Development ................ 162
Human Resource Security ........................... 163
Compliance ........................................ 164
Third-Party Vendor Relationships .................. 165
Mobile Security ................................... 165
Disaster Recovery and Business Continuity Management 166
Security Organization ............................. 166
Security Policy Management ........................ 167
Conclusion ........................................ 167
Chapter 11
Risk Management ................................... 169
Identifying Risk .................................. 169
Assessing Risk .................................... 171
Managing Risk ..................................... 174
Conclusion ........................................ 177
Chapter 12
Complementary Services ............................ 179
Provider Optional/Third-Party Solutions ........... 180
    Preemptive Security ........................... 181
    Authentication ................................ 182
    Data Traffic Controls ......................... 183
    Event Detection and Response .................. 184
    Disaster Recovery Services .................... 185
Third-Party Risks ................................. 187
Conclusion ........................................ 189
CHAPTER 1
On Premise vs. The Cloud

Before we get into cloud security, we need a basic understanding of what a cloud deployment means for your IT environment. We’ll talk about the types of clouds and the layers of cloud computing that you can use, then finish with a discussion of what an SAP system can look like in a cloud environment and some of the threats that system can face.

By now, you’ve almost certainly been pitched the benefits of “the cloud.” Software providers everywhere throw around terms such as cloud-enabled, cloud storage, and cloud-powered. These terms are used pretty liberally, so before we dive into cloud security, let’s start by defining what we mean by the word “cloud.”

Note: This chapter contains a shortened discussion of what the cloud is. For a more detailed overview, see the first book in this series, SAP in the Cloud: An Executive Guide.
The term “cloud” usually refers to computing resources consumed in a pay-as-you-go service model instead of as a capital expense. The actual computing happens on a virtualized computer running on networked hardware. The virtual computer has access to a flexible pool of computing resources that can grow or shrink as needed. Cloud environments can exist as part of your wide-area network (WAN) or via a network connection, usually one secured with virtual private network (VPN) or multiprotocol label switching (MPLS) technologies to make it private to your organization.
When you compare an on-premise enterprise system to a cloud deployment, the primary difference is that most on-premise systems end up being single-purpose hardware solutions located in your organization’s data center. You determine the requirements you’ll need for whatever application you’re using, purchase hardware with enough capacity for a couple of years based on your best guess, and turn that over to your IT staff to set up for your application.

You get full control over your environment when your system is on your organization’s premises. You manage all security controls, restrict who has access to the physical machine, and determine all data preservation and recovery policies. For organizations with mature security policies, this can be a benefit. However, if your security isn’t on par with international standards, this can be a huge negative.

The price of that control can be steep. You just made a huge capital expenditure for a machine with fixed resources. You may have to deal with depreciation for the next few years. Plus, depending on your internal procurement and build procedures, acquisition and setup could take a while. If you need a solution now, you might be out of luck.

A cloud system gets you up and running faster for a smaller initial investment. Your provider assumes the responsibility for physical security, maintenance, and making sure enough resources are available. A cloud deployment takes advantage of economies of scale and a flexibility not available in a pure hardware solution, in that you can provision resources to your application as needed. You pay a monthly service charge for the amount of computing resources that you use and let somebody else account for depreciation.

Calling it “the cloud” can cause some confusion. A cloud is a shifting, vaguely defined form far away in the sky. Its exact location and scope are unclear and in flux. In contrast, the resources that make up a computing cloud are precise and known.
They can be shifted around to serve different applications as needed, but the resources themselves are part of physical hardware sitting in a data center somewhere. You can even create a cloud system using hardware that resides on premise, so the distinction between cloud and on premise isn’t a hard line. You can have private on-premise cloud environments within your data center. These terms aren’t mutually exclusive; cloud refers to an IT deployment, and on premise refers to the location of the server
machines. Most people think of an on-premise system as a single-purpose mainframe, but you can virtualize its resources to host multiple virtual machines. At IBM, our cloud platforms can use resources from data centers located all around the world, each with a set of server racks built to the same specifications (Figure 1.1). When you plug into this network, you can provision resources from any data center around the world, depending on which provides you the lowest latency. Or you can buy server space in a specific data center to use as you see fit.
Figure 1.1 IBM SoftLayer data centers
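To make the lowest-latency idea concrete, here is a minimal sketch of how a client might choose among data centers. This is a hypothetical illustration, not IBM's actual provisioning API: the data center names and test hosts are invented, and the "latency" measured is simply the time to complete a TCP handshake.

```python
import socket
import time

# Hypothetical endpoints; these names and hosts are illustrative only,
# not real IBM SoftLayer addresses.
DATA_CENTERS = {
    "dal09": "speedtest.dal09.example.com",
    "ams01": "speedtest.ams01.example.com",
    "tok02": "speedtest.tok02.example.com",
}

def measure_latency(host, port=443, timeout=2.0):
    """Time a TCP handshake to the host; return seconds, or None if unreachable."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None

def pick_lowest_latency(centers):
    """Return the name of the reachable data center with the lowest latency."""
    timings = {name: measure_latency(host) for name, host in centers.items()}
    reachable = {name: t for name, t in timings.items() if t is not None}
    return min(reachable, key=reachable.get) if reachable else None
```

In practice, a provider's own placement service would handle this for you, but the principle is the same: measure each candidate location, then pick the minimum.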
Note: A virtualized computer runs on software called a hypervisor. Back when IBM first developed virtualization technology in the 1960s and 1970s, mainframe operating systems ran on software called a supervisor. Virtualization ran multiple supervisors under a super-supervisor. What’s more super than super? Hyper! So the software that managed multiple virtualized supervisors became known as a hypervisor.
Many of the security challenges for cloud deployments come from their network connectivity. Numerous lines of data go in and out. Many of those challenges can apply to on-premise systems as well, but with on-premise systems, you can personally control which networks they connect to and how
they defend against intrusions. Many cloud systems, especially hosted solutions, take away some of the security controls you have. You can’t control physical access to the hardware or manage a disaster recovery policy.
Types of Clouds

When you put your IT environment “in the cloud,” it can mean multiple things. Cloud environments come in three forms:

• Public: A set of shared resources that can be spread across any geographical area. Multiple cloud users could be sharing the same physical hardware.

• Private: The physical hardware is not shared, but all computing resources are still automatically orchestrated and provisioned. You could have multiple applications running on your own cloud server while protecting it from potential data bleed from other users, especially if you host the cloud environment on premise.

• Hybrid: A partial cloud implementation that combines multiple deployment models. For example, you could have an on-premise system that uses cloud resources as a supplement in times of heavier workloads. Or you could split an application to access a more flexible storage solution while keeping all the application data in house.

Each of these has similar security challenges, though depending on the points of exposure, the scope and scale of those challenges can differ. We find that the vast majority of users exist in some sort of hybrid environment, sometimes without even realizing it. We’ll discuss that in more detail as we cover these three deployments.

All cloud systems reduce the amount of control you have over the IT environment. This could be good or bad. You may not have transparency on the host environment: its patch management, its access controls, its network security, and so on. You may not know what kind of physical controls or policies your provider has in place, who it lets into the server room, or where your poor, defenseless data sits on a tape drive on a shelf. It may not have very good fire suppression policies, network access redundancy, or disaster recovery plans.
When you pick a cloud provider, you need to trust it as much as you’d trust your own IT department. You can find providers who will give you the transparency you need and who will detail exactly what their security policies are with regard to all the questions above. Because cloud providers specialize in that environment and its challenges, you might find one that has better security than you do. Let’s talk about the various cloud types and the security challenges they present.

Public Clouds

Public clouds provide access to shared resources over public or private network connections. Customers share resources on the servers on a metered basis. You pay for the infrastructure, hardware, and software that you use, often in small increments. This could mean multiple customers have virtual machines running on the same physical hardware.

With your SAP solution, you may have some systems hosted in public clouds. For example, you may run non-production systems in an off-premise public cloud solution while keeping your production system in an on-premise private cloud. You could host the production system in the cloud as well, though that gives up some of your control over security and resources.

The first commercially available cloud servers were public clouds. IBM invented virtualization in the 1960s and 1970s, but it didn’t have the networking infrastructure to take advantage of it. Decades later, Amazon found that for the three months before Christmas, it had a huge spike in traffic. It needed lots of extra server power to handle it, but just for those three months. The rest of the year, those servers lay dormant. By creating virtual machines on those servers and renting them to the public, Amazon became the first to offer what we now call public cloud computing.

You may have a battle-tested security policy in your IT department based on international standards. However, once you introduce a public cloud into the mix, you rely on someone else’s policies and procedures.
Depending on the public cloud provider you select, you may not be able to monitor physical access to the hardware or apply your internal policies. It’s possible that you’ll be putting your data in an environment run by an IT department with less expertise than yours. Your SAP data is the lifeblood of your organization; you need a cloud provider who can secure it.
You also open yourself up to multitenancy issues. Because you and other cloud customers will likely use the same hardware resources, there is the possibility, however slight, that data could leak between virtual machines. Researchers have found whole new classes of exploits in shared spaces, including hackers reading data from what was supposed to be a fresh storage space and taking over other customers’ computing resources just by guessing their IP or MAC addresses.

Private Clouds

Traditionally, private clouds operate on hardware owned or leased by a single organization. Whether it’s managed by the owner as what’s called a bare metal server or by a third party, the physical infrastructure can be hosted either on premise or at an external location. Not all private clouds exist on their own hardware; there are shared private clouds, in which multiple virtualized computers exist on the same hardware but have a clear, logical separation. They’ll never access the same portions of the hardware—the same memory addresses, the same processors, the same disk drive blocks.

Private clouds have issues similar to those of any other networked computer, plus the issues that come from using a hypervisor and virtualization, which we’ll talk about later. But one of the more surprising risks with private clouds draws from the strengths of cloud environments. Because it’s so easy to get an application up and running and provision the necessary resources, almost anyone can do it. Unless you have strict controls, users could bypass the typical IT procurement and deployment processes. You can scale up the resources available to an application at the touch of a button. Sometimes it even happens automatically. This can lead to resource sprawl. Suddenly, the application that was only supposed to take up part of your private cloud is now monopolizing it.

Within your enterprise-based systems, capital expenditures, chargebacks, and other costs are well defined and can limit resource sprawl.
Within a private cloud, you might not have such clear checks and balances, so resources can be easy to occupy.
Hybrid Clouds

Hybrid clouds combine standard hardware with some sort of cloud environment. Your enterprise system may be sending transactions over to a private cloud server. Your on-premise system may use public cloud resources to handle computing overflow during peak usage without having to expend the capital needed to build a system big enough for maximum load. Or maybe you found a really great web application that you want to plug into your SAP implementation. You could even have a private cloud environment that uses public clouds for workload overflow. Those are all examples of hybrid clouds.

The hybrid approach, while it does have its conveniences, effectively doubles your potential issues. You have all the problems of your traditional enterprise install, maintaining and protecting it through your crack IT squad. But then you add all the problems of accessing a cloud environment. Any of the challenges of the previous two cloud types could apply to a hybrid cloud. Hybrid clouds carry transitive risks—the risk that your provider accepts becomes your risk. Their threats become your threats, and you may not have the same level of control over the defenses.
Cloud Computing Layers

There are three layers of cloud computing that build on each other. When someone talks about using the cloud, they mean one of the following layers:

Infrastructure-as-a-service (IaaS): Your computing infrastructure—the memory, the processing power, the disk storage, and other computing hardware—is provided as needed to the applications using it. Your computer becomes virtualized on the server, and a hypervisor emulates the physical hardware of a computer. Through this hypervisor, resources can be automatically increased or decreased depending on the needs of the software running on the virtualized computer. That computer can take up a small fraction of a single physical machine, or it can have its processing muscle distributed across multiple servers.

Platform-as-a-service (PaaS): The development platform—the operating system, programming language, computing resources, run-time environment, and source control—all become part of a pay-as-you-go service. This takes a lot of the sting out of startup costs and allows rapid
development of new custom solutions without having to build out a ton of infrastructure. At IBM, we provide a fully hosted SAP solution as a PaaS. We support all aspects of your SAP system and its databases, from the technical and implementation levels through business process validation. Any personalization or customization becomes your responsibility.

Software-as-a-service (SaaS): Your application—email reader, word processor, or video library—becomes a web service to which you navigate your browser when needed instead of an install package you load onto your local computer. SaaS applications allow consistent experiences across multiple machines and multiple users. Traditional local software can be replaced by a cloud-enabled SaaS version, such as Microsoft Office 365 and Adobe Creative Cloud, which provide the same software through an installed client but manage updates and access through a cloud service.

Notice a theme here? The cloud turns what were capital expenditures into services. It's a flexible pool of resources accessible by a network connection that can be quickly provisioned or released, depending on the needs of the application running on top. The cloud server can allocate these resources either autonomously as needed or through a manually managed interface. All these cloud layers can simplify your operation by removing hardware from your balance sheet, accelerating deployment of new builds, and expanding or contracting to fit your current needs. Someone else manages traditional IT services, such as hardware configuration, software updates, and other local maintenance. With a cloud solution, you get up and running quickly without the burden of up-front costs and complexity.
Moving Your SAP Solution to the Cloud

When you decide to move all or parts of your SAP solution to a cloud-based environment, you gain a lot of advantages. You gain a more flexible computing resource that scales in a virtually unlimited way to meet your business needs. For a business-critical application such as SAP, the time it takes to implement new application enhancements could go from the days or weeks it takes to get hardware up and running to a couple of hours to provision the resources for a new virtualized environment. SAP understands the power of the cloud, so it has been continually introducing products that take advantage of it. It has specific versions of
SAP HANA—SAP HANA Cloud Platform and SAP HANA Enterprise Cloud—that are engineered from the ground up to run on cloud environments. To support this, it has a growing SAP Cloud Appliance Library that holds pre-configured SAP solutions. This not only simplifies moving your SAP infrastructure to the cloud; it makes testing and configuring the system easier once you're there. For the cloud infrastructure itself, SAP introduced the partner-managed cloud, which allows existing trusted SAP partners to provide full implementations of the software on their own cloud services. This gives you cloud hosting from an SAP partner, someone who knows how to maximize your SAP experience. It usually includes full support, implementation and migration help, and even additional management software.

Note: As an SAP partner, IBM provides integrated implementation and hosting services on both SoftLayer and Cloud Managed Services (CMS). We have more than 30 years of experience with both SAP and cloud hosting. Of course, you should shop around to find the best fit for your business needs, but you can't beat our experience.
When considering your cloud environment, whether you outsource it or set it up in house, there are a few questions to ask of potential providers. These questions apply no matter what type of cloud you use, no matter where it's located.

• How can they guarantee performance? Your service level agreement will probably guarantee something like 99.9% uptime. But that number becomes meaningless if the system runs slowly. Find out where their data centers are located, what kind of network cables they have going in and out, and how they manage "noisy neighbors" who hog resources and bandwidth.

• Who's in charge of what? That is, who is responsible for security, for updates, for making sure everything runs smoothly? You may want more control, so the more access you get the better. However, there will always be shared responsibilities. Find out what they are before you sign any contracts.
• How does migration work? Migrating your SAP system can be a complex undertaking. It can be easier when your cloud provider has experience with the process. You may need additional handholding to manage the data transfer, configuration, and system stress that the process involves.

• What kind of reporting do I get? Depending on your provider, they may give you detailed uptime, performance, and usage reports, or they may give you overviews. You may need real-time visibility. But then again, real-time, in-depth reports may be too much. Negotiate exactly what sort of information you have access to and what they keep for themselves.

And, of course, you need to find out what kind of security policies they have.
Security Challenges in the Cloud

Some of the threats and vulnerabilities that an SAP solution has in the cloud come from its strengths. Your SAP system manages many business-critical data streams and integrates them with other applications. Multiple users can access various functions from points around the world. All this happens in real time through a series of transactions. The data that you interact with is incredibly valuable to your organization, and it can be accessed from almost anywhere.

Other threats come from the fact that a cloud environment is a virtualized computer running an operating system on one or more physical hardware devices connected by network cables. Each of those four items offers its own vulnerabilities, and each has its own defenses, as we'll cover throughout the remainder of this book. For now, let's mention the threats briefly so you have a sense of the larger picture before we dive into the details with each chapter.

An SAP system, no matter what version you use, passes data between an application layer and a database layer, as well as between the various attached components. Each of these data handoffs is a transaction. Your particular implementation may have multiple moving pieces, whether that's a Human Capital Management or Financial Management add-on, and those pieces will send transactions between themselves. All those pieces—your database, your application, and your add-ons—may not live
in the same IT environment. In that case, each transaction operates in an exposed environment. Connections and transactions can become pretty complicated. Those transactions are referred to as in-transit data: data passing from one networked location to another. Clever hackers can snatch that transaction data while it moves from point to point. When data gets to its destination, it becomes data at rest.

To get at this data, hackers can exploit vulnerabilities from a lot of angles. They can probe your network for weaknesses. They can forge or steal valid user credentials. They can physically access the server infrastructure. They can exploit flaws in the hypervisor. They can get in through known flaws in the operating system or database. However, you can configure defenses against these threats to minimize your risks.

In fact, one of the biggest threats to your SAP implementation in the cloud comes from system misconfiguration. Whether that's in your firewall, session management, key management, or transaction interfaces, misconfigurations can open your environment to threats that would otherwise have no chance. Configuration rule number one: change your default passwords and keys. Imagine that you bought a big reinforced steel door but locked it with a key that fits every door. The defense is there, but without the proper configuration, it can't do its job properly.

SAP offers a lot of security features, including user access control and a public key infrastructure that provides encryption and authentication. SAP provides a checklist of recommendations for a new SAP install at http://help.sap.com/hana/SAP_HANA_Security_Checklists_and_Recommendations_en.pdf. With a cloud environment, you may have less control over the configuration of your server environment. To be honest, that's part of what you are paying for—someone else who will manage server access, network security, patch management, hypervisor configuration, and more.
If you operate a smaller organization without a mature security policy, this could improve your overall security. When you make a cloud environment integral to your business functions, you share in its vulnerabilities. While you can’t control how the provider configures a server environment, you can learn to ask the right questions. With this book in your library, you’ll know exactly what questions to ask.
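Configuration rule number one, changing default passwords and keys, can even be checked automatically. Here is a minimal sketch of that idea; the credential list and the account structure are illustrative assumptions for this example, not an actual SAP default list or API:

```python
# Hypothetical sketch: flag accounts still using well-known factory
# credentials. The DEFAULT_CREDENTIALS set below is illustrative; the
# SAP* entry reflects a historically documented default in older SAP
# systems, shown here only as an example.

DEFAULT_CREDENTIALS = {
    ("SAP*", "06071992"),
    ("admin", "admin"),
    ("root", "changeme"),
}

def find_default_credentials(accounts):
    """Return the (user, password) pairs that match a known default."""
    return [(user, pwd) for user, pwd in accounts
            if (user, pwd) in DEFAULT_CREDENTIALS]

accounts = [("SAP*", "06071992"), ("jdoe", "S3cure!pass")]
flagged = find_default_credentials(accounts)
print(flagged)  # [('SAP*', '06071992')]
```

A check like this is no substitute for the vendor's own security checklist, but running it after every new install catches the door-locked-with-a-universal-key mistake before anyone else does.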
Conclusion

By now, you should have a better sense of what you're getting into with a cloud deployment. There are a lot of benefits, but there can be a lot of challenges as well. We've provided a general sense of what composes a cloud environment, which we'll expand on in great detail as we dig into what makes a cloud work, as well as some ways that it could break down and stop working.

In the next chapter, we'll introduce risk management and security standards. Risk management gives us a way to think about the business needs and threats that an organization faces so that when we design countermeasures and other ways to mitigate our risk of harm, we design them to fit existing needs. When we talk about security for a complex networked environment like a cloud, we build on the work that security researchers before us have put together. International standards are the result of that work and should be used as the foundation of any security program. We'll also cover some of the regulatory frameworks to which your organization may be subject.
CHAPTER 2
Risk Management and Security Standards

In this chapter, we'll provide an overview of both risk management techniques and security standards. Risk management will give you a context in which to understand the threat landscape that a cloud environment faces. Later chapters will build on the ideas we present here so that with every aspect of a cloud environment—the network, the hardware, the user controls—you'll be able to understand whether a cloud provider's standard setup meets your business needs or whether you need to add security controls to its offering. After that, we'll talk about security standards that your provider can certify to in order to prove its cloud environment secure. We'll cover the major ones—SOC and ISO—in depth while touching on some minor standards. Finally, we'll discuss some of the regulations that may apply to your data in a cloud environment.

When your organization's information security policies, standards, processes, technology, training, and people all work together to meet the operational needs of the organization, you achieve the appropriate level of security. When these are applied, you can effectively manage risk. So in order to talk about how we achieve good security, we need to put together a method of defining, identifying, and prioritizing risk. Fortunately, here at IBM, we've done a lot of work to create risk management strategies that you can apply to any IT deployment. It doesn't matter if it's an on-premise enterprise machine or a shared cloud provider; you can use our approach to determine what your risks are and start putting resources toward addressing them.
A Risk Management Approach

Information security is all about managing risk. Every business process has risk, and with cloud implementations, your risks involve anomalous flows in the system. That is, you run the risk of something not working the way it was supposed to, often under the direction of an external bad actor. In a secured system, there is a standard flow to user interaction:

1. A user or software component in an authorized role presents their credentials
2. To receive authorization and permission
3. To interact with or communicate with applications or other components
4. And access, modify, or distribute information assets.

The anomalous flow happens when any one of these breaks down:

1. An unknown actor forges or hijacks credentials
2. To circumvent authorization and permissions
3. To watch other information flows and components
4. And steal, compromise, or delete information assets

An anomalous flow can happen at any of these steps while including some expected behavior. The anomalous part could give the bad actor access to steps in the standard flow so that they use a flaw to gain access to standard flow processes.

In 2014, Home Depot suffered a data breach on its credit card payment systems. The thieves stole third-party vendor credentials that allowed them to install malware that scraped the devices' RAM and harvested credit card information in real time. The breach hurt Home Depot's reputation as consumers had their credit card details sold to the highest bidders. The breach cost nearly $20 million to settle the lawsuit.1

According to our anomalous flow list above, here's what happened:

1. Thieves hijacked credentials
2. That allowed them to legitimately receive authorization and permission

1 https://www.sans.org/reading-room/whitepapers/breaches/case-study-homedepot-data-breach-36367
3. To watch information flows in real time
4. And capture and steal information assets

Notice that the second step operates completely as expected. In a way, the first step did, too. The thieves found valid credentials, either through social engineering or another exploit. The real breakdown happened when they were able to install malware and monitor information on the device. The truth is that anomalous flows can be hard to spot. The normal flow follows a pattern; the anomalous one breaks it. They are the use cases that are not supposed to happen.

From a management perspective, the risk is incurring a loss. Whether it's a straight financial loss, a loss of proprietary or confidential information, or damage to existing assets, you want to minimize how easy a loss is to suffer and how much it hurts. For Home Depot, the loss was both in reputation and money. To plan for the possibility of loss, you need to understand both the system vulnerabilities and threats as well as how much the potential loss will cost you. Where the potential losses are the highest is where you spend your security dollars first. That is, you want to make the potential of an anomalous flow as near to impossible as you can. Any security architecture you implement has to be tied to business needs and available resources. You could design airtight security, covering every possible scenario and giving potential thieves no avenue of exploitation. But that'll drain your bank account and send you on a path to bankruptcy.

Identifying Risks

We use a simple three-step process to help our customers assess their risks:

• Identify high-impact threats: What will cause the most loss if a specific vulnerability is exploited to expose a specific asset? And what's the likelihood that a threat can exploit a vulnerability?

• Assess risks: For each of these high-impact threats, try to attach some sort of numeric value to it. It doesn't matter if that's in dollars, reputation or goodwill, violations of law, or whatever. Find a value and justify it.
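The standard-versus-anomalous flow can be sketched as a simple check. The step names, session shape, and credential store below are hypothetical illustrations of the idea, not a real detection system or any SAP or IBM API:

```python
# Illustrative sketch: model the four-step standard flow and flag
# sessions that break its pattern. All names here are hypothetical.

STANDARD_FLOW = ["present_credentials", "authorize", "interact", "access_assets"]

def is_anomalous(observed_steps, issued_credentials, credential_used):
    """A session is anomalous if it uses credentials we never issued,
    or if its steps deviate from the expected pattern."""
    if credential_used not in issued_credentials:
        return True  # forged credentials: step 1 breaks down
    return list(observed_steps) != STANDARD_FLOW  # skipped or reordered steps

issued = {"vendor-7731"}
print(is_anomalous(STANDARD_FLOW, issued, "vendor-7731"))   # False: normal session
print(is_anomalous(STANDARD_FLOW, issued, "stolen-token"))  # True: unissued credentials
```

Note what a check like this cannot catch: stolen-but-valid credentials, the Home Depot case, pass both tests, because every step looks legitimate. That is exactly why anomalous flows are hard to spot.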
• Define countermeasures: Start building your security architecture. Figure out concrete steps—such as infrastructure, software, and monitoring—that will directly counteract a given high-impact threat.

While the process may seem simple, getting it right is not. Figure out the real business impact of any given exposure. Create a list of assets and talk to the people who own them. They know what effect a compromise of that asset could cause. For example, if you operate a public company, protecting read access to your financial data might not be the best use of your resources, as much of that will be reported to the SEC anyway. On the other hand, something that seems more innocuous, such as internal sales channel bonus tiers, could have repercussions that only the person in charge of it can understand.

Here's how you can break down this process. Start by identifying your assets and asset owners. These are the things you want to protect and secure, whether they are data, physical objects, or information flows. Consider anything that provides value to your organization, such as preliminary discussions on new products, network hardware, and customer credit card information. Also consider the data assets that would cause harm to your organization if they fell into either public or malicious hands.

Next, take a look at your vulnerabilities and the threats that could exploit them. Your vulnerabilities are all the places where something could go wrong in your normal operating procedure—the anomalous flow. What's the likelihood that one of these vulnerabilities could be exploited? You can break down the concept of "likelihood" into four distinct factors. Rate each of these on a scale of one to five and take the highest individual score, not an average or sum, as your likelihood of a threat:

• Skill: How difficult is it to exploit this vulnerability? Can anyone with a laptop get in, or do they need a Ph.D. in computer science to crack it? Rank high-skill vulnerabilities as a one and low-skill ones as a five, because low-skill vulnerabilities are more dangerous.

• Ease of access: How easily can someone get to the vulnerable location? Do they have to hop laser tripwires and access a physical machine, or can they exploit it right from the front page of your website? Hard-to-access vulnerabilities are a one, easy-access ones are a five.

• Incentive: What kind of rewards do they get by exploiting the vulnerability? This can be related to the impact of a threat (discussed below), but not necessarily. Compromising an asset may provide a lot of reward to a hacker, whether financially, reputationally, or in the satisfaction of accessing a hard-to-reach asset. Big rewards are a five, little to no rewards are a one.

• Resources: What kind of equipment does your potential hacker need? Do they need a high-powered mainframe to exploit your vulnerability, or can they get by with the computers at the library? Rare and expensive equipment is a one, common and cheap gear is a five.

Think about who would want to break in and steal your data. Try to keep your conception of what a threat is at a very high level. If you start trying to identify every single possible threat, you could find yourself trying to boil the ocean; it's a massive task that you will never satisfactorily complete. If you can address big-picture problems, often you can find solutions that handle multiple threats.

Finally, estimate the impact it would have on your organization. Impact is somewhat subjective, but try to pin a number on it, from one to five. What's the cost of any incident that harms or exposes an asset? An impact of one is insignificant, hardly affecting you and your business. A five is catastrophic, almost an existential threat. Your business would take some time to recover from it, if it ever did.

Your risk is the likelihood of an incident happening combined with the severity of its impact on your business. Plot these on a grid, as shown in Figure 2.1. Anything in the top-right squares needs to be handled. By handled, we mean you need to take steps that move that threat down and left in the grid.
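The scoring scheme can be sketched in a few lines. The factor names and one-to-five scales come from the process described here; the numeric threshold that marks the "top-right" region of the grid is an illustrative assumption for this sketch, not a prescribed value:

```python
# Minimal sketch of the likelihood/impact scoring described above.
# The threshold defining the "handle it" region is an assumption.

def likelihood(skill, ease_of_access, incentive, resources):
    """Take the highest individual factor score, not an average or sum."""
    return max(skill, ease_of_access, incentive, resources)

def needs_handling(likelihood_score, impact, threshold=8):
    """Flag threats toward the top-right of the grid, where high
    likelihood and high impact together cross the chosen threshold."""
    return likelihood_score + impact >= threshold

# A low-skill, easily reached, well-rewarded vulnerability:
score = likelihood(skill=5, ease_of_access=4, incentive=5, resources=3)
print(score)                            # 5
print(needs_handling(score, impact=4))  # True
```

Taking the maximum rather than the average reflects the reasoning in the factor list: a vulnerability that scores a five on even one factor is dangerous regardless of how it scores on the others.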
Figure 2.1 An example of a threat matrix and the effect of countermeasures on the likelihood of a threat. (The matrix plots likelihood, from Rare to Almost certain, against impact, from Insignificant to Catastrophic; numbered arrows show countermeasures moving threats down and to the left.)
With risk, you have three options: accept it, mitigate it, or transfer it. You can combine all three. In fact, because you can never completely mitigate any given risk, you will likely end up accepting or transferring some amount of residual risk. Let's briefly cover what each of these options means to a business.

Accepting a risk means you take the full brunt of the loss if and when a vulnerability is exploited. This isn't always a bad option; sometimes the probability or value of a loss isn't that great. Convenience stores don't have much security around their candy aisle because if someone slips a chocolate bar into their pocket, the loss is pretty low.

Mitigating a risk means closing vulnerabilities, creating countermeasures, and minimizing exposure. The bulk of what we'll be talking about in this book is how to mitigate the risks around certain aspects of cloud deployments. This can be the most effective approach to eliminating risk, but it can also be expensive.
Transferring a risk means finding someone else to pay for it. In general, that means insuring your assets against loss. Sometimes, though, it means contractually defining risk. In the Home Depot breach, individual consumers did not assume any risk, as that's defined in their cardholder agreements.

When you are mitigating a threat, you are essentially decreasing the scores on the highest likelihood factors. Add extra security levels and increase the skill needed to break in. Make it harder to access a machine by putting it on a private network or behind a firewall. Add encryption so that any brute-force attacker needs a beefier system to run all possible keys before they die of old age. To move your threats left on the grid, you can take steps to lessen the impact through things such as redundant systems, frequent backups, mirroring, and system integrity controls. After countermeasures like these, your primary tool might be transferring the remaining risk. If the exposure of an asset would cause millions in lost sales and potential lawsuits, then get a policy on it and let your insurance broker shoulder some of the overall risk.

When designing security countermeasures, look for the most effective solution. Does it cover many possible anomalous flows? Is it within the organization's policy scope? And, always important, is it cost effective? Maybe an uncrackable encryption scheme protects your data 100 percent of the time, but it costs millions of dollars. If there's a much cheaper protection that handles almost all the threats, that's worth considering.

In the best-case scenario, you'd assess every risk and create countermeasures for them all. Here in the real world, we have limited time and money. Use the 80/20 rule: 80 percent of the problems come from 20 percent of the causes. Address the top 20 percent of your problems and you'll cover most of the issues your business will run into. How far you go should reflect your organization's appetite for and willingness to accept risk.

Look at your local bank. A given bank branch had about a 4 percent chance of being robbed in 2015. To protect themselves, banks could post a dozen security guards at their entrances, put their tellers behind bullet-proof glass, and require all sorts of identity checks on the way in. But they don't. That's because there's one less expensive technology that catches 93 percent of bank robbers: the video camera.
Security Standards

While the security landscape is always shifting, some very smart people have gotten together and created security standards. These standards lay out what any cloud system exposed to public and private networks should provide. The best of these can be certified or audited to verify that an organization meets the requirements of that standard. The problems start when you consider how many different standards have been created, sometimes to meet the security needs of very specific industries. Cloud providers have their own security policies, and customers have their own security, audit, compliance, and regulatory requirements. Sometimes these two overlap. To prove compliance, cloud providers could spend a ton of money getting audits for each customer. Instead, cloud providers now get a single audit report to measure how well they comply with certain globally accepted security and control standards. Certified public accounting (CPA) firms first developed security audits thanks to their experience auditing the financials of large, complex organizations. Through these audits, CPA firms were also the first to develop usable standards.

SOC 1

A Service Organization Control (SOC) 1 report measures the controls that an organization has over its financial information. It uses Statement on Standards for Attestation Engagements (SSAE) No. 16, developed by the American Institute of Certified Public Accountants (AICPA), as its standard. This replaced the older SAS 70 standard, also developed by the AICPA. There are two types of SOC 1 reports: Type I details controls in place at a point in time and their design, while Type II evaluates the effectiveness of controls over a period of no less than six consecutive months. As most SAP systems deal with auditable financial information, SOC 1 is a valuable measure of the controls you have within any cloud environment.
If your organization provides services that could affect another company’s financials, you may be asked to provide them with an SOC 1 report.
SOC 2

While SOC 1 reports will often show compliance with global security standards, that's not why they exist. SOC 2 reports measure the controls that protect the security, availability, integrity, confidentiality, and privacy of any non-financial information an organization processes. This report uses the AICPA's Trust Services Principles and Criteria (TSP) Section 100 as its standard. A SOC 2 report will be more useful to you in selecting cloud providers, as it covers more than just the controls on financial information. The audit that produces these reports challenges cloud providers with a comprehensive list of the security controls in each of these seven categories:

• Organization and management: How the organization functions to both support and assess the people who handle data

• Communications: How the organization communicates policies, obligations, and requirements to authorized users of the system to keep the system running smoothly

• Risk management and design and implementation of controls: How they identify, analyze, and respond to risks, as well as how they monitor this process

• Monitoring of controls: How they monitor their information system, including the effectiveness of the security controls in place, and how they address gaps in security

• Logical and physical access controls: How they limit who gets access to physical and virtual operations, manage incoming and outgoing users, and keep out unauthorized users

• System operations: How they make sure policies and procedures are followed, as well as how they spot and handle instances in which those policies aren't followed

• Change management: How they figure out what changes need to be made and make them in a controlled environment, as well as how they prevent unauthorized changes

Ask any potential cloud provider for its SOC 2 report. It'll give you a third-party-verified picture of the provider's security profile.
You can be confident in any information in this report, as the auditors have to get full visibility into the cloud environment to compile it. This report should be refreshed every year.
SOC 3 The SOC 3 report is prepared by the same auditor that puts the SOC 2 together, except that this report is intended for any interested party. It contains two sections from the SOC 2: the auditor’s opinion on the performance of the cloud provider’s controls, and a description of the system environment. You may even find a provider’s SOC 3 report posted publicly. However, any provider you consider should share the SOC 2 report with you once you sign an a nondisclosure agreement (NDA). ISO The International Standards Organization (ISO) created standards that will help you manage information security in your IT department and beyond. The standards are designed so that any organization can use them to manage their information security. There are four important standards for cloud implementations: • ISO 27001:2013: This details a specification for an information security management system (ISMS). With over 114 controls in 14 groups, it describes how your organization can manage risks. These controls aren’t mandatory steps you need to take in order to become secure; instead, it evaluates how well your organization’s ISMS handles risk. You can be certified ISO 27001; in fact, some clients and governments require it. You should require it, too. ISO 27001 certification is table stakes to play in the cloud provider game. Independent third-party auditors can certify an organization, valid for three years. However, yearly surveillance audits to assure that compliance continues are also required. • ISO 27017:2015: This gives guidelines for information security controls applicable to cloud services. Where ISO 27001 provides recommendations on how to implement an ISMS, this standard builds on those and adds new ones that focus on an ISMS for a cloud environment, either as a provider or consumer. While your provider can’t be certified to this standard, it can get a Certificate of Conformance. 
This demonstrates that an independent, third-party auditor looked at the security controls and found that they met the recommended practices. Like ISO 27001, this certificate is valid for three years and requires yearly surveillance audits.
• ISO 27018:2014: This establishes objectives, controls, and guidelines to protect personally identifiable information for any organization that processes it in a cloud environment. Like ISO 27017, it builds on ISO 27001 to add controls and guidelines for protecting personal information. You can get a Certificate of Conformance through an independent, third-party auditor, valid for three years with yearly surveillance audits to ensure continued compliance.

• ISO 22301:2012: Unlike the other standards, this one covers organizational and management policies instead of technological concerns. It details how to establish a business continuity management system (BCMS), which helps an organization's operations continue in the event of a disaster. Cloud customers expect continued access to their services, even in the event of earthquakes, floods, and other tragedies. This framework will help you or your provider create, maintain, monitor, and continually improve a BCMS that protects against any sort of disruptive event. A provider can certify to this standard through a third-party auditor; the certification is valid for three years with yearly surveillance audits. Some providers will not certify to this standard. Instead, they will contractually transfer this risk to you, the customer, so ask about this standard when shopping for providers.

Neither the AICPA nor the ISO standards are freely available; the standards documents must be purchased from these organizations. But because of the reputation and reach of these organizations, the standards are built with input from an international community of experts. By following them, you'll be following the best advice from the world's security experts.
Regulatory Frameworks

Depending on your industry and country, you may also need to meet certain other regulatory requirements for data storage and protection. These are legal requirements, not standards that make your provider look competent. You can ask whether a provider will help you meet these regulations, but for most of them, the burden of compliance falls squarely on your shoulders.
Below are a few of the more common ones, but there may be others that your organization needs to comply with. Talk to your legal team to make sure you know which laws apply to you.

PCI DSS

If your organization processes, stores, or transmits credit card data, you must comply with the Payment Card Industry Data Security Standard (PCI DSS). The founding members of the PCI Security Standards Council (American Express, Discover Financial Services, JCB International, MasterCard, and Visa Inc.) created this standard to protect cardholders from exposure to fraud and theft. Unlike the other regulatory requirements in this section, the PCI DSS is maintained and enforced by an industry group.

There are 12 requirements that must be met. Depending on how much control you have in the cloud environment, these may be your responsibility or the provider's. If you process credit card data, make sure you find out which of these 12 requirements fall on you and which fall on your provider. Ultimately, the responsibility is on you, and any punishment handed down falls on the party with a relationship to the credit card company.

A Qualified Security Assessor (QSA), as designated by the council, can assess a data environment for compliance annually. If the assessor validates compliance with the standard, then the organization will receive an Attestation of Compliance. If you need a compliant provider, this is what you need to ask about. Although the industry requires that card data processors meet this standard, it does not require validation. However, if there is a breach and the breached organization was not compliant, it may be subject to additional penalties from the card issuer, including fines.

HIPAA and HITECH

If you process or store medical information, you'll be subject to additional data privacy regulations.
In the US, this falls under the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health Act (HITECH). HIPAA mandates security and privacy requirements for doctors, health plans, hospitals, and other healthcare providers. HITECH extends those requirements to any business associates of the healthcare provider that handle or store protected information. That includes cloud providers.
As with the PCI DSS, you and your cloud provider may be responsible for different portions of the HIPAA standard. You should determine in your contract who is responsible for which aspects. Any healthcare provider or its business associates could be audited by the Department of Health and Human Services (HHS) Office for Civil Rights. You and your provider cannot certify to HIPAA standards; the only "certification" available involves successfully completing a random audit. HIPAA only applies to organizations that operate in the US. Other countries have different privacy standards and have enacted different laws to protect medical information.

European Union (EU) Data Protection Directives

Any organization that uses equipment in the European Union (EU) must comply with its strict privacy rules. The European Commission Data Protection Directive 95/46/EC regulates the processing and transfer of personal information on individuals residing in the EU. Cloud service providers are considered "data processors" and must follow these strict laws when they process data with equipment located in the EU. Your organization, however, may be subject to these laws if you gather information from customers using a web interface, as the customer's computer is technically equipment located in the EU.

The directive requires that data be processed only if the subjects have been notified that their data is being collected and is being used for a specific, legitimate purpose, without being excessive in regard to that purpose. Data, in this case, means any information that can be linked to a person, even if the holder of that data cannot use it to identify someone. If the data is sensitive—health, sexual orientation, or religion—then extra protections apply. Your provider can't certify to the directive, but it may be required to include protections that comply with the law in its contracts with you. Individual EU member states enforce the directive and may have added additional standards.
However, in 2018 the General Data Protection Regulation (GDPR) will replace the directive, establishing uniform standards across all member states and more coordinated enforcement. Regulators will also test the privacy controls in place to ensure their effectiveness. The bottom line is that if you operate in the EU and gather any personal information, including addresses, you will need to comply.
Other Governmental Regulations and Standards

Depending on the nature of your organization, its location, the locations in which it does business, or where its employees and contractors reside, you may be subject to one or more of the following standards:
• Argentina Personal Data Protection Act 25,326
• Canadian Privacy Laws
• Australian InfoSec Registered Assessors Program (IRAP)
• Chinese Information System Classified Security Protection
• Japanese Cloud Security Mark
• UK Government G-Cloud
• US Federal Risk and Authorization Management Program (FedRAMP)
• New Zealand Cloud Compliance Framework
• Spain National Security Framework

This list only includes the most common standards; others may apply.
Conclusion

Now that you know how to understand the risks that you face and how to manage them, you can start determining what risks the assets exposed in your SAP system face. In the remaining chapters, we'll build on this discussion of risk and standards to cover the specific threats and vulnerabilities you'll face with a cloud-based SAP implementation, as well as concrete steps you can take to minimize the risks. Much of the responsibility for what we discuss will fall on your provider, but you can shop for cloud solutions armed with the right questions. You should also understand which standards and regulations to ask about. Depending on the nature of your organization and data, other standards and regulations may apply, but you should now have a sense of how to approach a provider.

In the next chapter, we'll get into the nitty-gritty of the threats and countermeasures around the physical infrastructure of a cloud environment. That's the building, the machines, the people who maintain it, and more. If you can hit it with a rock, we'll be discussing it in the next chapter.
CHAPTER 3
Physical Security
The OSI Network Model

Before I start talking about the individual layers of security your cloud provider should have in place to protect your SAP systems, I'd like to introduce you to the Open Systems Interconnection (OSI) model, which abstracts a network into various conceptual layers. This model has been used to create the protocols and processes that make telecommunication networks function. It will be useful for understanding how data moves around a cloud network and where security protections operate. The model describes seven layers that compose a telecommunications network, each of which serves the layer above it. The layers are listed in descending order because the layer on top is served by all the layers below it.

7. Application layer — The data that the application itself uses to present information to the user. Includes protocols such as HTTP for web browsers and SMTP for mail programs.

6. Presentation layer — Data that can be translated into application data. This includes XML and decryption processes, and although encryption often happens at this layer, it can also happen at the session, transport, or network layers.

5. Session layer — As the name implies, session protocols negotiate opening data stream sessions between two application processes in this layer. This stream can occur across a network or within a single application. This is a pretty specific layer that includes SOCKS, the Socket Secure proxy protocol, as well as applications that create smooth transitions between live TV programs and manage interruptions in web conferencing connections.
4. Transport layer — This layer uses protocols that actually move the data from one endpoint to another. TCP and UDP are both layer 4 protocols.

3. Network layer — This is where traffic routing and packet forwarding occur. It determines where to send the data. Think IP addresses: this addressing information operates on layer 3. (Port numbers, by contrast, belong to the transport layer.)

2. Data link layer — This layer contains information about neighboring nodes on the same LAN or WAN. This information travels in frames, which are transmitted to other nodes on that network. Ethernet is the most common protocol for this layer.

1. Physical layer — The bottom layer of the OSI model is the physical hardware on which computing happens. It's the wires, server boxes, and hard drives over which data travels as electrical impulses. If you want to have a protocol here for completeness' sake, it's electricity, electromagnetic signals in wireless, and any mechanical processes that manage data transmission.

You may have also seen the TCP/IP networking model (which is different from the TCP/IP protocol suite, though they are related). It uses almost the same layers, except that it absorbs the presentation layer into the application layer and the session layer into the transport layer. This is a simpler model, but many discussions of networked environments still refer to the seven-layer OSI model, which is what we'll rely on. The rest of this book roughly goes through security countermeasures and controls that your cloud provider can take, starting with the bottom layer, the physical layer, and moving up from there. In that way, we'll follow the purpose of the model: each chapter will serve the chapter that follows it. Again, this model won't fit our discussion perfectly, but it provides us a pathway to understand the security challenges involved in a complex networked environment such as the cloud. We'll refer back to this model when we talk about the other layers in which you and your cloud provider may implement countermeasures.
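To make the upper layers a bit more concrete, here is a minimal Python sketch (purely illustrative, and not drawn from any SAP environment) that exercises three of these layers on a single machine: a layer 3 IP address identifies the endpoint, a layer 4 TCP stream carries the bytes, and layer 7 HTTP is just structured text written over that stream.

```python
import socket
import threading

# Layer 3: an IP address identifies the endpoint; here, the loopback address.
HOST, PORT = "127.0.0.1", 0  # port 0 lets the OS pick any free port

# A tiny layer 7 "application": answers one request with a fixed HTTP response.
def serve_once(server: socket.socket) -> None:
    conn, _addr = server.accept()          # layer 4: accept the TCP stream
    with conn:
        conn.recv(4096)                    # read the client's request bytes
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi")

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, PORT))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_once, args=(server,), daemon=True).start()

# Layer 4: the client opens a TCP connection to (IP address, port).
with socket.create_connection((HOST, port)) as client:
    # Layer 7: HTTP is just structured text written over that stream.
    client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
    reply = b""
    while chunk := client.recv(4096):
        reply += chunk

print(reply.split(b"\r\n", 1)[0].decode())  # the HTTP status line
```

Everything below the transport layer (data link frames, physical signaling) is handled by the operating system and hardware, which is exactly why the model is useful: each layer only has to deal with the one beneath it.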
Physical Threats

Your cloud provider runs your SAP system and stores the data on real, physical machines, so for the rest of this chapter, we'll talk about the physical layer of the OSI model, layer 1. Cloud systems have the same weaknesses as any other physical object: They can be damaged or destroyed by accidents, environmental factors, or malicious attacks. So can other physical support elements, such as the building, the power supply, and the network cables. Often, these effects are unpredictable. For example, in 2007 a truck crashed into a power generator that fed into Rackspace's data center in Dallas, knocking the cooling systems offline. That kind of random accident happens without warning, but other events, such as extreme weather and earthquakes, can be planned for and mitigated.

The other primary physical threat comes from people who have access to the data center, whether the danger is theft or incompetence. Not only is the equipment in a data center valuable, but so is the data it contains. An individual with physical access to a data center could take equipment, and individuals with malicious intent could install malicious code through insecure physical interfaces; physically accessing a computer is the easiest way to break into it.

The dangers that a data center faces from man and nature are legion, but good physical security can limit how much impact they have on your cloud provider. Most of the threats to physical infrastructure come from accidents more than malice, so a provider's defenses need to make its center as resilient as possible. For the remainder of this chapter, we'll dig into the specifics of the controls and countermeasures that create that resiliency.

Depending on the value and criticality of your data, you may need stronger controls. You may only worry about continuity and uptime, in which case your provider only needs to be prepared for disasters that would knock it offline. But if your data has value to people outside your company, then you may want to pay for tighter controls. As with all security controls, it depends on your business needs and appetite for risk; in this case, it also depends on your provider's overall risk profile. That risk profile is usually priced into its cloud offerings, so when shopping around, consider what your data requires to keep it safe and what the high-impact threats are for you.
Disaster Protection

Accidents happen. Whether it's a force of nature or man-made, you can't always foresee a disaster. But a cloud provider can take some steps to minimize the impact that damage and destruction have on your cloud environment and your access to it. We at IBM like to separate the effects of these external and environmental threats into three categories based on what they affect:

• Paint — The building and facilities, including the hardware inside.
• Pipe — The network connection and cables that link the center to the outside world.
• Power — The utility systems, including electricity, that allow the data center systems to operate.

Your cloud provider should have plans to protect each of these, as catastrophes great and small can affect any part of a center and separate you from your cloud data. Let's dive in and explore how a cloud provider can reinforce each of these areas to prevent downtime.

Paint

A cloud data center exists in a physical location, and every physical location has its own environmental challenges. Your cloud provider should understand those challenges and how its physical structures can handle them. For example, IBM's data centers in earthquake zones, such as Tokyo and the West Coast of the U.S., are housed in quake-resistant buildings designed to minimize vibrations during seismic events. Our Raleigh, North Carolina, data center was recently hit by severe hurricane-level weather events, including Hurricane Matthew, one of the state's deadliest storms ever; the data center was able to continue operations without interruption.

Find out which of your provider's centers is closest to you; chances are that center will be the one that houses your data. What sort of environmental conditions occur there? Is it on a flood plain, and if so, how does the provider keep the servers dry? What else is near the building: highways, power stations, population centers? All of these contribute to your uptime in extenuating circumstances.
High-powered computers run hot, so a data center needs to have appropriate cooling measures in place. These aren't air conditioning units designed for the comfort of the people in the building; it's not uncommon to need a jacket to enter the server rooms. Many data centers previously used a raised floor system, where the cooling systems run under the floor, pushing cold air through floor tiles into the server intake, as referenced in Figure 3.1.
Figure 3.1 A typical raised floor server cooling system
This older method of cooling worked for less efficient server designs. Today, though, newer techniques have been developed to keep up with server technologies. Many server rooms now use hot aisle and cold aisle configurations, in which heat vents blow exhaust into a contained aisle; from there the air is captured, cooled, and fed back into the server intake. Other techniques use non-electrostatic liquids to move heat away, or large-scale heat sinks such as bodies of water, as referenced in Figure 3.2.
Figure 3.2 A typical hot-aisle server enclosure
Humidity and dust particles can cause problems as well. People used to think low humidity led to electrostatic discharge, but recent research has debunked that idea. Besides, a technician can prevent most shocks by wearing a wrist strap. The real problem comes from high humidity, which can cause dust to stick to server components and prevent heat from dissipating properly.

Like any business with important equipment and people, your provider's data center should have a fire suppression system in place. All that server heat and electrical current can spark a fire. But because the server room may be kept nearly freezing, your provider should be using dry pipe sprinkler systems to handle flames. Dry pipes are filled with pressurized nitrogen until the smoke detectors catch wind of fire. At that point, the nitrogen releases and allows water through. Shutoff valves should stop the water once the fire is out to prevent further equipment damage.

The ultimate protection for data at a data center is more data centers. If your provider has multiple locations, it may use co-location to provide you with redundant access to your SAP platforms. This depends on your setup; in some cases, you may not want co-location. But a provider with multiple data centers can assure you that if disaster strikes, your data will not fall with the building.
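The value of that redundancy is easy to quantify. Assuming (simplistically) that each site fails independently, the combined availability of redundant data centers follows from multiplying out the failure probabilities. The sketch below uses purely illustrative numbers, not any provider's actual figures:

```python
def combined_availability(per_site: float, sites: int) -> float:
    """Probability that at least one of `sites` independent sites is up,
    given each individual site is up with probability `per_site`."""
    return 1 - (1 - per_site) ** sites

HOURS_PER_YEAR = 8766  # average year length, including leap years

for n in (1, 2, 3):
    a = combined_availability(0.999, n)  # 99.9% per site, illustrative
    downtime = (1 - a) * HOURS_PER_YEAR
    print(f"{n} site(s): availability {a:.9f}, ~{downtime:.4f} hours down per year")
```

One site at 99.9% means roughly nine hours of downtime a year; a second independent site cuts that to about half a minute. The caveat is the independence assumption: two centers on the same power grid or flood plain fail together, which is why providers spread redundant sites geographically.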
Pipe

For you to access your cloud environment, there must be either physical cable runs between the client computers and the data center, wireless or satellite connectivity, or both. The primary medium for network connectivity is the cable, so we'll focus on that; most wireless networks connect into a wired network at some point anyway. The physical point of connection that matters most is between the building and the outside world.

That network cable can be cut or otherwise destroyed; I've read horror stories about server techs looking for cut cables outside their buildings and finding a poor fried squirrel with its teeth in a severed line. In 2011, Level 3 Communications reported that squirrels caused 17 percent of the total damage to its fiber optic cables. Those critters can hit cables at any point between the data center wall and your computers. The data center can minimize how much effect those tree-borne rodents have on system uptime by keeping the connection cable out of sight. Unfortunately, it can't do much about the cables outside its property lines.

In fact, the cable infrastructure can be more delicate than you'd expect. In 2008, ship anchors were blamed for a rash of undersea cable cuts, which left entire countries with slow or non-responsive service. The transcontinental cables are no safer; I know a sales engineer who has a transcontinental cable passing under her property. If she needs to do any work, she has to clear it with the local authorities.

To account for the fragility of cables, the major Internet backbones have multiple, redundant cables to handle traffic. Your cloud provider's data centers should, too. Ask if it has connections from a single provider, or if it contracts with multiple companies for connectivity. You don't want a careless worker with a backhoe a few miles away to be able to deny you access to your business-critical data.
Power

Without a continuous electrical feed, every other security countermeasure your provider has in place is useless. The servers won't run, the networks won't transmit data, and the cooling system will shut down. Depending on the access security measures, you might not even be able to open the doors to the server room.
As with network connectivity, you want a provider that doesn't rely on just one utility company to keep the lights on. Most of us have experienced blackouts, whether because of downed power lines in a storm or overloaded generators failing because everyone in the city turned their air conditioners on. You don't want a blackout at the data center, possibly hundreds of miles away, to knock you off your cloud-deployed SAP system.

In the event of catastrophic power grid failures, your provider should have an emergency backup plan. If every power company within 100 miles goes dark, your provider's data centers should be able to fire up a diesel generator or other local power source and keep data flowing, at least until your provider can route your connection to another data center.

Redundant power sources and backup generators are par for the course for data centers, though. What really makes a provider stand out is how well it tests its systems. How often does it check its generators? Monthly? Yearly? Does it have reserves of fuel? What does its maintenance schedule look like? This may seem excessive, but without a plan for power failure, a cloud provider is just asking for trouble.

Back in 2011, IBM got a chance to run a real-life test on our SoftLayer facility in Dallas. Super Bowl XLV brought an estimated 300,000 fans into the city to see the Green Bay Packers beat the Pittsburgh Steelers. The Dallas metro area power grid was sure to be stressed during that time, so SoftLayer removed all of its data centers from the grid and ran off diesel generators for three days. Not only did we help reduce the load on the local utilities, we made sure our diesel generators and fuel delivery process ran smoothly.
Business Continuity

As I've shown here, some of the biggest problems a data center can face come from unexpected physical threats. Natural disasters, utility failures, and political instability can all cause serious physical damage to a data center and deny customers access to their business-critical data and applications. While certain disasters (such as a zombie outbreak or alien attack) may have such a worldwide effect that data access becomes secondary to survival, those sorts of events only happen in the movies. For the smaller, more localized disasters, your provider should have a business continuity management system (BCMS).
In Chapter 2, we talked about ISO 22301:2012, which covers how to establish a BCMS for an organization. The basics require organizations to understand their business role, what their clients expect, and the risks that could frustrate their customers' expectations. They develop a plan, let everyone know what their roles are, and prep for the worst. Think of fire drills: you probably never had to escape a burning building, but those drills taught you what you needed to do in case it happened, and repeated it until it became muscle memory.

Your cloud provider needs to establish what the tolerable levels of loss of service or data are in the event of catastrophic failures. Let's say Godzilla hits its data center, destroying the whole building and all equipment. Is it acceptable that a customer loses all of its data in this case? Probably not. You as the customer would hope that the provider had some sort of offsite backup or server redundancy in another center.

An organization may need to create plans for each type of potential disaster that could affect its center. For example, when a large-scale infectious disease hits, an organization may want to split its workforce into separate teams and rotate them between multiple sites with a frequency matching the incubation period of the disease. This happened during the 2003-2004 SARS outbreak in China and the surrounding area. You don't want one infection to cause a quarantine for your whole company. With this rotation in place, only one team would be quarantined.

You should have continuity plans in place for personnel disasters as well. The U.S. government, as a contingency plan, requires one member of the president's cabinet or Congress to stay away from the State of the Union address in case of an attack. That way, there will be at least one person who can continue the existing government.
Your provider doesn’t need a plan as drastic as this, but you don’t want a tragic car accident to remove critical knowledge from your provider’s organization. Through the use of scenarios, your provider can locate measures it will need in order to maintain continuity. These scenarios can also help train and test staff on what they need to do in the event of specific disasters. You don’t want your provider’s staff to spend time re-reading their employee handbooks in critical situations. The better prepared your provider is, the more secure your data is.
Building Access
Not all data center failures are due to environmental effects; some happen because the wrong person got into somewhere they weren't supposed to be. That wrong person might even be an employee or a vendor: someone who was trusted in some way, but had too much access. In this section, we're going to discuss the controls a data center can have in place to manage who gets into the building, and how it can control what those people do once they are inside.

Perimeter Defenses

Good access control starts before a person gets to the front door. Think of the data center like a bank vault; instead of gold, this vault holds business-critical data. Depending on the size of the vault, you need to get past different levels of security to reach the building. Your neighborhood bank might have a camera outside and keep its primary defenses inside. To get to the Fort Knox gold depository, though, you need to pass minefields, razor wire, and any of the U.S. Army units stationed at the base.

A data center's security might start with where the provider chooses to locate it. Cloud providers rarely build a data center in a dense urban area; instead, they choose locations outside cities. Besides offering less expensive real estate and less burdened utility grids, these locations may allow providers to control the roads in and out.

Wherever the building is, a good data center is anonymous. The building blends in as best it can, looking like a warehouse or office complex. No logos or windows into the server rooms give it away. If the center is beset by thieves, it won't be because they found an opportunity driving by; they'll need a greater level of knowledge just to know what the building is. Figure 3.3 shows the exterior of an IBM data center as an example.
Figure 3.3 The exterior of an IBM data center
With fewer land constraints, a data center can put up fences and access controls around a single entry point. From that entry point, the design of the landscape can guide people through a single camera-monitored path. The more a building's outside layout can funnel all approaching traffic, foot or vehicle, into a single approach, the more likely those cameras will catch suspicious activity. In fact, cameras will be the building's best friend. They'll give your security personnel eyes on the outside of the building to spot unusual behavior, or footage to review after something goes wrong. To support them, the grounds of the building should be well maintained, keeping undergrowth to a minimum. A thief loves shadows, so your provider should keep all possible paths well lit.

Getting Inside

A data center should have one primary entrance. Employees, vendors, and visitors should all enter through the same door, sign in with the same receptionist, and then go where they need to go. Like the outer premises layout, this funnels all traffic into a single point, leaving less space to monitor. A single camera can catch almost everybody coming to the data center. I say almost because there will usually be other possible entrances. The building may have a delivery dock for larger equipment. That needs
to be tightly controlled to make sure vendors, or people posing as vendors, can't slip past the gates. Emergency exits (often mandated by law) should not have outside handles or other visible markings.

Once an authorized employee or visitor gains access to the building, that does not mean they can go anywhere they like. Building access permissions should follow the principle of least privilege: every person is only allowed to enter the areas they need to be in to do their job. The server techs shouldn't have access to the diesel generator, and the maintenance crew shouldn't have access to the server room.

To control access, a data center might require ID badges at all times. Secure areas often require specific badge rights, plus two-factor authentication. Two-factor authentication requires two different kinds of proof of identity: something the user knows, such as a passcode; something they have, such as a badge or token; or something they are, such as a biometric identifier. Biometric authentication uses qualities unique to an individual human; most commonly this is a signature or fingerprint, but it could also be retina and iris patterns, voice waves, earlobe geometry, or even DNA.

Every data center may have visitors, such as third-party vendors and high-value clients. Your provider needs to consider how to allow them access, if at all. Are they badged for access to a limited area? Do they get a minder assigned to accompany them throughout the building? Are they given a firm handshake and shown the door? How tight a data center keeps its security will depend on the data it stores there.

Internal security can vary, too. Most data centers will have security cameras and guards, but how many and where can vary. Guards should at least be posted around the main entrance and at entrances to sensitive areas, such as the server room. Cameras are easier to post and won't get cold in a server room, so they can monitor equipment more easily.
But how long the provider saves those recordings and whether someone monitors its live feeds at all times will depend on individual policy.
Human Resources

The most important resource in a data center might be the people who maintain it. Your provider should have security controls built around ensuring those people are as good as the equipment in the server room. Everyone, including temporary or contract staff, needs to go through prehire screening, training, and exit procedures.
Hiring and Onboarding

For any open position in a data center—or, for that matter, any organization—applicants need to fill out forms that provide their personal details, work history, and background. Depending on the skill level of the position, they may need to pass some sort of pre-employment testing. Data center employees can bolster their employment records with professional certifications, some of which require regular recertification to ensure their skills stay current with rapid technology changes. All these details need to be verified, possibly by a third party, possibly by someone working for the provider.

On top of that, the provider should run a background check. Because a data center contains expensive electronic equipment and sensitive data, it needs to make sure anybody it lets near them has a clean record. These checks can include criminal records, identity verification, and credit reports.

Once hired, all employees need to sign non-disclosure agreements (NDAs) or other confidentiality agreements. It's important that they aren't disclosing breaches and security weaknesses to competitors or curious strangers at the coffee shop. Details about the center, especially about its security measures, cannot be freely shared. If potential bad actors know what's in place to stop them, they can figure out how to work around it.

To get new hires started on keeping the building secure, they should be given materials that detail the various information security policies in place—physical, environmental, and data-related. Ask if your provider has a validation process to confirm that an employee has read the introductory materials.

Training

Educating employees about security procedures and policy shouldn't stop after their first paycheck. To keep up with changing technologies and security practices, cloud providers should maintain a training program at their data centers.
Technology changes fast, and standards and regulations are always trying to catch up with what's possible. Your provider should make sure that its people have the best information possible. Here at IBM, we have pretty extensive training programs. Everyone, no matter what they do for the company, has to take yearly information security certification training. We feel that information security is too important to expect anyone to figure out on their own. A coordinated
defense is a stronger defense. Depending on our job role, we may have to take additional courses.

Besides internal training, many organizations offer certifications and other training services. Your cloud provider should have some sort of continuing education program to make sure its security practices stay current with the rest of the industry. An employee who goes out in the world for training then becomes a resource for the company, able to disseminate their new knowledge to their coworkers.

Exit Procedures

Just as when a new employee joins a company, your provider should have procedures in place for when employees leave. Regardless of the reason for their exit, your provider should follow a consistent checklist that removes access and maintains a secure environment. While this individual may have been a valued member of the team, once they leave their position, they should be treated like any other civilian.

Think of it as granting access in reverse. The provider should reclaim their badges and remove their biometrics from the list of those that grant approved access. Depending on security policies, this may also mean that the center changes any access codes that they used. While not strictly necessary—access should additionally require either a badge or biometrics—new codes all around can prevent a disgruntled ex-employee from sharing one piece of the access puzzle with potential bad actors.

It's important that your provider be able to do this quickly and silently—quickly, because the longer a suddenly unauthorized person has access, the greater the risk of those credentials passing into the hands of others, and silently, because you don't want your provider sharing details of this process with others. You may even want to test them by asking what their exit procedures are; if they give you a generic answer or say they don't share details, you can be assured that they are on top of it.
Exit procedures like this seem like a no-brainer. But it's very easy to overlook a step or two until the IT department suddenly notices that the tech who quit months ago still has an authorized fingerprint in the system. The most important part of exit procedures is having a plan and policy that everyone follows every time.
Conclusion

The physical security of your cloud environment lies almost entirely in your provider's hands. In some ways, this is a positive. If a provider were to allow you access to its facilities to help with its security, that would point to a serious flaw in its controls. Your contact with its cloud facilities should be almost entirely virtual.

However, you can apply some of the physical security measures mentioned here to your own organization. In particular, you want to maintain good access controls in your offices and other buildings. The easiest way for a bad actor to access your data is to get access to a physical computer and log into the cloud environment legitimately. In short, your access controls should be resilient enough to withstand social engineering hacks.

Social engineering relies on social systems to be effective. It's nothing new; this is the realm of the con artist. The attack could be a person posing as a vendor, asking for credential information, or pretending to be an employee. Most people, when faced with a person in need or bearing gifts, will have a hard time denying them. Ask the ancient Trojans how that worked out. Proper badging, two-factor authentication measures, and cameras at entrance areas can help frustrate would-be con men. Clear policies on access and regular training make sure your co-workers know their roles when faced with a charm offensive.
CHAPTER 4

Network Security

Today, most computers are networked by default. You probably connect to the Internet automatically without even thinking much about it. Networking is so prevalent that electronic devices we wouldn't have imagined having networking capabilities 10 years ago now have Wi-Fi enabled from the start. Your car, your kitchen appliances, and even some LED light bulbs now have networking features built in.

Cloud computing wouldn't exist without this network saturation. The power of the cloud comes from remotely connecting to a virtual computing infrastructure without noticing the seams between your computer and the cloud. But that network opens up vulnerabilities, so we need security countermeasures based on our risk profile and known threats. Connections to larger networks, including the Internet, allow for the free flow of information, but that freedom can expose your data to bad actors.

When we talk about networks, we mean everything that links one computer to another, both physically and virtually. It's the fiber optic cables connecting a data center to the outside world, the routers that get traffic where it needs to go, and the security appliances attached to the network. It's the IP addresses, Domain Name System (DNS) servers, Virtual Local Area Networks (VLANs), and the host ports that listen for incoming traffic. These make up the networking infrastructure of your cloud environment, up to and including the network interface card that lets your organization's employees connect to that cloud.

Remember the Open Systems Interconnection (OSI) model we talked about in Chapter 3? For this chapter, we'll mostly be talking about layers 2 and 3, as these control the movement and identity of networked nodes. For reference, here's a review of those layers:

• Network layer — This is where traffic routing and packet forwarding occur. It determines where to send the data. Think IP addresses and port numbers; that information operates on layer 3.

• Data link layer — This layer contains information about neighboring nodes on the same LAN or WAN. This information travels in frames, which are broadcast to other nodes on that network. Ethernet is the most common protocol for this layer.
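For a concrete look at where layer-3 information lives, here is a short Python sketch that packs and then unpacks a raw 20-byte IPv4 header using only the standard library. The field values are examples, and the addresses come from the documentation-only TEST-NET ranges; this is exactly the information a layer-3 device such as a firewall or router reads.

```python
import socket
import struct

# Pack a 20-byte IPv4 header (no options). Values are illustrative.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45,                             # version 4, header length 5 words (20 bytes)
    0,                                # type of service
    40,                               # total packet length
    0x1C46,                           # identification
    0x4000,                           # flags: don't fragment
    64,                               # time to live
    6,                                # protocol number: 6 = TCP
    0,                                # checksum (left zero in this sketch)
    socket.inet_aton("192.0.2.1"),    # source address
    socket.inet_aton("198.51.100.7"), # destination address
)

# Unpack the same fields, as a router or firewall would.
fields = struct.unpack("!BBHHHBBH4s4s", header)
version = fields[0] >> 4
proto = fields[6]
src = socket.inet_ntoa(fields[8])
dst = socket.inet_ntoa(fields[9])
print(version, proto, src, dst)   # 4 6 192.0.2.1 198.51.100.7
```

The `!` in the format string means network (big-endian) byte order, which is how these headers travel on the wire.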
Threats and Vulnerabilities

Here's a small sample of the threats that your network may be exposed to:

• Hackers and other cyberattacks. The term "hackers" covers a huge range of people, from amateurs cracking servers for laughs to international criminal groups and state actors.

• Malware. Whether because a valid network user installed something they shouldn't have or a hacker slipped past your defenses and installed it themselves, malware gives bad actors access to your computer. They can take screenshots, log keystrokes, send and receive network traffic, or even encrypt your entire hard drive and demand a ransom to unlock it.

• DoS attacks. Denial of service (DoS) attacks attempt to disrupt a network-connected device by flooding it with traffic. Most commonly, a hacker will set up a botnet—hundreds or thousands of computers infected with malware—to coordinate mass, continual ping requests, referred to as a distributed DoS (DDoS) attack. Some DoS attacks take advantage of listening services that produce a larger-scale response. For example, hackers have in the past spoofed an IP address on network time protocol requests to take down popular gaming sites, such as League of Legends. (http://arstechnica.com/security/2014/01/dos-attacks-that-took-down-big-game-sites-abused-webs-time-synch-protocol/)

• Misconfigurations. The bane of every IT professional rears its ugly head once more. This can include keeping default settings, opening too many ports on your firewall, using dev accounts on your production servers, and plenty more. Proper configurations use unique settings and follow the principle of least privilege, where applications and devices have the minimum privileges needed to carry out their tasks.

• Injection. Sometimes, attackers can send confusing data to an application and get it to execute parts of that data as code, which allows them to perform any action they want on a machine. Structured Query Language (SQL) is the most common vector for this technique, but plenty of other applications that process complex data suffer from it too, including OS services and the lightweight directory access protocol (LDAP).

With your SAP cloud implementation, your biggest vulnerabilities come from the contact points between your enterprise environment and the cloud. This opening may link to the wider Internet on both sides, so it can be discovered based on IP addresses. This doesn't necessarily mean that your data stores have a direct connection to anyone who guesses the server IP address. Most cloud systems have layers of protection between the application layer, where you connect to your SAP interface, and the database layer, where the data lives. The routers, firewalls, and other intermediaries that lie in the middle offer protection, but they can also be compromised by more sophisticated hackers.

In today's connected society, everyone has a network-enabled device in their pocket: their smartphone. These phones may even connect to your organization's network under bring your own device (BYOD) policies. But these connections can be problematic. Those phones could be infected with malware that can capture and forge the valid credentials they receive. While not much of a threat to clouds themselves, this could affect the networks of computers that connect to your cloud environment.

Considering Network Risks

The risks to your network involve the threats mentioned above exploiting a vulnerability in your system to gain unauthorized access. Network access is the first step, but it can then allow attackers to gain control of SAP objects, manipulate or copy data, or prevent access to your SAP platform. An open network connection with the "wild" Internet poses a significant risk. Anyone can find hacking tutorials that will let them exploit that connection. It takes little skill and uses commonly available resources. Because the network is the first logical step in accessing your systems and data, you'll want to mitigate these risks.
How strongly you mitigate depends on the impact an attack would have on your organization. Consider the potential losses and associated costs, then look for network defenses that can protect your data at a price that won't break the bank. Cloud providers build the costs of their security countermeasures into their service price, so consider how much security you need and are willing to pay for in your cloud solution. With any cloud provider, some defensive countermeasures are what poker players call table stakes. Table stakes are the defenses and certifications that any provider needs just to play the game. As in poker, you can always buy in for more than just table stakes, but always consider the value of the cards—the data—that you're holding.
Going on Defense

In recent years, the security troubles of two major corporations might surprise you: a large retail chain and a consumer electronics giant, neither of which seems particularly high tech. But both of these companies have been the victims of high-profile security compromises that cost them dearly in terms of both money and reputation. Those attacks exposed how important creating and maintaining strong security controls can be. How can you know if your cloud environment is protected without being burned by an attack? Let's take a look at what IBM's cloud-managed services (CMS) offering does to set up its network, shown in Figure 4.1.
When a customer connects to our CMS servers, they do so over a VPN tunnel that uses a regular Internet connection. Some customers, if they need extra security, can get an entirely private connection between their site and ours, such as a multiprotocol label switching (MPLS) circuit. That VPN is a “Meet-Me-VPN,” where we configure our end of this Internet Protocol Security (IPSec) tunnel and the customers manage their end. On our side, every connection goes through the IBM Frontbone LAN (FBL). The FBL is a set of networking infrastructure that allows isolation and segregation of customer traffic through virtualization and routing protocols. The FBL creates the entire subset of IP addresses on the cloud environment, trunking traffic to the appropriate VLANs. All client traffic passes through the optional CMS Virtual Firewall (CMSvFW). The CMSvFW, by default, blocks traffic coming over every port, both in and out. Customers enable the ports that they want to use.
Figure 4.1 How clients connect to the CMS environment. (Source: IBM)
But unless a customer has multiple VLANs that need to communicate, this level of control may be overkill. Access to their cloud environment is already securely protected by the VPN and/or private MPLS circuit. Then customers get to their cloud implementation. Within that environment, they have free rein over the security controls in place. For example, while the CMS IP addresses that customers use to reach their cloud data are not announced to the Internet by default, clients can change that. CMS security handles all security controls up to and including configuring the VLAN that encloses all their servers.

The security that we provide on our networks can teach you what you need to know when shopping around for cloud providers. In the rest of this chapter, we'll cover the pieces touched on above that create a secure cloud, as well as actions you can take to secure your end of the connection.
Creating a Defensive Infrastructure

With any cybersecurity defense, you want both depth of defense and diversity of defense. When an attacker hits your cloud provider, you want to make that person's task as complicated and as difficult as possible.

By depth of defense, we mean that you want to have a lot of barriers in the way: multiple routers, firewalls, security appliances, network switches and bridges, and other servers. If attackers manage to break through one, they still have several more to go. While they navigate barrier after barrier, your detection routines will have more chances to spot them and react.

By diversity of defense, we mean that you want different types of roadblocks in place: not just multiple forms of defense, but different brands and operating systems. If you have several routers, make sure they come from different manufacturers. That way, a single flaw in one type of system, equipment, or software won't leave your entire network wide open.

Depth and diversity apply to software, too. We'll talk about specifics later, but you want multiple firewalls and detection/protection systems. Using virtual LANs and network segmentation, you can add more levels that a potential attacker has to navigate.
Firewalls
Firewalls are the traffic cops of the network. They operate on OSI layer 3 and let traffic through based on the IP addresses of the sender and receiver, the port number that the sender or receiver uses, and the transport layer protocol the traffic uses. Protocols include the transmission control protocol (TCP), over which much Internet traffic travels.

A port number is a communication endpoint on a computer. Applications that receive data from networked sources listen for that data to arrive with a specific port number. For example, web servers listen for hypertext transfer protocol (HTTP) traffic on port 80. When you want to allow certain network traffic, you open the port for that traffic on your firewall. Otherwise, that traffic is blocked, regardless of whether it's incoming or outgoing. You may have seen a warning on your computer when you first start up a new program: "Firewall has blocked some features of this program" or something like that. That's a firewall in action. Once you enable and configure the port for that application, the traffic may travel the network to its destination. So long as the destination port is open and configured on the other end, the data transaction can complete.

With your SAP application, it's the same story. You may need to have your cloud provider open those ports on all firewalls in the network for you. SAP provides a full list of the ports that any of its applications use, as well as those that they will never use. (http://www.sdn.sap.com/irj/scn/index?overridelayout=true&rid=/library/uuid/4e515a43-0e010010-2da1-9bcc452c280b) As you can see, depending on what your SAP implementation looks like, you could have many different ports open.

A firewall can be either network or host based. A network-based firewall can be either a hardware or software solution. It guards against traffic on an entire network.
Your routers will often have firewall and port-blocking features to allow fine-tuning of what traffic goes through which gates. Sometimes firewalls can do more than just open or close ports. Advanced firewalls can perform load balancing, proxy services, bandwidth control, content control, and error notification.

Host-based firewalls manage the traffic in and out of a single computer or virtual machine. They are usually software applications, and they protect only the individual system that hosts them. In a
cloud platform, each virtual machine could have its own virtual hostbased firewall. Ask any potential cloud provider about its firewalls and firewall policy. Does it have both network and host firewalls? Can you customize the ports or does one of its administrators have to do that? What protection does it have at the front of its network? Does the provider validate firewall settings regularly, and can you be included in that validation activity? On your end, you should have firewalls protecting networks, systems, and computers that connect to the cloud. Many operating systems come with a firewall by default, but you can purchase other firewall software or appliances to secure your networks. Your IT department can install stronger host-based firewalls or add network-based firewalls to their security portfolio if they haven’t already. Those placements and activities should be based on your own risk profile to protect appropriate systems and data from threats.
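As a small practical exercise on your own endpoints, you can check whether a TCP port accepts connections with a few lines of standard-library Python. The host and port numbers below are examples (port 3200 is a typical SAP dispatcher port for instance 00, but yours may differ), and you should only probe systems you administer.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0

# Example: check a web port, a TLS port, and an SAP dispatcher port locally.
for port in (80, 443, 3200):
    print(port, port_is_open("127.0.0.1", port))
```

A closed or firewalled port simply fails to connect, which is what a well-configured default-deny firewall should make an outside scanner see for every port you haven't deliberately opened.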
Network Segmentation

In order to provide each client with a secure environment isolated from other clients, cloud providers can segment their network into smaller subnets. This process enhances both security and performance, but we're going to focus on the security aspects. A subnet exists apart from all other servers and traffic on the cloud network, isolated because subnets do not broadcast to other subnets, nor can they view traffic destined for those subnets.

Providers can use segmentation to frustrate attackers. Potential attackers want the least amount of infrastructure between their source and destination—your data. By separating traffic by risk and function, you can implement the countermeasures that make sense for each stream of traffic.

So how does network segmentation actually work? Primarily, cloud providers segment their networks through network switches, which manage traffic on OSI layer 2, creating VLANs. These switches can route traffic based on MAC addresses, whether real or virtualized, to create networks that operate like a LAN. For virtual machines, the hypervisor comes with a virtual network switch, but that often manages traffic only on a single physical server.
In a VLAN, as in a LAN, all connected servers can broadcast their frames to each other; that is, they announce their presence on the network and are easily discoverable. If you've used shared folders on a work network, you know that you don't need to enter an IP address or search for other computers; you can just browse them. For cloud servers, this lets them know which other hardware devices are part of their resource supply. Let's take a look at how switches can segment a network, shown in Figure 4.2.

Figure 4.2 An example of switches segmenting a network: two switches carrying three VLANs (VLAN2, VLAN3, and VLAN4), each grouping three servers.

Each switch is configured to pass the traffic to and from servers with specified hardware addresses. Traffic in this case only applies to broadcast frames. You can still send normal network traffic, such as ping requests, to these servers if they have a reachable IP address.
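The subnetting arithmetic behind this isolation can be sketched with Python's standard `ipaddress` module. The address block and the /26 prefix below are illustrative, not any provider's actual allocation; the point is that hosts in different subnets do not share a broadcast domain.

```python
import ipaddress

# Carve a /24 block into isolated /26 subnets, one per customer VLAN.
block = ipaddress.ip_network("10.20.0.0/24")
vlans = list(block.subnets(new_prefix=26))   # four /26 networks, 64 addresses each

for i, net in enumerate(vlans, start=1):
    print(f"VLAN {i}: {net} ({net.num_addresses} addresses)")

# Hosts in different subnets are not in the same broadcast domain:
a = ipaddress.ip_address("10.20.0.10")
b = ipaddress.ip_address("10.20.0.70")
print(a in vlans[0], b in vlans[0])   # True False
```

Routing between two such subnets then requires a layer-3 device, which is exactly where a provider can insert a firewall and its access rules.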
At CMS, each virtual machine has three distinct network segments. One network hosts client-facing—but not necessarily Internet-exposed— traffic, and has all the risks associated with exposed traffic. Another connects to a dedicated internal network segment used only for backup and recovery services. The third is used for CMS administrative services, including service management and resource provisioning. But won’t connecting all three networks to the same server allow an attacker to jump from one to the other? It might, but this separation requires the attacker to jump through another hoop. Attackers would require a high degree of skill to bridge these connections. Additionally, the hypervisor and virtualization technology prevents any attacker from easily jumping servers. We’ll cover hypervisor security in the next chapter.
Security Zoning

Within your cloud environment, various SAP business objects and processes have different security needs. For example, your database has a higher risk exposure than your client server. Other connected objects may have different risk profiles. Or you may have different SAP systems that correspond to separate business segments. As such, you can set up different levels of security for each of them.

Your cloud provider will usually give you multiple IP addresses to use. These separate IP addresses can resolve to different parts of your SAP system. On an enterprise system, each of the moving parts in your SAP system will just send data to the server on which it resides. But you can change that.

Here's where those different security levels come in. Using firewalls, you can create different security zones. Each zone can differ based on what ports and IP addresses are allowed in. In this way, you could isolate your database or storage replication system if you use SAP HANA, so attacks only affect the systems and objects with the lowest impact on your business.

When we set up SAP systems in the cloud, we always segregate them into three zones (see Figure 4.3). The client software—what you and your employees use to perform your work, such as SAP Fiori—connects to a client server. This server is located in what's called a demilitarized zone (DMZ), a zone that opens to your organization, whether through a VPN or over a private MPLS line. That server connects to an application server
through a firewall. That's zone two, and the software client never touches it directly. Zone 3 is the data storage area, which only allows connections from the application server and our backup software. Because of this zoning, our backup software doesn't have access to the application or client servers, and vice versa.

Figure 4.3 CMS security zoning example: customer traffic enters through edge connections and the CMS public firewall, then passes over the Frontbone LAN and internal virtual firewalls into up to three customer security zones. (Source: IBM)
Our setup follows the principle of least access. Any server only has access to the pieces that it absolutely needs to contact in order to function. Everything else is denied. If you have multiple other SAP pieces, a CRM or financial add-on, say, you can separate them into zones as well. This segmented setup also protects against malicious insiders, disgruntled employees, or people with false credentials who want to do your system harm. By placing your valuable data where no user can directly affect it, you can rely on SAP’s logging and auditing features to track who performs any action in the system. When you talk with any potential cloud providers, find out what kind of zoning they allow. Do you have to bring your own firewalls or will they set up firewall zones for you? How many internal zones and IP addresses do you get? They should be willing to work with you on any of these questions. But as with any security measures above basic, you may have to pay extra for additional network infrastructure.
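A zone policy like this can be thought of as a default-deny reachability matrix. Here is a minimal Python sketch; the zone names echo the three-zone example above, but the rules themselves are illustrative rather than a copy of any real CMS configuration.

```python
# Sketch of least-access zoning: each zone may open connections only to the
# zones listed for it. Zone names and rules are illustrative.
ALLOWED = {
    "dmz":         {"application"},   # client-facing servers reach the app tier only
    "application": {"database"},      # the app tier reaches the data tier only
    "database":    set(),             # the data tier initiates nothing
    "backup":      {"database"},      # backup software reaches the data tier only
}

def connection_allowed(src_zone: str, dst_zone: str) -> bool:
    """Default deny: anything not explicitly allowed is blocked."""
    return dst_zone in ALLOWED.get(src_zone, set())

print(connection_allowed("dmz", "application"))  # True
print(connection_allowed("dmz", "database"))     # False
print(connection_allowed("backup", "dmz"))       # False
```

In a real deployment these rules live in firewall configurations between the zones, but writing them out this way makes gaps easy to review.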
Identifying Threats

We've covered the defenses that your cloud provider should have in place to protect your SAP system, but what happens when a threat gets in? Is there anything providers can do once an attacker finds a hole? This is where network intrusion detection systems (NIDS) and network intrusion prevention systems (NIPS) help. This is a class of software and hardware that monitors for and prevents suspicious activity in your cloud implementation, sometimes just alerting an administrator, other times taking an automatic countermeasure. A wide range of tools falls into these categories, from antivirus software to hardware appliances that monitor all network activity. The most advanced of these exist as standalone devices connected to the network edges or as a hardware dongle plugged into a router. This way, they have access to the entire local network, not just a single machine.
Many of these technologies work by monitoring for known attack signatures or deviations in normal traffic patterns: a piece of compiled code that looks like a virus, a swarm of ping requests that looks like a DoS, or a connected device that doesn’t belong. Any traffic that looks suspicious is flagged so an administrator can deal with it. If it’s a NIPS (or a NIDPS, as it’s sometimes referred to), it can automatically deal with the problem. Again, depending on the device, they can perform other tasks. Some can scan files and server logs to check for file integrity and suspicious behavior. Some take proactive steps to probe firewalls and other network devices for security holes and known exploits. Others scan all active traffic for bad behaviors. While you won’t need to worry about setting this up on your cloud, you should ask what your potential provider does to spot intrusions. Ask about incidents and, if it uses redundant architecture, whether it uses redundant detection and prevention systems as well.
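In miniature, signature matching and anomaly thresholds look something like the following Python sketch. The byte patterns and the threshold are invented for illustration and bear no resemblance to production NIDS rule sets, which run to thousands of rules.

```python
# Toy intrusion-detection sketch: flag traffic that matches a known bad
# signature, or a source that exceeds a ping-flood threshold.
# Signatures and the threshold are invented for illustration.
from collections import Counter

BAD_SIGNATURES = [b"\x4d\x5a\x90\x00", b"DROP TABLE"]  # example byte patterns
PING_FLOOD_THRESHOLD = 100   # echo requests per source per time window

def match_signature(payload: bytes) -> bool:
    """Signature detection: does the payload contain any known bad pattern?"""
    return any(sig in payload for sig in BAD_SIGNATURES)

def flood_sources(ping_log: list) -> list:
    """Anomaly detection: which sources exceeded the ping threshold?"""
    counts = Counter(ping_log)
    return [src for src, n in counts.items() if n > PING_FLOOD_THRESHOLD]

print(match_signature(b"...DROP TABLE users;..."))             # True
print(flood_sources(["203.0.113.9"] * 150 + ["198.51.100.2"])) # ['203.0.113.9']
```

A NIDS stops at reporting matches like these; a NIPS would go on to drop the traffic or block the source automatically.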
Cloud security isn’t just your provider’s job; some of that responsibility falls on you. Exactly how much depends on the type of cloud, the cloud model you use, and your own business and risk needs. But everyone connecting to a cloud environment that contains sensitive, business-critical data needs to implement a few countermeasures on their client systems. It’s not just the provider who has to put up table stakes to play the cloud game; you do, too. Before thinking about technological countermeasures, your organization should think about and implement a security policy that everyone follows. A good policy builds on the recommendations of international standards (discussed in Chapter 2) and covers both the tech solutions mentioned below and human behavior policy. The best security software won’t save you from Steve in accounting who opens every attachment on every email. Computers in your organization that connect to the SAP system in the cloud will probably also have a connection to the Wild West of the Internet. You’ll need good security controls on all your networked endpoints to prevent access to your sensitive company data. These controls
can include local firewalls, antivirus, and even host-based intrusion prevention technologies. Your organization should have security capabilities to identify, prevent, and respond to threats and attacks on its own turf.

Finally, you need to protect your connection with the cloud environment. With the help of your cloud provider, you should set up a secure connection appropriate for the type of data, users, and interactions you plan, aligned with your corporate policies. This may include VPNs, dedicated circuits, or other secure technologies. VPNs use software to create an encrypted, tamper-resistant link between two or more computers that stays in place once enabled. Dedicated lines, such as MPLS circuits, go one step further; they are network paths that your Internet provider reserves exclusively for your traffic.
How Defenses Prevent Threats

Let's go back to the top, where we talked about threats. As a reminder, here are the five we listed:

• Hackers and other cyberattacks
• Malware
• DoS attacks
• Misconfigurations
• Injection

How do the network defenses we've discussed actually stop these threats from stealing data or harming your SAP cloud environment?

Hackers and Cyberattacks

Hackers and other cyberattackers initially try to infiltrate your networks from publicly revealed IP addresses. Most of the time, that IP address will belong to individual computers on your site running SAP clients. Because your computers are the first and easiest way in, you need to make sure your security stands up to scrutiny as well. When attackers locate your computer, they'll check for open ports. If you have a good firewall configuration, they shouldn't find many. The ports that are open should only allow traffic bound for individual applications. But let's assume they scan all your ports and find a way in. What's next?
If you have the VPN properly set up, they won’t be able to hop into the cloud network. The VPN controls what traffic gets in, as do the firewalls and VLANs within the cloud itself. An attacker has to be pretty slick to slip past the VPN and cloud firewalls at this point, so they will have to find a way to use the compromised computer to further the attack. They could directly plant a malicious piece of software to record keystrokes or give them remote control of the machine. Beyond malware (discussed next), they could find an operating system exploit that gives them full control of the computer. At this point, they have control of your computer, which is pretty bad. However, the worst they can do is log onto the SAP system and use the application. If your cloud provider followed SAP’s zoning recommendation, your database server should not be directly accessible to anyone running the client application. To get this far, your attacker needs to break through a firewall, take control of a computer via an OS flaw, and use the SAP client to search for information, usually one record at a time.

Malware

We touched on malware earlier. Hacker intrusion isn’t the most common way a computer becomes infected by malware or viruses. More often than not, a user (not again, Steve!) opens an unsafe executable, which then installs the malware on their computer. Your first line of defense comes from your good practices and internal security policy. Don’t open strange attachments. Ever. Even if they seem to come from your grandmother or a trusted company. Bad actors can forge emails to look like they come from trustworthy sources, a technique called phishing. Some hackers specifically target individuals or companies with spear phishing attacks. These attacks can be tricky, so you may find yourself opening a suspect file when it seems to come from a legitimate source. Malware today is network aware. It can propagate itself to other computers in your enterprise, either on its own or via your activities in your network. So protecting your computer isn’t just protecting yourself; it protects your whole organization. Good antivirus/anti-malware software should catch the attachment before it opens. Because these programs detect malware by signatures, you need to continually update your software to install the latest known signatures. Sometimes, malware will be too new to detect and will get through. In this case, see the worst-case scenario in the hacker section above.

Denial of Service Attacks

DoS attacks fill your network pipes with garbage traffic, preventing legitimate traffic from getting through or crashing the target server. While they may not directly steal data, a service outage can cost you valuable time and money, either by denying you access or running up your bill. The attacker may even threaten to continue unless paid a ransom. For these attacks to be effective, they would target your cloud infrastructure directly. While attackers could theoretically hit client computers on your network, that has a pretty limited effect—it knocks one person off the system. Attacking a piece of key cloud infrastructure could knock everybody off. As with other cyber attacks, the first line of defense is the network infrastructure, including your firewalls. A firewall can block most traffic aimed at it. But even the best firewall has a port open, and a clever attacker could flood that one port to take down that server. This is where the NIDS and NIPS appliances come into play. These devices monitor the entire network and can spot floods of traffic and reroute or block them before they even hit your firewalls. They can even handle attacks that target the application layer, sending traffic floods that look like legitimate application traffic, such as SAP client requests. With cloud environments, there’s an additional risk from DoS attacks: When the infrastructure goes down, it could affect multiple cloud environments on that infrastructure. Some cloud providers handle this by setting up redundant physical infrastructures and segmenting them logically using VLANs and firewall zones. This way, if a DoS attack manages to take something offline, it doesn’t take everyone down, just a smaller fraction. If you can’t entirely prevent damage, you mitigate it as best you can.
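The core NIDS/NIPS idea described above, spotting a flood of traffic before it reaches your servers, can be sketched with a toy rate counter. This is only an illustration of the principle; the thresholds, addresses, and function name are invented, and real appliances work on live packet streams, not Python lists.

```python
# Toy illustration of flood-spotting: count requests per source over a
# sliding time window and flag sources that exceed a threshold.
from collections import defaultdict

def flag_flooders(events, window_seconds=10, max_requests=100):
    """events: list of (timestamp_seconds, source_ip). Returns flagged IPs."""
    by_source = defaultdict(list)
    flagged = set()
    for ts, src in sorted(events):
        times = by_source[src]
        times.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while times and times[0] <= ts - window_seconds:
            times.pop(0)
        if len(times) > max_requests:
            flagged.add(src)
    return flagged

# A source sending 500 requests in one second gets flagged;
# a normal source sending a few requests does not.
events = [(i / 500, "203.0.113.9") for i in range(500)]
events += [(float(i), "198.51.100.7") for i in range(5)]
print(flag_flooders(events))  # {'203.0.113.9'}
```

A real NIPS applies the same idea at line rate and can also match application-layer patterns, which is why it can catch floods that look like legitimate SAP client requests.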
Misconfigurations

Badly configured software can sometimes produce results that look like attacks. Imagine a custom piece of software that has usernames and passwords hard coded. If that password ever changes, the software could continually try and fail to log in, which would look like a brute force attack. The real danger from misconfigurations arises when they provide an opening for other threats. Default passwords, firewalls with allow-all settings, and poor password policy can make a hacker’s day.

Injection

In a properly configured cloud environment, the most common injection attack, SQL injection, becomes extraordinarily difficult. If you are using a traditional SAP system, the database resides on an IP address accessible only from the application server. If you use SAP HANA, then the database resides in memory, which frustrates any would-be injector. Even if attackers are able to gain access to these servers, inserting values into memory requires a great deal of skill. The other opening would be SAP’s HTML-based GUIs. If a user installs a malicious extension or leaves other browser flaws unpatched, an attacker could potentially use cross-site scripting to compromise a machine. But with regular software updates and active anti-malware programs, these attacks should be caught before they can affect anything.

It’s important to remember that some of the burden of network security will always be on you. Your provider should have an array of defenses in its data centers, but you will need to maintain security on your end. At this point, you should have a good understanding of what the threat landscape looks like for a network and how you can determine whether your cloud provider has a decent defense in place.
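To make the SQL injection discussion above concrete, here is a generic illustration of the attack and its standard application-layer countermeasure, parameterized queries. This sketch is not SAP-specific; the table, data, and input are invented, and Python’s built-in sqlite3 stands in for any SQL database.

```python
# SQL injection vs. a parameterized query, in miniature.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

malicious = "nobody' OR '1'='1"

# Vulnerable: user input concatenated straight into the SQL string.
# The injected OR clause matches every row in the table.
rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{malicious}'"
).fetchall()
print(len(rows))  # 1

# Safe: the driver treats the bound value strictly as data, not SQL.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)
).fetchall()
print(len(rows))  # 0
```

Network zoning keeps attackers away from the database in the first place; parameterization is the defense-in-depth layer for whatever input does reach your application code.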
Conclusion

Whereas the last chapter covered the physical access points to your cloud environment, this chapter covered the virtual ones. Proper network security controls are so important because they prevent attackers, located anywhere in the world, from accessing your sensitive or valuable data. There’s a wide variety of defenses that you and your provider can implement: firewalls, VPNs, intrusion detection systems, and more. In the next chapter, we’ll talk about the security controls related to the technologies that make distributed computing possible: virtualization and hypervisors. These technologies provide significant security by themselves by acting as a barrier between the virtual machines running your cloud environment and the underlying hardware. But no software is perfect, so understanding how flaws in that software can be exploited is key to maintaining a secure environment.
CHAPTER 5

Hypervisor Security

In this chapter, we’ll cover the security concerns and solutions related to virtualization technologies, primarily the hypervisor, which is sometimes called the virtual machine manager (VMM). I’ll use the terms fairly interchangeably. The hypervisor is the software, hardware, or firmware component that makes cloud computing work. It runs multiple virtual machines (VMs), each of which operates as a computer or other networked device. These VMs operate as the servers that you connect to with your SAP client, while the hypervisor provisions their computing resources as needed. If you look back at the Open Systems Interconnection (OSI) model in our earlier chapters, the hypervisor operates in a hybrid sort of space. It’s served by both the hardware and the network, and it can also create virtual network devices and allocate resources to them. It exists (most of the time) as software running as an operating system, so it has direct privileges to access the kernel, memory addresses, and hardware attachments. Some hypervisors run as hardware solutions, with all virtualization happening in firmware. The VMs that a hypervisor manages will not have this sort of hardware access directly. They’ll operate as if they were physical machines, complete with processors, memory, hard drives, and operating systems, but those will all be virtual, existing as services managed and mediated by the hypervisor. In a sense, a hypervisor is the operating system on which multiple operating systems can run. To understand hypervisor security, you need to understand how hypervisors and their VMs work. Virtualization itself bakes in a lot of the security measures that your cloud environment will need. There are some additional security countermeasures your provider could take, but we’ll get to them after discussing the mechanics of VMs.
Note: For most of this chapter, when I discuss hypervisor specifics, it will be based on our team’s experience with VMware. Not only is it one of the hypervisors that we use, it was the first hypervisor supported by SAP HANA. Much of the virtualization and security discussion below applies to other hypervisors as well. When I discuss the specifics of configuring a hypervisor, though, it will use VMware-specific functionality. You will probably never have to deal with or fully configure the hypervisor in your cloud environment, so this information is here to help you understand how it works.
Types of Virtualization

Virtualization doesn’t just come in one flavor. It’s been around long enough, and is complicated enough, for several approaches to have developed. Some of these simulate full hardware; others use application programming interfaces (APIs) or duplicated operating system (OS) components to isolate individual processes running on the machine. Here are the five types of virtualization and how they differ:
• Native. Also called full virtualization, this allows several guest operating systems to run as virtual machines on the same hardware without interacting with each other. The VMM simulates as much hardware as the VM needs, but rarely the full set on the physical machine. Your cloud environment will likely use this type of virtualization.
• Emulation. The VM simulates the complete hardware on the physical machine, with changes as necessary. This lets a user run a different operating system on top of the existing host OS. For high-performance environments, this method can be less than ideal, as it can degrade performance. Figure 5.1 shows how emulation-style virtualization acts as an intermediary between the hosted application and the hardware.
Figure 5.1 Normal application behavior versus application virtualization. In a native environment, a program talks directly to its operating system’s input/output system, graphics drivers, hardware interfaces, and linked libraries. In a non-native environment, an emulator or just-in-time recompiler sits between the program and the host OS, providing emulated versions of each of those layers.
• Paravirtualization. The guest OS runs directly on the hypervisor without simulating hardware. Instead, all OS-level functions use a special API and are trapped by the hypervisor. These hypercalls—as they are referred to—allow the hypervisor to control any hardware-level commands and ensure the guest OS doesn’t try anything that would interfere with the hardware.
• Application-level. Instead of virtualizing hardware, this method creates copies of components, such as registry files, and emulates all inputs and outputs. You won’t find this in cloud environments much; it’s primarily used for application isolation and interpreted languages such as Java.
• OS-level. This type of virtualization creates isolated versions of the host OS in secure environments. This can be used to virtually host multiple servers, to prevent individual applications from interacting with one another, or to live-migrate applications between servers.
In cloud environments, you will generally encounter the first three, though most data centers use full virtualization today. This method lets a cloud provider make the most of all its hardware, distributing resources as necessary to any number of clients. Other virtualization methods limit how flexible a provider can be. With most modern processors offering direct virtualization support in hardware, native virtualization offers fast and secure distribution of resources. There are two types of hypervisors that control virtualization. Type 1 hypervisors run on “bare metal,” directly on the hardware, without an operating system mediating their interactions. In essence, these hypervisors are the operating systems. Type 2 hypervisors run on top of a host OS. Both types can simulate real hardware for VMs.
How Hypervisors Work

A virtual machine accesses all hardware resources through emulated hardware. Everything the virtual machine interacts with looks like computer hardware to it, but only exists in the hypervisor. The hypervisor has access to all hardware primitives—the most basic processing instructions hardwired in the physical device. All VM processes pass through a virtualization layer, which directs data and computational instructions to the hardware. For an example of how this works, let’s look at memory. When a normal OS boots up, it builds page tables that map the memory addresses its processes use to the machine’s physical memory addresses. In a hypervisor, each VM receives a page table for a physical memory address space that appears exactly the same as it would on real hardware. However, any read/write commands targeting those physical memory addresses go through an additional translation to get to the machine memory. Take a look at Figure 5.2 to see a diagram of how memory virtualization works.
Figure 5.2 An example of memory virtualization. Each process’s virtual memory (VA) maps to its virtual machine’s physical memory (PA), which the hypervisor in turn maps to machine memory (MA).

This virtualized physical memory space prevents any VM from accessing memory addresses on the machine that may be used by other VMs. It doesn’t even know that those memory addresses exist. Only the hypervisor knows exactly how to map physical memory pages to the machine-level memory pages. A hacker would have to break through the operating system and the hypervisor to get that information. This virtual page table is where a lot of the security magic happens. A hypervisor can remap that table any time, adding or subtracting memory pages as needed. Before handing over a new memory page to a VM, the hypervisor zeroes it out, removing any residual information. If an application tries to remap the page table or access memory outside of the table—common hacker tricks—the hypervisor delivers a fault, which often crashes, freezes, or halts the application or OS. A crashed VM restarts without affecting its neighbors. To protect memory further, many hypervisors randomize how physical addresses map to machine addresses and prevent writable memory areas from running executable code. Together, these measures can stop the ever-popular buffer overflow attack in its tracks. Virtualization provides the same sort of intermediate layer for VM CPU access, I/O interrupt addresses, and hardware MAC addresses. Everything that a VM sees and accesses is software emulating hardware. It
looks real to the operating system on the VM, but the hypervisor can control it all; it can provision new resources on the fly and block instructions that attempt to access privileged resources.
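The two-level address translation described above can be sketched in miniature. This is a conceptual toy, not how any real hypervisor is implemented; the page counts, class, and method names are all invented for illustration.

```python
# Sketch of guest-physical -> machine memory translation: each VM's
# "physical" pages map to machine pages that only the hypervisor knows.
MACHINE_PAGES = 16

class Hypervisor:
    def __init__(self):
        self.free = list(range(MACHINE_PAGES))
        self.maps = {}  # vm -> {guest-physical page -> machine page}

    def create_vm(self, vm, num_pages):
        # Hand out machine pages; a real hypervisor would also zero
        # each page before the VM sees it.
        self.maps[vm] = {gpa: self.free.pop() for gpa in range(num_pages)}

    def translate(self, vm, gpa):
        mapping = self.maps[vm]
        if gpa not in mapping:
            # Out-of-range access: deliver a fault instead of ever
            # exposing another VM's memory.
            raise MemoryError(f"{vm}: fault on guest-physical page {gpa}")
        return mapping[gpa]

hv = Hypervisor()
hv.create_vm("vm1", 4)
hv.create_vm("vm2", 4)

# Both VMs use identical guest-physical page numbers, yet the
# hypervisor maps them to disjoint machine pages.
assert hv.translate("vm1", 0) != hv.translate("vm2", 0)
```

The key property is that the mapping lives only inside the hypervisor: neither VM can observe, guess, or reach the other’s machine pages through its own page table.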
Threats

Most hypervisors and VMMs offer pretty tight security overall. However, human-written code will always have flaws, and those flaws can be exploited by skilled hackers with the right equipment. Some of these threats have only been documented in theoretical papers or shown to be possible by security researchers. Others have caused havoc in the wild.
• VM escape. By exploiting flaws in the hypervisor, a VM user could break out of the virtual environment and access the physical host machine. Security company Immunity found the Cloudburst exploit, which used a malicious video file to give full access to the host environment. Other exploits use buffer overflows and page table faults to escape the VM environment.

Note: Not many people realize that the Xbox 360 runs on a hypervisor. By preventing direct access to the hardware, it can stop pirated software from running. In 2007, an anonymous hacker posted a buffer overflow technique that gave direct access to the hypervisor, which would allow the user to run anything. Microsoft quickly patched the hole, but it shows that vulnerabilities can exist.
• Breaking isolation. This is any attack that allows a bad actor to interact with other VMs in ways that it should not be able to. Attack vectors include IP or MAC spoofing, virtual LAN (VLAN) hopping, and traffic snooping. In 2013, computer science researchers from the University of Adelaide showed how VMs on the same hardware could exploit memory page sharing to recover other users’ encryption keys.
• Resource starvation. Either by misconfiguration or malice, a rogue VM running on the same hardware could hog all the resources on the server, denying them to everyone else. In a way, this is a denial of service attack, but instead of blocking network access, it blocks computing resource access.
• Accessing hypervisor interfaces. Hypervisor software can be configured, just like any other operating system, Basic Input/Output System (BIOS), or network appliance, through an interface, be it a GUI, command line, or application programming interface (API). Hypervisors use user privileges to control who gets to change settings, because anyone who gains access to the hypervisor, whether via a network or in person, can affect every VM under its command.
• Device emulation. Because a VM uses virtual computing resources, all of its I/O devices are virtualized as well. Malware could potentially simulate one of these devices, which would let it transfer data or interact with the host machine.

These threats have been proven feasible but are a pretty new line of attack against cloud environments. They require a great deal of skill to pull off, so your garden-variety amateur thief won’t be breaking in using brand-new hypervisor attacks. But if you have very valuable data, some bad actors may be motivated to try their hand at breaking a hypervisor.
Isolation

One of the main ways that hypervisors keep each VM environment secure is by isolating it from every other VM on the same hardware. I’ve already talked about how a hypervisor isolates and controls memory to prevent unwanted interactions, but there are other methods that isolate a VM and its processes from others on the same hardware. There are two types of isolation: temporal and device. Temporal isolation prevents simultaneously executing VMs from interacting, interfering, or operating in the same computing spaces at the same time. The memory example above is temporal isolation. No two VMs have direct write access to the same memory pages at the same time, which prevents data bleed. CPU access can be managed by scheduling CPU time as instructions pass through the virtualization layer. This can ensure that two VMs aren’t running instructions at the same time, but may degrade performance. Fortunately, with modern multi-core and multi-threaded processors, hypervisors can assign individual processing cores or threads to VMs as needed. No two VMs will occupy the same processing thread.
Sometimes a VM will need to run highly privileged instructions. Privileged instructions are those that attempt to run commands at the lowest level of execution, called Ring 0. Most applications operate in Ring 3, the least-privileged level, but operating systems typically need to access hardware commands directly. From a security standpoint, a virtual machine running highly privileged instructions could compromise another virtual machine or the hypervisor itself, as they all run on the same hardware. So hypervisors will typically use a virtualized Ring 0 that examines each highly privileged command to ensure safety before sending it to the CPU. The hypervisor controls all access to hardware, including disks and attached devices. A VM cannot access any hardware device not specifically assigned to it. Hardware in this case can be a real or virtual device. For example, a VM could connect to a Small Computer System Interface (SCSI) hard drive that exists as a real piece of hardware or as a file on an actual hard drive. The hypervisor maps hardware interrupts to virtual interrupts for each VM. A hypervisor administrator is the only person who can enable hardware for each VM. Virtual machines can’t even see hardware that is not specifically allowed. Hardware can be shared, but the hypervisor manages requests to each attached device. With redundant attached devices, a hypervisor can remap virtual hardware on the fly so that no two VMs access the same device at the same time. Virtual machines on the same server should not be able to detect each other. Because everything passes through a virtualization layer, the software running on a VM has no way of knowing what else is going on in the machine hardware. In fact, it has no idea whether it is running on physical or virtual hardware. Figure 5.3 shows how the virtual machine hardware maps to physical hardware through a hypervisor.
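The device-assignment rule just described, where a VM cannot touch hardware that was never assigned to it, can be sketched conceptually. Everything here is invented for illustration; real hypervisors enforce this at the instruction and interrupt level, not in application code.

```python
# Conceptual sketch of the "virtualized Ring 0" check: the hypervisor
# examines each privileged operation a guest attempts and only forwards
# safe, permitted ones to the hardware.
ALLOWED_DEVICES = {"vm1": {"scsi0"}, "vm2": {"scsi1", "usb0"}}

class TrapFault(Exception):
    """Raised when a guest attempts an unsafe privileged operation."""

def handle_privileged_op(vm, op, device):
    if device not in ALLOWED_DEVICES.get(vm, set()):
        # The device was never assigned to this VM; it shouldn't even
        # be visible, so the hypervisor delivers a fault.
        raise TrapFault(f"{vm} may not access {device}")
    return f"{op} forwarded to {device} on behalf of {vm}"

print(handle_privileged_op("vm1", "read", "scsi0"))  # allowed
try:
    handle_privileged_op("vm1", "read", "scsi1")     # not assigned
except TrapFault as e:
    print("fault:", e)
```

The point of the sketch is the default-deny posture: the hypervisor mediates every privileged request, and anything outside a VM’s explicit assignments produces a fault rather than a hardware access.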
Figure 5.3 Virtual hardware mapped to physical hardware through a hypervisor. Each virtual machine sees its own network card, hard drive, and USB devices, all of which the hypervisor maps onto the host’s physical hardware.
A VM can interact with another VM if they are on the same VLAN. But in this case, they interact as peer computers, same as desktop computers would on a company LAN. They would not be able to scan shared memory or CPU instructions without escaping the hypervisor. The virtualization layer acts as the primary security control. Rogue VMs will have difficulty bypassing it and getting at the hardware below. That hardware below, especially when used by other VMs, is the big prize. But even if that bad actor manages to crack through this layer and access a memory space, it could be reallocated at any time.
Introspection

While the hypervisor is designed to prevent malicious behavior from rogue VMs, we all know that no software is ever bug free; your operating system wouldn’t be nagging you to install patches every week or so if it were. Security flaws are discovered in hypervisors all the time. We hope it’s white hat security researchers who get there first, but sometimes it’s the bad guys who find the hole. Just having a virtualization layer may not be enough. Fortunately, many hypervisors implement VM introspection. This process usually sets up an additional VM to monitor the states of other VMs. The informant VM looks for suspicious activity, whether it’s in working memory, system events, or I/O device behaviors. Some introspection VMs will even watch live processes to catch anomalous flows.
Introspection can work as an intrusion detection system, which I talked about in Chapter 4. Instead of being host based, where it can be subject to attacks, or network based, where it has less visibility into individual machines, it lies somewhere in between. It has full visibility of local processes without being a part of any guest system. These tools have been able to spot rootkits, which operate at the highly privileged Ring 0 level of the CPU, that would otherwise go unnoticed. The downsides to introspection involve the data it inspects. Most of what it has access to is very low-level information, and introspection has to reconstruct high-level attack information across this gap (the semantic gap, as it’s called), which can be difficult. Hardware-enabled virtualization can help bridge this gap, as can micro-VMs for individual computing tasks, but making this process easier is an active problem in computer science research. As we have discussed throughout this book, if you have sensitive data, you may not want, or may not legally be allowed, to let another VM scan all your computing activity. Cloud providers often promise not to look inside your VM operation, which is exactly what introspection does. It’s a balancing act; whether your risks and business needs necessitate closer monitoring is something you’ll have to determine with your provider. When shopping for a provider, ask them how they would spot a rogue VM. That will help you find out what kind of introspection they use and whether it would help your operation or potentially expose your data.
Controlling Resource Usage

Resource usage issues come from two main problems. In the first case, your SAP system has been allocated the wrong amount of computing resources: too much drives up costs, and too little means your system cannot run effectively. The second threat to resources comes from a rogue VM on the same physical hardware that sucks up all the available infrastructure, leaving your VM without the processing and/or memory it needs to perform the transactions that you need to complete. This can look like performance problems within your applications. Determining exactly how much computing power your VMs need from the outset can be challenging, especially if you’re not already running a system with the exact configuration that you’ll move to a cloud environment. SAP offers a Quick Sizer tool online (http://service.sap.com/quicksizing) that can give a rough estimate of what sort of environment you’ll need. To measure size, it uses SAP standard application benchmarks (SAPs). 100 SAPs are equivalent to 2,000 fully processed order line items per hour. The exact resources that these processes require differ based on the hardware that your provider uses. SAP maintains regularly updated benchmarks that detail how many SAPs a specific hardware configuration can handle. Your provider should have a fair understanding of the SAPs capacity of its hardware, especially if it bills itself as an SAP specialist. To simplify things further, SAP recommends and enables what’s referred to as T-shirt sizing. That’s how we provision most systems at IBM. When you come to us with the specific amount of SAPs your system will need, we fit you for the T-shirt that contains it, plus a little more for growth.
• XXS for up to 1,100 SAPs
• XS for up to 2,200 SAPs
• S for up to 3,300 SAPs
• M for up to 6,000 SAPs
• L for up to 12,000 SAPs
• XL for up to 20,000 SAPs

We at IBM use T-shirt sizing for initial greenfield implementations, though ultimately, we size all customer VMs based on their needs. Unless you have a bare metal server and own the whole box, most cloud environments charge by how much computing resources you use. The VMs are sized by T-shirt sizes, but the processing required moment to moment can change. Some cloud providers may overcommit computing resources to several VMs as a whole instead of individually. This is a bet against all VMs using all available resources at once. It’s a risk many providers and customers take because it allows them to use all hardware resources efficiently and provides cost-effective use of services. A downside can occur in the unlikely event that all guest VMs max out their available computing resources at the same time. This would lead to a degradation of performance for all VMs on that hardware.
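The T-shirt sizing logic described above can be illustrated with a toy calculation. The size thresholds and the 100-SAPs-per-2,000-line-items rule come from the text; the function names and the 20% growth headroom are our own assumptions, not an official sizing formula.

```python
# Toy sketch of T-shirt sizing: estimate SAPs from throughput, add
# growth headroom, and pick the smallest size that covers it.
T_SHIRT_SIZES = [   # (size, max SAPs it covers), from the list above
    ("XXS", 1_100),
    ("XS", 2_200),
    ("S", 3_300),
    ("M", 6_000),
    ("L", 12_000),
    ("XL", 20_000),
]

def saps_needed(order_line_items_per_hour):
    """Rough estimate: 100 SAPs per 2,000 fully processed line items/hour."""
    return order_line_items_per_hour / 2_000 * 100

def t_shirt_size(saps, growth_factor=1.2):
    """Pick the smallest size covering the workload plus growth headroom."""
    target = saps * growth_factor
    for size, max_saps in T_SHIRT_SIZES:
        if target <= max_saps:
            return size
    raise ValueError("Workload exceeds the largest T-shirt size")

# 40,000 line items/hour -> 2,000 SAPs -> 2,400 with headroom -> "S"
print(t_shirt_size(saps_needed(40_000)))  # S
```

For a real implementation you would use SAP’s Quick Sizer output rather than a back-of-the-envelope throughput conversion, but the rounding-up-with-headroom logic is the same idea.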
SAP warns against overcommitting computing resources to your SAP VMs. Overcommitting allocates more memory to VMs than is physically available. SAP systems tend to use a lot of memory, so however much memory you or your provider allocate, the SAP system will swallow it up. You can avoid this issue by limiting allocated memory to the size of the VM. Once your T-shirt size is in place, be careful about changes. Even small ones can have large repercussions. We worked with a utility company that wanted to make a change to its G/L structure that added an extra layer of specifications to transactions. But this change tripled the size of its G/L, which tripled the resources its SAP system needed. Your VMs could be starved for resources by external bad actors as well. A rogue VM on the same hardware can exploit the shared resource pool by maxing out its computing usage. This could deny you access to these same resources if the provider uses any sort of resource overlap or overcommit policy. A rogue VM doesn’t necessarily need to be malicious; a misconfigured system can cause the same effects.

SAP Tools for Resource Management

You aren’t at the mercy of your hypervisor when it comes to managing computing resources; SAP includes a memory management system to improve and control usage. Every SAP work process has two memory spaces: a roll area and extended memory. The roll area is the initial amount of memory assigned to a work process. Extended memory is the amount of additional memory that the process can use when it needs it.
Note: A memory management system may in fact increase the total amount of memory your SAP system requires. That’s because it creates a swap file to move data between extended memory address spaces. This swap file becomes a temporary home for any overflow data.
This process is referred to as bursting. The roll area is the memory needed for regular operation. The extended memory should be 10 to 15 times the size of the roll area; depending on the process, it might even need to be 100 times the amount. Complex reporting processes use bursting often. The process lies idle until a user runs a reporting transaction that needs to pull data from several unrelated areas to create hundreds of reports. In that case, the report process bursts into the extended memory so the reports can be generated efficiently. Be careful about overcommitting to your burst memory. You may think you want your key processes to have all the memory that they need. However, extended memory multiples in the hundreds, even thousands, can monopolize memory resources or force a VM to access memory that does not exist, which would crash the system. If you find that your SAP system performance begins to degrade while also maxing out the available memory, you may have a memory leak. Even in the best software, memory leaks happen when data that is no longer used remains allocated but can no longer be accessed or freed. These resource leaks can both slow down software performance and ramp up resource usage, which in some environments can cost you money. To spot a leak like this, SAP provides the ABAP (Advanced Business Application Programming) Memory Inspector. This uses a standalone transaction (S_MEMORY_INSPECTOR) to take a snapshot of your memory usage at a point in time. By comparing multiple snapshots over a period of time, you can find the baseline memory usage and spot times when your memory usage deviates from it.
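The snapshot-comparison workflow is generic enough to sketch. This is not the Memory Inspector itself; the sample usage figures, threshold, and function names are invented for illustration.

```python
# Leak-spotting by comparing memory snapshots over time, in the spirit
# of the snapshot-comparison workflow described above.
def baseline(snapshots):
    """Treat the smallest observed usage as the baseline."""
    return min(snapshots.values())

def suspicious_snapshots(snapshots, threshold=1.5):
    """Flag snapshot times whose usage deviates far above the baseline."""
    base = baseline(snapshots)
    return [t for t, mb in sorted(snapshots.items()) if mb > base * threshold]

usage_mb = {  # snapshot time -> total memory usage in MB (invented data)
    "09:00": 512, "11:00": 530, "13:00": 800, "15:00": 1150,
}
print(suspicious_snapshots(usage_mb))  # ['13:00', '15:00']
```

Steadily climbing usage across snapshots, with no corresponding drop when workloads finish, is the classic leak signature; a one-off spike during a heavy report run usually is not.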
How Hypervisors Prevent Threats
Now that you've got a sense of how hypervisors and virtualization work to protect you, let's return to our list of threats from the beginning to see how these security measures prevent them from doing damage. For reference, here are those threats:
• VM escape
• Breaking isolation
• Resource starvation
• Accessing hypervisor interfaces
• Device emulation
5 | Hypervisor Security
VM Escape
Let's say that an attacker has access to a rogue VM on the same hardware as one of your server VMs, whether it's the client, application, or database server. To break out of their virtualized isolation, the attacker needs to find some critical flaw in the hypervisor operating system that lets them either access hardware addresses directly or run arbitrary code at the hypervisor level. Note: In 2015, security researchers at CrowdStrike disclosed a VM escape flaw in hypervisors based on QEMU, which includes Xen and VirtualBox. They called it VENOM, Virtualized Environment Neglected Operations Manipulation. The flaw allowed an attacker to cause a buffer overflow in the virtual floppy disk controller, which could cause the hypervisor to run the overflow as code. Even VMs that didn't have a virtual floppy drive configured still had the hooks for it. Fortunately, no in-the-wild exploits of this bug have been found, and the vulnerability has since been patched.
How difficult is this? Extremely. Hypervisor vendors patch known security holes as fast as they can; the protection provided by virtualization is their bread and butter, so any exploit threatens their livelihood. Finding a new exploit (a "zero-day" in security lingo, because the vendor has known about it for zero days) is extraordinarily difficult, requiring a high level of both skill and resources. In effect, virtualization is its own protection against VM escapes. You and your provider can take a few steps to minimize the avenues available to this attack:
• Keep your hypervisor patched. Vendors provide fixes for known exploits regularly. If you don't have control over patching, find out your provider's policy and time frames for the activity. We'll talk more about maintaining software in Chapter 8.
• Limit the resource-sharing features your VM and hypervisor have enabled to the minimum you need to operate. The fewer resources that are exposed, the smaller the attack surface is.
• Consider intrusion detection/prevention technologies. Whether these are host-based solutions that you install yourself, hypervisor-based software that your provider maintains, or network-based technologies, they can often spot patterns of malicious behavior before they can cause harm.
• Only install the software that you need on any given VM or host. The more software installed, the more potential security holes can be introduced.
Breaking Isolation
A threat similar to VM escape, breaking isolation lets a VM threaten other VMs in a more indirect manner. Where VM escape attacks give the VM access to the hypervisor, isolation breaks let it interact with another VM when it shouldn't be able to. These attacks use spoofed information (MAC addresses, IP addresses, and more) to access and control other VMs. In 2012, security researchers published an attack that faked VMware configuration files. If the files were referenced by a deployed VM, the attacker could access all hard drives associated with the hypervisor. This vulnerability has since been patched. But these flaws happen and can be found. We always hope it's the good guys who find these holes, but if it isn't, then cloud environments could be vulnerable. As we mentioned in the section above, finding new holes in hypervisors takes a huge level of skill. If you get hit by an attack that breaks isolation, you're dealing with serious hackers. The best defense against these kinds of attacks is to keep all software updated. Your provider should have a change management plan in place to ensure that high-priority patches are applied to the hypervisor as soon as possible. See Chapter 8 for more information on software updates and change management.
For protection against rogue VMs on other physical hardware, you have the network security to rely on, including the firewall infrastructure, VLANs, and other network appliances. See Chapter 4 for more information on network security.
Resource Starvation
As we've mentioned, one of the benefits of cloud computing is the ability to scale up resource usage quickly and efficiently. But the downsides of "quickly and efficiently" are resource sprawl and hardware overlap, both of which can lead to resource starvation. The good news is that you or your provider can head these off before they become a problem. To prevent resource sprawl, your IT department needs a policy on VM growth that it follows every time. Just as your organization has a policy around capital expenditures for physical resources, it needs one to handle virtual resources. Your SAP admins need to maintain reasonable extended memory limits. And your provider needs enough hardware to fill the needs of your systems when their server loads grow. To prevent hardware overlap from causing starvation, your provider could allocate no more computing resources across all VMs than it has available in hardware. But this scales poorly (it's a one-to-one ratio) and will cost more, because not all VMs will be using all available resources at all times. Instead, your provider should monitor VMs to see which systems run at maximum resource consumption for long periods; sustained spikes can point to misconfigured software or to intrusions by malicious software or users. Another way an SAP system could starve is if a user continually ran processing-heavy transactions, like complex reports. But this isn't something the hypervisor can stop; these attacks fall under the purview of user access controls, which we'll talk about in Chapter 7.

Accessing Hypervisor Interfaces
These interfaces could give an attacker the keys to the kingdom: an attacker could provision and deprovision whatever they want for any VMs managed by the hypervisor, including creating and destroying VMs. Cloud providers need to make these interfaces as difficult to access as possible.
What that means depends on whether your cloud environment is managed by the provider or by your IT team. If the provider manages it, it can place the interfaces on networks only accessible either in the data center building or by trusted computers under its control. It then becomes a problem for physical and network security, which we covered in Chapters 3 and 4. Beating either of these types of controls requires advanced skills, which reduces their likelihood: an attacker would either have to bridge networks or gain access to the physical data center, both highly difficult with proper security controls. If you manage the server yourself, accessing these interfaces should have the same sort of network controls as accessing your cloud environment. You can use VPN-protected connections, which would prevent most intrusions. Refer to Chapter 4 for more information on how network security will prevent intrusion into your cloud environment. Suppose that an attacker does manage to reach the hypervisor interfaces. The attacker would still need user permissions to log in and go to work. Any hypervisor that can be accessed by an intruder should require at least a username and password, so the attacker would need to steal or forge credentials or exploit a flaw in the interface, both of which should be very difficult.

Device Emulation
Because the devices on VMs are virtualized, if attackers can make a VM think a rootkit is a device driver, they can run their own code at a highly privileged level. Device driver code can often be the shoddiest in an operating system, so attackers can exploit those vulnerabilities, too. A hypervisor doesn't need to create virtual devices for an attacker to exploit it; all the attacker needs is the possibility that the device can be created, a virtual plug in the virtual motherboard. So the first thing to do is eliminate the possibility of devices that will never be used. You probably won't need a floppy drive, so why provide the option? Eliminating these openings hardens a VM and shrinks the attack surface. For necessary devices, a hypervisor can isolate them into separate domains. This is an extra step and not strictly necessary, but it can provide extra security. These protections may increase the latency on I/O traffic, but if you or your provider have concerns about device exploits, it may be worth it.
Finally, if a device is emulated or compromised, VM introspection and logging should catch it. A misbehaving device leaves a trail of transactions that cannot be traced back to any valid program.
Conclusion
Virtualization itself provides the bulk of the security for your virtual machines. But no software is perfect, and hypervisors are no different. Our dive into the mechanics and flaws of hypervisors should arm you with the right questions to ask when shopping for providers. Find out what software they use; at most IBM data centers, we use VMware, which is powerful and has been hardened over time by testing and plugging holes. Up next is encryption. Encryption technologies help protect data when it's at rest on a hard drive, in transit over a network, and in use by an application. You use encryption every day when you browse websites over HTTPS. We'll cover the various technologies that you can add to your SAP system, as well as the ways that attackers will attempt to break your encryption to gain access to your valuable data.
CHAPTER 6
Encryption
The previous chapters all discussed how cloud providers can prevent an attacker from accessing your data. But what happens if an attacker does gain access? Does this mean it's game over, data's stolen, start running damage control? Not necessarily. This is where encryption can help you. In this chapter, we'll cover what encryption is and how you can use it to protect your cloud-based SAP system. We'll discuss the major encryption technologies and proper key management, because without a key, you can't access or see your own encrypted data. Plus, we'll talk about the encryption options available in an SAP system and how you can add additional controls. Throughout this discussion, keep in mind that you may need to implement the encryption scheme that you decide on, and you may not want your cloud provider to have direct access to your data.
What Is Encryption?
Encryption technologies pass data through a math function that scrambles it so it cannot be read as clear text. The jumbled result is called ciphertext. To get the original data back, you decrypt the ciphertext using another math function. Both functions depend on a key: a piece of information that determines exactly how the data is encoded into ciphertext and how the ciphertext is turned back into the original. Within the Open Systems Interconnection (OSI) model, encryption can happen at any layer between 3 and 6: the network, transport, session, and presentation layers. It can happen at any of these layers because encryption is just a data manipulation. Decryption almost always happens at the presentation layer, layer 6. Encryption schemes can use symmetric or asymmetric keys. Symmetric key encryption uses the same key to encrypt and decrypt data; the functions that perform encoding and decoding use reverse processes. Asymmetric key schemes use one key to encrypt and another to decrypt, as shown in Figure 6.1. Having only one of the keys lets you transform the data in one direction only. Pretty Good Privacy (PGP) is an example of an open source privacy program that uses asymmetric keys: you provide a public key to anyone who wants to send you a message, but keep a private key for yourself to read those messages. Someone holding only the public key can encrypt messages to you, but cannot read messages that others have encrypted with it.
Figure 6.1 How asymmetric key encryption works: the sender encrypts plaintext with the recipient's public key, and the recipient decrypts the resulting ciphertext with the recipient's private key
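As a toy illustration of symmetric encryption (intuition only; XOR against a repeating key is not real security), note how the same key both encrypts and decrypts:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR each byte against the repeating key.
    Running it twice with the same key restores the original data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

shared_key = b"secret"
ciphertext = xor_cipher(b"payroll data", shared_key)  # encrypt
plaintext = xor_cipher(ciphertext, shared_key)        # decrypt with the SAME key
print(plaintext)  # b'payroll data'
```

In an asymmetric scheme, by contrast, the encrypting key cannot perform the decryption step.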
Even before computers, people were encrypting messages. In ancient Sparta, the military would use scytales to send important messages. Both parties had a rod of the same length and diameter—a symmetric key. The sender would wind a leather strap around a rod and write a message across the strap—the encryption function—so that it could only be read if wound in the right way. Anyone else seeing the leather would just see a jumble of letters. Computers made encryption more powerful, but they also made it easier to break weak encryption using brute force attacks. Brute force attacks try to guess a key by trying all possible versions. These attacks could theoretically break any encryption given enough time, but they can
be computationally expensive. A key size of n bits has 2^n possible keys, and an attacker brute forcing that encryption may have to try each one. Note: The Sunway TaihuLight, located at the National Supercomputing Center in Wuxi, China, was the most powerful non-distributed supercomputer as of June 2016. It has a theoretical peak performance of 125,436 teraflops, where one teraflop is a trillion (a million million) floating point operations per second. To crack a 128-bit key, it would take this computer over 86 trillion years to try every combination.
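Assuming, generously, one key tested per floating point operation, the note's arithmetic checks out:

```python
keys = 2 ** 128                      # possible 128-bit keys
ops_per_second = 125_436 * 10 ** 12  # TaihuLight peak: 125,436 teraflops
seconds_per_year = 60 * 60 * 24 * 365

years = keys / ops_per_second / seconds_per_year
print(f"{years:.1e} years")  # on the order of 8.6e+13, i.e. ~86 trillion years
```

Even halving this (an attacker expects to find the key after searching half the space on average) leaves a number vastly beyond any practical timescale.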
Of course, the actual security depends on the algorithms used to encode data. Symmetric key algorithms can generally maintain their theoretical strength, so cracking a 128-bit symmetric key requires the full effort of trying all 2^128 keys. Asymmetric keys, which are used in most data transmission technologies, generally need to be longer to provide the same protection. For example, the RSA algorithm, one of the earliest secure data transfer methods and still widely used today, requires a 3072-bit key to match the protection of a 128-bit symmetric key. Depending on the encryption function, you may see additional qualifiers applied to the final encrypted data:
• Salted. When ciphertext has been salted, it means the encryption algorithm used an arbitrary number as part of its key input. This prevents bad actors from using previous communications to fool an algorithm. Asymmetric key algorithms sometimes salt communications to frustrate dictionary and rainbow table attacks.
• Hashed. A hash function maps data of arbitrary size to a fixed-size output; that is, it scrambles any given data so that the result always has the same number of bits. Hashed data cannot be easily reverse engineered: many inputs map to each output, and you cannot even tell how large the original data was. Good hash algorithms always create the same hash value for a given input, create vastly different values for small changes in the input, and make it infeasible to find two different inputs that produce the same value.
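These properties are easy to see with Python's standard hashlib; the input strings are arbitrary:

```python
import hashlib

# Fixed-size output: SHA-256 always produces 256 bits (64 hex characters).
h1 = hashlib.sha256(b"SAP in the cloud").hexdigest()
h2 = hashlib.sha256(b"SAP in the Cloud").hexdigest()  # one letter changed

print(len(h1), len(h2))  # 64 64
print(h1 == h2)          # False: a tiny input change yields a very different hash
```

The same input always hashes to the same value, which is what makes hashes useful for integrity checks.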
Data has two primary states in which it can be encrypted, and each can use different technologies to encrypt data. Data at rest refers to any data stored somewhere while it’s not being used. Data in transit (also called data in motion) refers to data being transferred from one location to another, usually over a network. There is a third category, data in use, but it is largely similar to data at rest, except that it requires security within the application accessing it.
Encrypting Data at Rest
Data at rest lives on a storage system or disk somewhere. It's static until accessed or transferred to another location. Many of the other security measures we've discussed in previous chapters protect data at rest: the physical access controls, network security, and the hypervisor all prevent unauthorized access. But the data can still be at risk if a bad actor manages to bypass your provider's other security controls. To protect this data (in the case of an SAP system, the database or the persistent storage for SAP HANA), you should encrypt it at all times. When this data is encrypted, it doesn't change until some application tries to read or write it. When that happens, if the application doesn't have access to the key and encryption algorithm, it can't read the data; the data just looks like garbage. If it does have access, then it's business as usual. Depending on the computing resources available, you may experience a slight performance slowdown, but likely not much. We don't recommend that our customers encrypt their entire hard drives. One small corruption could cause the whole drive to become unreadable, and without a recent backup, several days of work, maybe everything, could be lost. Instead, we advise our clients to apply encryption on a per-file or per-directory basis. You safeguard the most important data while only risking smaller losses from corruption. Plus, it doesn't make sense to encrypt most application and operating system files; there's nothing sensitive about them. You can be even more specific about how you encrypt data; instead of encrypting entire files, you can encrypt just the sensitive fields, such as passwords and credit card numbers. This way, anyone accessing them needs an additional key to read them. This can help prevent accidental disclosures
from data leaks or accidental exposure of sensitive information in required disclosures, such as complying with a lawsuit or regulation. Most standards recommend strong encryption: cryptographically secure algorithms using large keys. AES and RSA are considered secure algorithms, and 256 bits (or the equivalent, depending on how the algorithm creates keys) is considered a good key length. While not guaranteed to be completely secure, these algorithms have existed for a while and have been publicly tested while remaining secure. As we saw in the example in the previous section, even 128-bit encryption is almost uncrackable. Larger keys, such as 256-bit keys, protect your data against massive future gains in computing power, such as those promised by quantum computing. Encryption is not a standardized, uniform process or mechanism; the exact manner in which data is encoded depends on the algorithm. Some algorithms offer greater protection for the same key length or may be recommended by certain standards. Let's take a look at some of the more common cryptographic algorithms and specifications used to encrypt data at rest today.
Advanced Encryption Standard (AES)
The Advanced Encryption Standard (AES) is one of the most popular algorithms thanks to its status as the US government standard. It's the only publicly available encryption algorithm that the NSA approves for securing top secret information. The algorithm uses the Rijndael cipher with 128-, 192-, or 256-bit symmetric keys. It's a fast and secure algorithm, designed as a substitution-permutation network. The basic way the algorithm works is that it runs several rounds of modification. It arranges the data into four-by-four matrices of bytes. For each round, it derives a round key from the primary key based on a specific schedule. In the first round, it adds the round key to the data using a logical operation called a bitwise XOR: each bit of the round key is compared with the associated bit in a block of the data. If they are the same, the resulting bit is a 0; if they are different, the resulting bit is a 1. This process is shown in Figure 6.2.
0110 XOR 0011 = 0101
Figure 6.2 An example of a bitwise XOR operation
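The same operation in Python, where `^` is the bitwise XOR operator:

```python
a = 0b0110
b = 0b0011
result = a ^ b                 # bitwise XOR of the two 4-bit values
print(format(result, "04b"))   # 0101: differing bits give 1, matching bits give 0
```

A useful consequence is that XORing with the same value twice restores the original, which is why the round key can later be removed during decryption.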
In the middle rounds, it runs the same four steps. First, it substitutes each byte with another byte as defined in a lookup table. Next, it shifts bytes within the four-by-four matrix: the first row stays the same, the second row rotates each byte one position left, the third rotates by two positions, and the fourth by three, with bytes that pass the end of a row wrapping around to its start. Then, it mixes each column of the matrix so that every input byte in that column affects all four output bytes. Finally, it adds the round key to the data as in the first round. The final round works the same as the middle rounds except it omits the column-mixing step. For 128-bit keys, AES performs 10 rounds of processing; for 192-bit keys, 12 rounds; for 256-bit keys, 14. This is how solid encryption algorithms work; they perform multiple transformations on data that cannot be reconstructed without the key. With AES, there's a non-linear substitution step; diffusion (small changes affect much of the data) from the column mixing; and interactions with key data at each step. Security researchers have published theoretical attacks that may not have practical applications, as well as side-channel attacks, which exploit information not related to the key or algorithm, such as power consumption or electromagnetic leaks. While many of these attacks lower the time needed to crack encoded text or recover keys, they rarely enter the realm of the practical under current technologies.

RSA
The RSA algorithm, created by and named after Ron Rivest, Adi Shamir, and Leonard Adleman, is one of the first encryption algorithms to use a public key. It was first described in 1977 and is still in wide use today. While asymmetric key systems like RSA most often protect data in motion, RSA can be and is still used to secure static data.
Unlike AES, which uses an arbitrary key of 128, 192, or 256 bits, RSA generates both keys from two very large prime numbers, a and b. The algorithm for RSA is much simpler than AES: both the encryption and decryption functions compute the ciphertext and decrypted text through exponential transformations built on the two primes, plus a number that is coprime (that is, the only factor the two numbers share is one) with the product of each prime minus one, (a - 1) × (b - 1). The values of these transformations are then reduced by a modulus equal to the product of the two original primes; that is, when a value reaches that number, it "wraps around" back to zero. The exact math of how this function works shouldn't concern you too much. Just know that because the algorithm uses simpler math and theoretically guessable prime numbers, the keys it uses need to be much larger to achieve equivalent security, usually between 1,024 and 4,096 bits. That security will be much worse if you (or the algorithm) select small numbers for your primes or for the additional factor. To improve the security of this encryption algorithm, most implementations also salt the data to be processed with some amount of random data. That way, attackers can't just use the public key to encrypt data to see if it matches ciphertext that they want to crack. With properly generated prime numbers, RSA can be a strong method. RSA encryption algorithms tend to be too slow for heavily used data, such as your SAP database. Instead of directly encrypting data, some users encrypt the keys for their primary encryption method using RSA. I'll talk more about key encryption and key management later in this chapter.
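The RSA flow can be sketched with deliberately tiny textbook primes (a = 61, b = 53); real keys use primes hundreds of digits long, and real implementations add padding and salting.

```python
a, b = 61, 53               # two (toy-sized) primes; real ones are enormous
n = a * b                   # modulus shared by both keys: 3233
phi = (a - 1) * (b - 1)     # 3120; the public exponent must be coprime to this
e = 17                      # public exponent, coprime to 3120
d = pow(e, -1, phi)         # private exponent: modular inverse of e (2753)

message = 65
ciphertext = pow(message, e, n)    # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)  # decrypt with the private key (d, n)
print(ciphertext, recovered)       # 2790 65
```

Anyone can run the encryption line with the public pair (e, n); only the holder of d can reverse it.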
Encrypting Data in Motion
Once an application sends data somewhere—across a network, from a file to an interface, or from a database to an application—it becomes data in motion. When you access your SAP system in a cloud environment, your data travels between several different endpoints: between your desktop client and the client server, between the client server and the application server, and between the application server and the database server. With good network security controls, those transfers should be safe and secure.
But as we know, there's always a risk of security controls failing or being subverted through negligence, software flaws, or hacker skill. In fact, data is at its most vulnerable when it's in motion. When data is at rest, you know that there is one point of entry: the machine on which it's stored. For data in motion, understanding which parties might have access is more complicated, especially once the data leaves the confines of a data center. To get from there to your client, it passes through any number of routers, on which packet-sniffing malware could be installed. Or a hacker could insert false gateways between you and your destination using man-in-the-middle attacks or proxy servers. Think of it like traveling: at rest in your house, you have pretty good control of the access points. But when you travel on planes, trains, and automobiles, you expose yourself to a variety of other spaces, some of which might even be open to the public. Your luggage (metaphorically, your data) could be handled by unknown malicious actors. Data in motion has two primary risks: privacy, that an attacker could read the transported information, and integrity, that an attacker could change it. The risks to privacy operate the same as for data at rest. The risk to integrity, however, is a different animal altogether. Imagine sending a data request to a server: the request includes authorization tokens, the information you want, and where the server should send that information. Now imagine attackers change that last part so the information is sent back to them instead. In fact, if they can capture a valid communication in the raw, they can modify it at will, gathering any information they want. To prevent this, you can encrypt communications between two network endpoints. This encryption works a little differently than at-rest encryption because the two endpoints may not know each other and do not have a shared symmetric key or paired asymmetric keys beforehand.
Instead, in-transit encryption uses one-time keys, generated for each session and shared using a public/private key pairing.

SSL/TLS
Secure Socket Layer (SSL) and its successor, Transport Layer Security (TLS), protect networked communications from eavesdropping, theft, and modification. The term SSL is now used interchangeably with TLS; the SSL protocol versions themselves (2.0 and 3.0; 1.0 was never publicly released) have been compromised and should not be used to secure critical data. For this section, unless I'm talking about the deprecated SSL versions, I'll use TLS to refer to this method. TLS relies on two stages of encryption. The first is an asymmetric public-private key pairing. When a client contacts a server and requests a TLS-protected connection, the two begin what's called a handshake. Like real-life handshakes, this procedure can become complicated. The client's opening message includes a list of the hash functions and encryption algorithms it supports (which may include AES or RSA, discussed previously in this chapter). The server picks the ones it wants to use and sends back that information with its certificate, which contains its public encryption key and is issued by a valid certificate authority. At this point, the parties have the means for secure asymmetric encryption. But the connected parties need to be able to send secure messages back and forth, and the asymmetric key only allows one-way encryption. Plus, anyone listening could have intercepted the handshake message with the public key. To guarantee that both sides send secure data and that each message comes only from the two connected parties, they need an extra step. That's where the protocol turns to a symmetric encryption method. Using the certificate's public key, the client picks a random number, encrypts it, and sends it to the server. Nobody but the client and server knows this number; an eavesdropper doesn't have the private key to recover it. From that number, both client and server generate a symmetric key. Now they can talk among themselves without the danger of nosy neighbors.
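With Python's standard ssl module, a client context that applies these defaults and refuses the broken SSL versions looks like this. No network connection is made here, and the host name in the comment is only illustrative.

```python
import ssl

# Client-side context with certificate checking on by default.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse SSL 3.0 and early TLS

print(context.verify_mode == ssl.CERT_REQUIRED)  # True: server cert must validate
print(context.check_hostname)                    # True
# context.wrap_socket(sock, server_hostname="example.com") would then run the
# handshake described above over a connected socket ("example.com" is illustrative).
```

Requiring certificate validation is what ties the handshake's public key to a server identity you can trust.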
Note: SSL has fallen out of favor as the primary method to secure communications between servers. SSL development ended at 3.0, and attacks such as POODLE (a flaw in the SSL 3.0 protocol itself) have since rendered it unsafe, while implementation bugs like OpenSSL's Heartbleed showed that even the libraries providing these protocols need constant patching. If you are using a web interface to access your SAP system, make sure that the browser that you use is current and supports TLS 1.1 or 1.2.

TLS encryption works for any data connection, including all steps within an SAP system. Data flowing between a client and its server, then to the application server, then to the database server can all be securely transferred over TLS-secured connections. TLS applies to SAP functions that use HTTP, LDAP, or P4 protocols to transfer data. For dialog or Remote Function Call (RFC) operations, SAP uses Secure Network Communications (SNC), a protocol that integrates cryptographic protection with SAP NetWeaver Single Sign-On or other external security products. You can configure your levels of protection as well as integrate with functions that SAP does not provide, such as smart cards.

IPsec
Whereas TLS and SSL encrypt data packets sent between a client and server within the two applications at the presentation layer, Internet Protocol Security (IPsec) encrypts each IP packet sent from a connected device, whether that device is a host (a single computer on a network) or a network security gateway. Because it encrypts all traffic, it can protect all network traffic between your client connections and the cloud server. Application traffic is automatically secured, regardless of whether you also have TLS/SSL enabled for your SAP connections. When you secure your connection with a VPN, you are probably using IPsec. IPsec shows up in other implementations, but this is the most common one for the majority of Internet users. IPsec uses three protocols to ensure data integrity and security: authentication headers, encapsulating security payloads, and security associations.
The authentication header guarantees that the packet being sent both authentically comes from the IP address it claims and contains the same data that was sent. It does this by including a cryptographic integrity check value that verifies the included data. Using an optional technique called the sliding window, receivers can also defend against replay attacks: each packet carries an increasing sequence number, and the receiver accepts only packets whose numbers fall within a moving window and have not been seen before. The encapsulating security payload provides authenticity, integrity, and secrecy for some or all of a single IP packet, depending on the mode in which the protocol operates. This protocol provides all encryption for packets, plus a unique, unforgeable fingerprint that ensures the packet cannot be tampered with.
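The anti-replay idea can be sketched as a receiver-side window over sequence numbers. This is a simplification of the real IPsec algorithm, with invented helper names.

```python
def make_replay_checker(window_size=32):
    """Accept each sequence number once, and only while it is not too far
    behind the highest number seen. Simplified anti-replay illustration."""
    seen = set()
    state = {"highest": 0}

    def accept(seq):
        if seq in seen or seq <= state["highest"] - window_size:
            return False  # replayed, or fell behind the sliding window
        seen.add(seq)
        state["highest"] = max(state["highest"], seq)
        return True

    return accept

accept = make_replay_checker()
print(accept(1), accept(2), accept(2))  # True True False (2 is a replay)
print(accept(100), accept(5))           # True False (5 fell behind the window)
```

The window lets the receiver discard captured-and-resent packets without keeping a record of every sequence number ever seen.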
The security associations are the encryption algorithms, keys, and other parameters that the connection uses to secure data. A security association operates in one direction only, so a secured connection between two parties has two of these. To ensure that these associations themselves are secured, the connected parties establish them using some sort of preshared secret, public/private key pairings, or other negotiated key/secret exchange. IPsec operates in either transport or tunnel mode. In transport mode, IPsec encrypts only the packet payload and leaves the headers intact. In tunnel mode, it encrypts the whole IP packet and wraps it in a new packet with a new header. Tunneling enables both VPNs and remote access functions.
Key Management
As we’ve seen, much of the power of encryption comes from the key, a piece of data used to determine how the target information will be converted into ciphertext. Without it, no one can read the encrypted data. You need to perform a bit of a balancing act, securing the keys to prevent unauthorized access while making sure those keys remain available to applications that need them. So managing these keys is, pardon the pun, key to the security of your cloud data. In a cloud environment, how you store and manage keys can grow complicated quickly. You may have multiple SAP systems, each with a database stored in a distributed environment. Each encryption method has its own key, either a symmetric key or a public key/certificate and a private key. Authorized applications need to have access to those keys, while protecting them from bad actors. That’s right, you need to protect your keys just like any other sensitive data. But preventing unauthorized access to keys is only part of key management. You’ll need to manage a key’s life cycle. The longer your applications actively use a key, the more chances that it can be compromised. Like passwords, keys need to be replaced after some period of use. As your encryption profile grows and applications begin sharing keys to access the same data, you increase your risk of losing a key in active service.
6 | Encryption
You have a few options in how you manage your keys. You could store them on a separate server from the encrypted data and the application accessing it. You could encrypt and back up the keys themselves. Or you could use a third-party key management system so that you don’t need to worry about protecting your keys yourself. Each of these has benefits, and they need not be implemented exclusively. It’s easy to overlook how dangerous it is to store a key on the same server as the data that it protects. If attackers manage to bypass security to the point where they can inspect the encrypted data, they can also inspect the key sitting on the same server behind the same compromised defenses. Instead, you can put that key in a networked location, perhaps even in a different virtual LAN (VLAN) or security zone, providing an additional layer of security. Your keys are data, the same as your SAP database; you can encrypt them, too. Key encryption is sometimes referred to as wrapping encryption, as it wraps around the other mechanisms. To access these keys, applications need access to a master key. Now you may ask, “Should I encrypt this key, too?” That leads to an infinite stack that’s encryption all the way down. Instead, make sure that this master key is strongly protected using measures such as two-factor authentication and network security controls. This brings us to third-party management systems. If the above protections around keys sound like a headache, you’d be right. But clever companies have come up with key management solutions that manage your keys transparently. A good key management system can do many of the following:
• Store and protect multiple encryption keys for multiple encryption schemes
• Transparently allow permitted applications and users to access encrypted data
• Automatically and transparently expire and replace keys at the end of their life cycle
• Generate new random keys when necessary
• Prevent accidental exposure of keys, even to unauthorized organization members and your cloud provider
• Save deprecated keys to access archived data
• Provide easy key recovery in the event that something goes wrong

Note: To standardize the key management process, IBM and a consortium of other vendors created the Key Management Interoperability Protocol (KMIP). This standard helps integrate encryption systems and key management processes so that they all work seamlessly. We implemented this standard in the Security Key Lifecycle Manager, the industry’s first KMIP-enabled solution.
As with encryption as a whole, the amount of time and money you put into securing and managing your encryption keys should depend on your business needs and appetite for risk. Key management is an additional security control on encryption, though poor key management will absolutely negate the power of your encryption mechanisms.
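The life cycle concerns above can be sketched in a few lines of Python. The `KeyManager` class below is purely illustrative (the class name, storage layout, and zero-second lifetime are our own inventions for the sketch, not any vendor's API); a real deployment would rely on a KMIP-capable key management product.

```python
# A minimal sketch of key life cycle bookkeeping -- illustrative only.
import secrets
import time

class KeyManager:
    """Tracks generation, rotation, and retirement of symmetric keys."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.active = {}       # key_id -> (key_bytes, created_at)
        self.retired = {}      # deprecated keys, kept so archived data stays readable

    def generate(self, key_id):
        # 256-bit random key plus a creation timestamp for expiry checks
        self.active[key_id] = (secrets.token_bytes(32), time.time())

    def rotate_expired(self):
        """Retire keys that have reached their allowed lifetime and
        generate replacements, mirroring automatic key rotation."""
        now = time.time()
        for key_id, (key, created) in list(self.active.items()):
            if now - created >= self.max_age:
                self.retired[key_id] = key   # still available for old data
                self.generate(key_id)        # fresh key for new data

mgr = KeyManager(max_age_seconds=0)          # zero lifetime, to force rotation
mgr.generate("hana-data-volume")
old_key = mgr.active["hana-data-volume"][0]
mgr.rotate_expired()
assert mgr.retired["hana-data-volume"] == old_key
assert mgr.active["hana-data-volume"][0] != old_key
```

Note how the retired key stays on hand: as the feature list above says, deprecated keys must remain accessible so that archived data encrypted under them can still be read.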
Encryption Within SAP
The SAP family of products includes some default encryption. In a software package that handles business-critical information across multiple verticals, encryption is a must. SAP recognized that and built in standard cryptographic measures for both at-rest and in-transit data. Many customers find that these measures provide enough security for their cloud environments; others may want to implement additional controls. Refer to the security manual for your specific SAP product for more details. Encryption security in SAP operates the same in the cloud and on premise. While many of the SAP technology platforms support encryption, you’ll probably centralize it through SAP HANA. Information about encryption for SAP HANA comes from the SAP HANA Security Guide, currently available at http://help.sap.com/hana/SAP_HANA_Security_Guide_en.pdf. Other products, such as SAP NetWeaver and SAP ERP, have built-in encryption, but SAP HANA provides the most thorough controls.
While SAP HANA uses an in-memory database, it still saves some of that data to a persistent storage area on disk. SAP HANA can automatically encrypt that stored data using the AES-256-CBC algorithm. That means it uses AES encryption algorithms—which we discussed earlier in this chapter—with a 256-bit key in cipher block chaining (CBC) mode. All pages written to the disk area will be encrypted, then transparently decrypted when loaded back into memory. The keys used only remain valid for a certain number of savepoints, then are automatically changed. The in-memory database portion will not be encrypted to maintain smooth performance. In addition to the persistent storage area, SAP HANA maintains redo logs that track and record any database changes. These logs can also be automatically encrypted with the AES-256-CBC algorithm. You should protect these logs as much as your data storage; they contain a repeatable list of every action taken to affect the SAP HANA database so that, in the event of a sudden system crash, unsaved changes can be reapplied without data losses. That means an attacker could reassemble some of the database from these logs. Each of these encryption mechanisms uses a different root key, so that an attacker with one key cannot reconstruct your entire database. These keys are held in the instance secure store in the file system (SSFS), which is in turn encrypted using the instance SSFS master key. These root keys can be changed at any time using an SQL call. For data in transit, SAP HANA supports the following protocols: • HTTP-based clients use TLS/SSL as protection • RFC connections can be protected using SNC • Simple Object Access Protocol (SOAP) connections are protected with web services security
To enable this, you’ll need to have the SAP Cryptographic Library CommonCryptoLib on the server. Both this library and OpenSSL are installed by default, but SAP recommends that you migrate to CommonCryptoLib. See SAP Note 2093286 for more information.
Note: We strongly recommend using secure protocols— TLS, SNC—whenever possible. For more information on enabling the protocols above, see the respective chapters in the SAP NetWeaver Security Guide.
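Stepping back to the data-at-rest side for a moment, the cipher block chaining mode used for persistence encryption is worth a closer look. The sketch below is a toy: the "block cipher" is a key-derived XOR pad standing in for AES, so it is not secure, but it shows how chaining each plaintext block with the previous ciphertext block keeps identical plaintext blocks from producing identical ciphertext.

```python
# Toy illustration of cipher block chaining (CBC). The "cipher" here is a
# stand-in built from SHA-256, NOT real AES-256; only the chaining structure
# is the point.
import hashlib

BLOCK = 16  # bytes per block

def _toy_cipher(key, block):
    # Invertible stand-in for AES: XOR with a key-derived pad.
    # (XOR is its own inverse, so the same function would decrypt.)
    pad = hashlib.sha256(key).digest()[:BLOCK]
    return bytes(a ^ b for a, b in zip(block, pad))

def cbc_encrypt(key, iv, plaintext):
    assert len(plaintext) % BLOCK == 0, "toy demo: no padding implemented"
    prev, out = iv, b""
    for i in range(0, len(plaintext), BLOCK):
        # Chain: XOR each plaintext block with the previous ciphertext block
        mixed = bytes(a ^ b for a, b in zip(plaintext[i:i + BLOCK], prev))
        prev = _toy_cipher(key, mixed)
        out += prev
    return out

key, iv = b"k" * 32, b"\x00" * BLOCK
ct = cbc_encrypt(key, iv, b"A" * BLOCK * 2)   # two identical plaintext blocks
assert ct[:BLOCK] != ct[BLOCK:]               # chaining makes them differ
```

Without chaining (so-called ECB mode), those two identical plaintext blocks would encrypt to identical ciphertext, leaking patterns in the data; CBC's chaining is what prevents that.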
You’ll also need to get a certificate for each virtual machine (VM) that will use TLS to transport secure data. As you’ll remember from earlier in this chapter, TLS uses an asymmetric key based on a certificate stored on the server. If your SAP system is divided into individual VMs for the client, application, and database servers, then each one needs its own certificate. And if your development life cycle includes multiple SAP environments—development, test, and production—then you may need sets of certificates for those, too, if they have TLS enabled. Once you have all of these certificates, you need to create a certificate collection in the database that contains your server’s public and private keys, as well as the public keys of the servers that you want it to communicate with securely. These keys are stored in the public key infrastructure (PKI) SSFS, which is in turn encrypted using the PKI SSFS master key. All of these keys, the root and master keys, are generated automatically when your SAP system is first installed. Like any other encryption key, you need to manage their life cycle and expire them on a regular basis according to your key management policies. If your SAP system was pre-installed and configured by a third party, you might want to change those keys once you take over the system. The less opportunity for exposure your keys have, the better. Always back up old keys when you generate new ones. If you lose your key, your data will become unreadable.
Note: Please note that keys change on reinstallation as well. If you or your provider need to reinstall your SAP system, back up your old keys to prevent being locked out of your database.
Passwords on the servers are stored securely using the standard SHA-256 algorithm, and are both hashed and salted. This is pretty standard operating procedure for password-protected servers. An internal application encryption service stores any credentials needed for outgoing connections, from things like smart data sources or HTTP destination calls from SAP HANA Extended Application Services (XS) classic applications. On the client side, your trusted connection information is stored in the hdbuserstore. This securely encrypted store allows client applications, usually custom scripts, to connect to SAP HANA without forcing the user to manually enter password information. But you need to change the default key immediately; some SAP systems, such as ABAP-based installations, ship with a default key shared across all installations. If an attacker knows the key— say, because they read the manual—they can get access to all your log-on information. Not everything can be encrypted. SAP products do not encrypt backups, database traces, or some log files. You can apply encryption to these files using other programs, but SAP will not do that for you.
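Hashed-and-salted storage, as described above, can be sketched as follows. This shows the general pattern, not SAP's exact implementation, and modern guidance would favor a slow key-derivation function (PBKDF2, scrypt, or Argon2) over a single SHA-256 pass.

```python
# Sketch of hashed-and-salted password storage -- the pattern, not SAP's
# internals.
import hashlib
import secrets

def store_password(password):
    salt = secrets.token_bytes(16)                       # unique per user
    digest = hashlib.sha256(salt + password.encode()).digest()
    return salt, digest                                  # only these are stored

def verify_password(password, salt, digest):
    return hashlib.sha256(salt + password.encode()).digest() == digest

salt, digest = store_password("correct horse")
assert verify_password("correct horse", salt, digest)
assert not verify_password("wrong guess", salt, digest)

# Same password, different user: the per-user salt makes the stored
# hashes differ, defeating precomputed "rainbow table" lookups.
salt2, digest2 = store_password("correct horse")
assert digest != digest2
```

Because only the salt and digest are stored, even an attacker who reads the password table cannot recover the plaintext password directly; they are forced into brute-force guessing.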
Implementing Additional Encryption
Before you go and encrypt your entire cloud storage, ask yourself: Do I need to encrypt this data? SAP has decent encryption capabilities off the shelf; you may not need to go further than that. Additional encryption may hinder how your SAP system performs, either by degrading performance as the system needs to add another layer of processing on every transaction or outright preventing access if keys are not stored properly. That said, you may be required by law or industry standard to provide some encryption on data considered sensitive or personal. The Payment Card Industry Data Security Standard (PCI DSS), the Sarbanes-Oxley Act, the Health Insurance Portability and Accountability Act (HIPAA), and more require cryptographic protections on specific data types. Several EU member states and California (through California S.B. 1386) require disclosure and possibly fines if personal information is compromised or
unintentionally disclosed. But organizations can avoid most disclosure requirements if they encrypt all sensitive data. It’s not just the exposure of the data that could hurt you; you could be hit with financial penalties as well. So when considering additional cryptographic controls, you need to weigh the regulatory and business requirements against the financial and performance costs. Every layer of encryption you add may lower your performance, while further protecting data. For daily use data, such as that in the database server, transferred data, and data volume storage, the existing SAP controls offered by the CommonCryptoLib library may be enough. However, you may find that additional data that your systems generate—audit files or backups, for example—needs protection that SAP does not provide.

Note: Cryptographers do not recommend that you cascade encryption: that is, encrypting an already encrypted file. Existing algorithms have been tested by highly knowledgeable experts and motivated hackers. Those that remain and have been endorsed by organizations that value secrecy, such as governments, should be considered strong. Cascading encryption only causes performance issues, as it doubles the amount of processing needed to access data. If you find a provider insisting on cascading encryption, it may be a sign that it has the wrong security priorities.
In that case, you can get commercially available software to manage your encryption needs. As with key management, having a third-party provider manage your additional encryption can simplify and enhance your security profile. These software packages provide a wide range of services, including total drive encryption, secure file transfer, and spot encryption. You’ll be adding complexity to your system, though, so think carefully before you jump into encrypting all your data.
Conclusion
Encryption provides a final protective barrier on the data itself that can prevent attackers from exposing and exploiting it. Most organizations will want some sort of encryption on both their data at rest and in transit. Fortunately, SAP provides a cryptographic library that helps secure almost all data functions in your system. In the next chapter, we’ll talk about the final piece of the data access puzzle: user accounts and controls. All the controls we’ve discussed so far in this and previous chapters are designed to stop unauthorized access, attackers that bypass the SAP system to get at your valuable data. But how do you control authorized access through the SAP system to make sure that all activity is within the scope of normal business functions? That’s up next in Chapter 7.
CHAPTER 7
User Access Controls

In the previous four chapters, we’ve talked about the security around getting to your data. We covered how data centers put access controls around both the physical and network pathways to your data. We talked about how a hypervisor seals off your virtualized servers from the hardware to prevent nosy neighbors. And we talked about how, if an attacker bypasses all these other controls, encryption protects the data itself. But there’s one access point we haven’t talked about, and you probably use it every day. In this chapter, we’ll be talking about the security around accessing the data through an SAP client or other interface. These interfaces provide legitimate access to data for those who need it, so there need to be barriers against bad actors that don’t hinder the people in your organization from doing their jobs. The system also needs to make sure those legitimate users don’t stick their noses into areas that lie outside their job function. To protect that access, SAP gives each user one or more defined roles, each with a specific set of privileges. To make sure each user is who they say they are, it uses several sophisticated authentication measures. To maintain individual user security, you can set very detailed password policies, so that each user has a password that matches the level of security that you need. And finally, we’ll talk about how you can audit your SAP system and user actions to catch any funny business if it occurs. User access security takes place in OSI layer 7, the application layer. This is because everything user related occurs within the SAP system itself. You’ll notice that we’ll mention security measures from other chapters here; that’s because those measures all support and serve data transfers on the application layer as well as on their own layers. Much of user security is the same in a cloud environment as it would be on premise.
Regardless of where a user is and where the data is, you want to ensure that the user has access to the data to do their job without having access they do not need to have. Almost everything we discuss in this chapter will apply to your existing on-premise solutions.
Users and Roles in SAP

User access in SAP applications is defined by three levels:
• Privileges. The atomic unit of user action, the individual actions that a user is allowed to perform in an SAP system.
• Roles. A set of privileges that corresponds with a job function.
• Users. The actors in an SAP system, whether human or virtual, who have one or more roles that define the actions they can perform.
To set basic user controls, your organization needs to follow three steps:
• Create roles with appropriate privileges.
• Create users.
• Grant roles to users.
This may seem fairly straightforward, but there’s a lot of flexibility and therefore complexity around how you configure each of these levels. Depending on what SAP client and database you use, the processes will differ. Let’s dive in.

Privileges

Privileges cover every possible action within an SAP system, from administrative measures to those that affect data and design-time packages in the classic repository. Privileges can get very granular, especially when it comes to those actions that affect the database. This is so that you can define who gets what access to ensure minimal opportunity for error and abuse. There are five types of privileges that you can assign: system, object, analytic, package, and application. Each of these covers a different domain within an SAP system and requires different considerations. Please see SAP documentation for more details.

Roles

A role is a collection of any number of privileges that define a business purpose. Roles allow an SAP administrator to define how individuals will use the system, and how that use will be limited. You cannot explicitly forbid
privileges in a role, though any privilege not granted in a role will be implicitly forbidden. You can even include other roles in a role so as to quickly add the associated privileges. While privileges can be granted directly to users, roles let you standardize access for users based on their jobs. Roles also let you maintain the principle of least privilege. As we’ve discussed in previous chapters, good security means minimizing the amount of access provided to just what is needed. For roles, that means granting only the privileges that users in this job role will need in order to do their job. It can be tempting to add privileges to roles that could help them work faster. But beware of convenience; it can weaken your security pretty quickly. That goes double for administrator privileges. These are some of the most powerful rights in SAP systems; they should only be handed out to actual administrator-level employees. SAP recommends that you limit most system privileges to administrative users. For more information, see the SAP HANA Security Checklist at http://help.sap.com/hana/SAP_HANA_Security_Checklists_and_Recommendations_en.pdf. Additionally, some of these privileges should never be assigned to the same role or user. SAP recommends that you avoid assigning the following privileges to the same role or user:
• USER ADMIN and ROLE ADMIN
• CREATE SCENARIO and SCENARIO ADMIN
• AUDIT ADMIN and AUDIT OPERATOR
• CREATE STRUCTURED PRIVILEGE and STRUCTURED PRIVILEGE ADMIN
These combinations would give individual users too much power in an SAP system. These roles are meant to be checks against each other so that at least two people have to review an action before they go live. If someone had both CREATE SCENARIO and SCENARIO ADMIN, for example, they could both create and approve new scenarios without any other person reviewing the scenario. If those roles fell into the wrong hands, new data could be pushed into your system without anyone noticing.
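A simple script can audit role assignments for these conflicts. The sketch below is hypothetical (the role and user names are invented, and a real check would query the system's privilege tables rather than in-memory dictionaries), but the forbidden pairs come straight from the guidance above.

```python
# Sketch of a segregation-of-duties check: flag any user whose combined
# roles grant both halves of a forbidden privilege pair.
FORBIDDEN_PAIRS = [
    ("USER ADMIN", "ROLE ADMIN"),
    ("CREATE SCENARIO", "SCENARIO ADMIN"),
    ("AUDIT ADMIN", "AUDIT OPERATOR"),
    ("CREATE STRUCTURED PRIVILEGE", "STRUCTURED PRIVILEGE ADMIN"),
]

def sod_violations(user_roles, role_privileges):
    """Return {user: [conflicting pair, ...]} across all of a user's roles."""
    findings = {}
    for user, roles in user_roles.items():
        granted = set()
        for role in roles:
            granted |= role_privileges.get(role, set())
        hits = [pair for pair in FORBIDDEN_PAIRS
                if pair[0] in granted and pair[1] in granted]
        if hits:
            findings[user] = hits
    return findings

# Hypothetical roles and users, for illustration only
roles = {
    "user_mgmt": {"USER ADMIN"},
    "role_mgmt": {"ROLE ADMIN"},
    "reporting": {"CATALOG READ"},
}
users = {"alice": ["user_mgmt", "role_mgmt"], "bob": ["reporting"]}
result = sod_violations(users, roles)
assert result == {"alice": [("USER ADMIN", "ROLE ADMIN")]}
```

Note that each role on its own is fine; the violation only appears when both roles land on the same user, which is exactly why the check has to look at the combined privilege set.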
Some privileges, though, are so powerful that they should not be assigned to any user in a production system. We’ll talk more about the servers you may want to maintain as part of your development life cycle in Chapter 8. For now, avoid assigning these privileges to active roles or users in your system:
• DATA ADMIN
• DEVELOPMENT
• _SYS_BI_CP_ALL
When designing roles, keep in mind the following general principles:
• Create small roles. That is, create roles with the smallest set of privileges that apply to the smallest set of users. Roles that get too broad on either aspect are ripe for abuse.
• Don’t grant schema-level object privileges unless that role needs to access every object within that schema.
• Spread out your administrative rights. If all of your system-level functions are attached to a single user account, that account becomes a huge security risk. Do the same for role administration: spread that responsibility around to several users.
These recommendations can be summed up by two broad guiding principles: assign narrow and diffuse. Assigning privileges and roles narrowly means limiting how much a single role or user is permitted to do. Assigning diffusely means spreading the responsibility for system and other significant functions around to several different user accounts. Initial installations of SAP include several predefined roles, as shown in Figure 7.1. These roles can be powerful, but also pose significant risks, so they should be treated carefully. Specifically, two roles pose the greatest risks: SAP_ALL, which gives access to everything, and SAP_NEW, which allows access to anything considered new in the system. You should avoid assigning these roles to users on production systems; in fact, you may want to go as far as disabling these roles.
Figure 7.1 A list of roles in an SAP system.
We’ve found that the roles most companies create are very specific to their organization. You probably won’t have a lot of use for the predefined roles. Instead, you should focus on creating roles that fit your business needs without opening up avenues for abuse. However, these existing roles work very well as templates for the custom ones that you do want to create. Role and privilege naming conventions between SAP NetWeaver, SAP HANA, and SAP Hybris may differ from each other, but the concepts remain the same.
Users

Users are the actors that interact with the SAP system, either individual end users or scripts and other external applications. Anyone who wants to interact with the SAP system, either directly or through a client application, must have an associated user. This user account controls their access to and within the system, logging them into the system and managing their privileges. Lots of customers move to cloud environments because of organizational changes, such as mergers, acquisitions, and divestitures. Roles may change in these new organizational structures, but the users won’t change as much. You can export users from old systems to import into your new setup. This will give you a head start in rebuilding your organizational structure in the new SAP system.
Figure 7.2 Creating a user in SAP.
SAP allows you to create several different types of users, as shown in Figure 7.2. The specific types of users that you can create will depend on the specific SAP product that you use. For each user type, you can set different security levels, like requiring different password policies.
User Interfaces to SAP

SAP provides a highly modular system through its many products. How you connect your user interface, and what you connect it to (the client or database server), will affect which user types, roles, and privileges are available to you. The concepts discussed above apply to all interfaces, client servers, and databases, though the specific privileges, roles, and user types may vary. Below, we’ll discuss the primary interfaces and what their user access controls look like.

SAP GUI

The SAP GUI provides an interface to all SAP applications implemented in the historic Dynpro ABAP framework. While this framework is being phased out in favor of the browser-based Web Dynpro ABAP framework, there are plenty of existing applications (and some new ones) that still use it. It offers Windows-native, Java-based, and HTML versions to provide maximum flexibility. Users and roles in the SAP GUI can be defined through the ABAP Role Repository. With it, you can design PFCG-based roles with specific privileges and a menu of commands that can be applied to any user. These roles can be composed of privileges that enable access to any transactions, program types, or web locations. The role structures that you create in the Repository translate directly into elements of the interface. Unless a privilege is added to their SAP GUI menu, a user will not even be able to see those functions. See Figure 7.3 for an example of designing a menu for a role.
Figure 7.3 Designing a menu for a role in SAP.
SAP Business Client for Desktop

The SAP Business Client for Desktop (formerly NetWeaver Business Client, aka NWBC) provides a modern desktop interface to SAP applications. It’s built on Web Dynpro ABAP, the successor to the system behind the SAP GUI. While it’s not quite as shiny as Fiori, it integrates with Fiori and provides a bridge to older SAP GUI applications. Users and roles are created in the main NetWeaver server by an administrator. It has two types of privilege available: object and system. Object privileges enable access to specific objects. System privileges, meanwhile, grant access to object types on a system-wide basis. When you use the SAP Business Client, note that it uses the Microsoft Internet Explorer browser for rendering. All settings for Internet Explorer that apply to HTML content will apply to your SAP applications, including zoning and cookie settings.

SAP Fiori Launchpad

Fiori is the most current user interface available. SAP considers this the primary interface and will be designing future offerings on it. SAP’s SaaS
offering in the public cloud only supports Fiori. If you are looking for an SAP interface for the future, Fiori is it. An end user can access all of their Fiori apps through the SAP Fiori Launchpad, as shown in Figure 7.4. The interface offers a consistent and flexible user experience across all applications and can be accessed entirely via a web browser. It’s a more modern design, streamlined to have a modular, tiled structure instead of the clunkier interface of previous versions. But some of the apps on the launchpad can open older SAP GUI or NWBC applications, as not everything has migrated to the new interface version.
Figure 7.4 An example of the SAP Fiori Launchpad interface layout.
Roles for Fiori can be created either in the SAP HANA Studio or the ABAP Role Repository. These roles then determine what apps will display in that user’s launchpad. Any privileges not given to a user will not be accessible in their interface.

SAP Enterprise Portal

The SAP Enterprise Portal is a web front-end interface to an SAP system, either open to an internal organization audience or the public. You can build public-facing websites that automatically interface with your SAP system, so you can feed it ecommerce information, web traffic, or sales leads. You can mix SAP and non-SAP content seamlessly, as well as enable collaboration through discussion features.
As this is a front end for NetWeaver, it creates roles in the same way as the SAP Business Client. All logon and user access functions occur through the browser, regardless of whether the portal is open to the public or is isolated on an intranet.
Authentication

Once you’ve controlled access within your SAP system by configuring users with roles that grant them limited privileges, how do you control access to the SAP system itself? That’s where authentication mechanisms come in. Authentication covers any method that validates a user’s identity, from manual user name/password input to single sign-on (SSO) mechanisms that use external authentication services to verify identity. Authentication applies to any connection with an SAP system, whether it’s an end user, a client application, or a script. All of these must be associated with a user in SAP, whether a database user, technical database user, or restricted user. Users and authentication are how SAP protects its system from unauthorized access. By tying a physical user or virtual process to a privilege-limited user, SAP can control who gets in and what they are able to do once in the system. Here’s how the connection and authentication process works:
• Using the selected mechanism, the system verifies that the connection is allowed by finding the associated user. For user/password authentication, that means asking for the user and password and checking it against its records. For single sign-on, it validates the connection using the certificates, tokens, or other external authenticators provided.
• Once the user passes this stage, the system then checks to see if the user is valid. User administrators can optionally set VALID_FROM and VALID_UNTIL dates to limit the time period in which users can access the system. If the current date falls within the validity period or these settings are not used, the user moves on to the next step.
• The system verifies that the user has not been deactivated. Users can be deactivated for any number of reasons, but most commonly, this will occur after too many bad logon attempts. If they are not deactivated, then the user is in.
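The three-step logon sequence above can be sketched as follows; the record layout and field names (password_hash, valid_from, deactivated, and so on) are illustrative, not SAP's actual schema.

```python
# Sketch of the credential -> validity -> deactivation check sequence.
# Field names are hypothetical, chosen to mirror the steps described above.
from datetime import date

def authenticate(user_record, supplied_hash, today=None):
    today = today or date.today()
    # Step 1: credential check (a hash comparison stands in for whichever
    # mechanism -- password, certificate, token -- is in use)
    if user_record["password_hash"] != supplied_hash:
        return "rejected: bad credentials"
    # Step 2: validity window (VALID_FROM / VALID_UNTIL, if set)
    valid_from = user_record.get("valid_from")
    valid_until = user_record.get("valid_until")
    if (valid_from and today < valid_from) or (valid_until and today > valid_until):
        return "rejected: outside validity period"
    # Step 3: deactivation check (e.g., after too many failed logons)
    if user_record.get("deactivated"):
        return "rejected: user deactivated"
    return "accepted"

record = {"password_hash": "abc123", "valid_until": date(2099, 1, 1)}
assert authenticate(record, "abc123") == "accepted"
assert authenticate(record, "oops") == "rejected: bad credentials"
record["deactivated"] = True
assert authenticate(record, "abc123") == "rejected: user deactivated"
```

The ordering matters: a connection that fails the credential check never reaches the validity or deactivation steps, so the system reveals as little as possible to an attacker probing accounts.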
SAP products support several authentication methods, each with its own security measures. Many of these methods enable single sign-on, which means that a user just has to log on once or sign into an authenticator—usually the user’s own computer—to enable access to the SAP system.

User and Password

SAP supports the basic authentication method, username and password. Users can log on directly in the SAP client that they use, with a secure password. That information is sent to the client server, hopefully over a TLS-encrypted connection at the very least. On the server, the SAP system checks the user/password information against its own records, which are stored in a hashed and salted format. In this way, SAP secures the transmission and storage of passwords to prevent unauthorized access. However, passwords present an inherent vulnerability if the user chooses bad passwords or stores them insecurely. We’ve all seen co-workers with their password on a sticky note on the computer screen. The methods we use to remember our passwords can be exploited, too. This is an organizational problem. Your SAP system and cloud environment can’t stop a user from keeping a list of their passwords in a text document on their desktop. You will need to educate your organization on the dangers of insecure password storage through security policies. You may even want to look at password management software. Passwords also pose a problem if the user chooses a weak password. Weak passwords are those that can be easily guessed, use a limited subset of characters like only lowercase letters, or are reused in several places. Hackers can use brute force attacks to crack passwords, so the less predictable they are, the better. Don’t use your childhood dog’s name; hackers can run a dictionary attack—that is, an attack that tries every dictionary word—to lower the time needed to break into your account.
SAP has tools to assist with this problem, shown in Figure 7.5. You can configure any number of password policy settings that force users to create stronger passwords, to change them regularly, and avoid reusing old passwords.
Figure 7.5 A password policy screen in SAP NetWeaver.
SAP applications may offer the following password policy options:
• Minimum password length. Set the least number of characters that a password must contain. Longer passwords are harder to crack, but are also harder to remember.
• Required character types. Select whether the password must contain at least one lowercase letter, uppercase letter, numerical digit, and/or special character, like ! or %. Passwords with more types of characters resist brute force attacks better.
• User must change password on first logon. Select whether a new user must change their password, usually set by an administrator or automatically generated, the first time that they log on to the system. This prevents anyone, including the administrators, from knowing that password, as it is usually emailed to the user when their account is created.
• Number of last passwords forbidden. The number of previous values that cannot be reused when changing passwords. Previously used passwords have been in use for the maximum allowed password lifespan and have a greater chance of compromise.
• Number of allowed failed logon attempts. Set the number of times a user can enter an incorrect password before they are locked out of the system. If an attacker attempts a brute force attack, they will not be able to try again after the maximum number of unsuccessful attempts.
• User lock time. Enter the amount of time that a user will be unable to log onto the system after exceeding the maximum number of failed logon attempts. You can set this to indefinite, which would require administrator intervention to unlock the account. User locks help frustrate brute-force attacks, as an attacker cannot continue attempting to log on to a locked user account.

• Minimum password lifetime. Sets the minimum number of days before a user can change their password again. This prevents hackers from hijacking new accounts.

• Maximum password lifetime. Sets the maximum number of days that a user can keep the same password. The longer a password is in active use, the more chances it has to be compromised, so users should change them regularly. Only technical database users should be allowed unlimited password lifespans; all others should expire at some point.

• Lifetime of initial password. Set the maximum number of days that the initial, administrator-created password remains valid. Initial passwords, because they have been exposed to multiple people, should be changed in order to maintain a secure system.

• Maximum duration of user inactivity. Set how long it takes for a password to expire if a user has not logged on. This protects against hijacking of accounts belonging to users who have left the organization or moved to a job role that doesn't require SAP.

In addition to these policy settings, you can create a password blacklist. This list can contain any word that you don't want to be used as a password or part of a password, like the company name or the word 'password.' This will help prevent easily guessable passwords from being used in your SAP system.

These password rules can help ensure strong passwords, but will also make those passwords less convenient for the end user. Like a lot of the security tradeoffs we discuss, how strong a password your policy requires depends on the data you're protecting and your business needs.
Weigh the risk of potentially weaker passwords against the productivity slowdowns and inconvenience of difficult-to-remember passwords.
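A policy like the one above boils down to a simple validation routine. The setting names and values here are hypothetical, chosen to mirror the options listed, not actual SAP profile parameters:

```python
import re

# Hypothetical policy values mirroring the settings described above.
POLICY = {
    "min_length": 10,
    "required_classes": [r"[a-z]", r"[A-Z]", r"[0-9]", r"[^a-zA-Z0-9]"],
    "history_size": 5,                      # number of last passwords forbidden
    "blacklist": {"password", "companyname"},
}

def check_password(candidate: str, previous: list[str]) -> list[str]:
    """Return a list of policy violations; an empty list means the password passes."""
    errors = []
    if len(candidate) < POLICY["min_length"]:
        errors.append("too short")
    for pattern in POLICY["required_classes"]:
        if not re.search(pattern, candidate):
            errors.append(f"missing character class {pattern}")
    if any(word in candidate.lower() for word in POLICY["blacklist"]):
        errors.append("contains blacklisted word")
    if candidate in previous[-POLICY["history_size"]:]:
        errors.append("reuses a recent password")
    return errors

assert check_password("V3ry!Secur3Pass", previous=[]) == []
assert "too short" in check_password("aB1!", previous=[])
```

In a real deployment the SAP system enforces these rules server-side; a client-side check like this is only a convenience for users.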
Single Sign-On
Given the inherent risks and insecurity of user/password logons, SAP has included some more convenient, and possibly more secure, single sign-on methods. These methods rely on trust-and-verify systems, such as trusted identity authorities, secret keys within a network domain, and XML-based verifications. The major benefit of single sign-on is that it makes logging on to your SAP system seamless. It's kind of a big deal. SAP offers Single Sign-On as a separate product, which provides securely encrypted logon, password management, identity provider, and token services features that enable the methods described below. With single sign-on, users may still need to log on. But once they log on successfully, that's when the single sign-on mechanism springs into action. Recent updates have enhanced this process to include a two-factor authentication method. We'll talk more about that later in this chapter.

SAP applications support several different methods for single sign-on. These include Kerberos/Active Directory, SAML, and SAP logon and assertion tickets. Regardless of which method you choose, any authentication with the SAP server should occur over secured connections, using SSL/TLS or SNC, a VPN, or both. Unsecured authentication requests can set you up for replay attacks, in which an attacker captures the authentication proof and uses it to gain access.

For a user to be able to use one of these methods, that method needs to be enabled for them. You can enable all methods or only specific ones. You may want to enable only the methods that any given user will actually use, to reduce the possibility of exploitation; the fewer methods available, the narrower your attack surface. (One exception: if you are using SAP HANA dynamic tiering, it is not possible to disable logon and assertion tickets as an authentication mechanism.) Let's take a look at each of the single sign-on methods and what you need to do in order to implement them in your SAP system.
Kerberos
Kerberos is a strong network authentication protocol developed by the Massachusetts Institute of Technology (MIT). It allows client and server applications to prove their identities to each other using secret-key cryptography. SAP supports Kerberos Version 5 or later using either Active Directory or an external Kerberos server. For HTTP access in SAP HANA XS, you can authenticate with Kerberos using the Simple and Protected GSS-API Negotiation Mechanism (SPNEGO).

If you use Active Directory in your internal network, you can use it to implement Kerberos single sign-on, as Kerberos is an integrated part of that software. You will need to add the SAP servers to your domain controller and either configure a child domain controller in your cloud environment or let your cloud provider manage the domain controller. In this way, the cloud servers become a trusted part of your network. If you do not use Active Directory, you will need to download and install Kerberos; MIT provides a free implementation that you can install on the client server VM.

After configuring SAP to use Kerberos single sign-on, you can start mapping users. Both versions require you to map all users to their external identities within Kerberos. The Kerberos server stores the symmetric keys of trusted parties, so when a client connects, the server sends back an encrypted session key and a ticket-granting ticket that the client cannot decrypt itself. With these, the client and server can negotiate and verify the identity of both parties.
X.509 Certificates
For both SAP GUI and HTTP/HTTPS connections, including web client and Fiori connections, you can enable X.509 certificates. These are standard public-key certificates that allow the server and client to prove their identities to each other. Unlike web-of-trust public key methods like PGP, X.509 certificates need to be signed by a valid certificate authority. When a user initially authenticates using their username and password, the logon server issues them an X.509 certificate that authorizes them to connect to the SAP server for a limited amount of time. Each user needs an X.509 digital certificate stored in the server's public key infrastructure (PKI) secure store (SSFS). The process is shown in Figure 7.6.
Figure 7.6 SAP authentication using X.509 certificates. The business user first authenticates (step 1) against the Secure Login Server (on SAP NetWeaver AS Java), which issues an X.509 certificate (step 2); the user then starts their desktop client, app, or browser and opens a connection (step 3) and performs certificate-based authentication (step 4) against SAP NetWeaver AS ABAP, via SAP GUI and RFC with SNC or a browser with SSL/TLS client authentication, or against other web servers.
SAML 2.0
SAML 2.0 is an XML-based authentication and authorization standard designed by the OASIS Security Services Technical Committee as an open standard. The XML file publishes a flexible identity assertion; that is, it contains any number of attributes that help verify the identity of the client. SAP uses a specific and limited subset of these attributes as part of its XML file. The XML file that asserts a user's identity must be issued by a trusted identity provider, such as an Active Directory server or another identity management provider. In the SAML 2.0 authentication process, the client presents its assertion to the server, which verifies it against a store of trusted certificates issued by a certification authority. You will need to map the SAML identities to database users, either directly in SAP or in the identity provider. This single sign-on method works for both ODBC/JDBC and HTTP connections with either SAP HANA or SAP HANA XS. However, you cannot use this method with SAP HANA Studio.
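To make the assertion idea concrete, here is a minimal sketch that checks the validity window of a SAML-style assertion. The element names are simplified (a real SAML 2.0 assertion is namespaced and far richer), and the signature verification that a real implementation must perform is deliberately omitted:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# A toy assertion; real SAML 2.0 assertions are namespaced, signed XML documents.
ASSERTION = """
<Assertion>
  <Subject><NameID>alice@example.com</NameID></Subject>
  <Conditions NotBefore="2024-01-01T00:00:00Z" NotOnOrAfter="2034-01-01T00:00:00Z"/>
</Assertion>
"""

def assertion_is_current(xml_text: str, now: datetime) -> bool:
    """Check only the validity window; signature checking is omitted in this sketch."""
    conditions = ET.fromstring(xml_text).find("Conditions")
    not_before = datetime.fromisoformat(conditions.get("NotBefore").replace("Z", "+00:00"))
    not_on_or_after = datetime.fromisoformat(conditions.get("NotOnOrAfter").replace("Z", "+00:00"))
    return not_before <= now < not_on_or_after

assert assertion_is_current(ASSERTION, datetime(2025, 6, 1, tzinfo=timezone.utc))
assert not assertion_is_current(ASSERTION, datetime(2035, 1, 1, tzinfo=timezone.utc))
```

The time bounds illustrate why assertions, like the tickets discussed next, depend on reasonably synchronized clocks.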
SAP Logon and Assertion Tickets
SAP has a built-in single sign-on method that you can use if the previous methods don't work for you. With logon and assertion tickets, you can create trusted connections with users so they don't have to enter their username and password every time they access the system. This trust process requires signed certificates from a trusted certificate authority, like the other methods, but all verification and negotiation of authentication happens between the SAP client server and the end user's web browser. Note: Logon tickets only apply to users using web-based access to connect to an SAP system. Users connecting with a GUI or as technical database users will need to use other methods to enable single sign-on.
To configure this, you'll need to do the following:

• For each HANA database user that you want to authenticate using these tickets, configure them for logon tickets.
• Set up a trusted relationship between the server and client computers in the ABAP backend server.
• Ensure that the user has the same name in all systems.
• Make sure that the client computers and server synchronize their system clocks. This method uses timestamps to prevent replay attacks, so if a client and the server have different system times, these authentication cookies may fail.
• Make sure that the end user's browser can accept cookies.

When a user initially logs onto the system, that's considered the initial proof of authenticity. At that point, the server issues them a digitally signed cookie: the logon ticket. That ticket will get them back into SAP systems for as long as it remains valid.
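The clock-synchronization requirement matters because ticket validation compares timestamps. A rough sketch of the idea, with hypothetical skew and lifetime values (the actual tolerances in SAP systems are configurable and differ from these):

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)        # hypothetical clock-drift tolerance
TICKET_LIFETIME = timedelta(hours=8)   # hypothetical ticket validity period

def ticket_is_valid(issued_at: datetime, now: datetime) -> bool:
    """Reject tickets issued 'in the future' (clock drift) or past their lifetime."""
    if issued_at - now > MAX_SKEW:
        return False  # client and server clocks disagree by more than the tolerance
    return now - issued_at <= TICKET_LIFETIME

now = datetime.now(timezone.utc)
assert ticket_is_valid(now - timedelta(hours=1), now)
assert not ticket_is_valid(now + timedelta(minutes=30), now)  # clock drift too large
assert not ticket_is_valid(now - timedelta(hours=9), now)     # ticket expired
```

If a client's clock runs half an hour fast, its freshly issued tickets look like they come from the future and fail validation, which is exactly the symptom unsynchronized clocks produce.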
Two-Factor Authentication
For those seeking extra user security, you can enable two-factor authentication in some SAP systems. We touched briefly on two-factor authentication in Chapter 3; it validates identities based on two factors: something you know and something you have. The authentication measures above rely mostly on something you know, namely username and password combinations. More recent versions of SAP allow server administrators to add something that the user has.

The primary way that SAP enables this is through the SAP Authenticator mobile app. When a user tries to log on with two-factor authentication enabled, their mobile app generates a time-based one-time password (TOTP). This password, in addition to their username and password, provides them access for a single logon attempt. The second factor in this case, the thing that the user has, is the mobile phone. Because the single-use password only appears on the mobile phone registered to that user, any attacker attempting to gain access needs to either be near that phone or somehow redirect the password. This requires either a specific location or greater skill and increases the difficulty of an attack. If your data is critical enough to require additional measures, this two-factor authentication solution, shown in Figure 7.7, is simple enough to implement.
Figure 7.7 Two-factor authentication using the SAP mobile app. The SAP Authenticator app on the user's mobile device computes a TOTP from a secret key and the current time; the user enters their password (first factor) and the TOTP (second factor) in the Secure Login Client or desktop browser, and the TOTP Login Module on SAP NetWeaver AS Java verifies both credentials before providing a security token for single sign-on.
Auditing User Actions
The previous portions of this chapter discussed ways you can limit and authorize user access. Once a user gains access to the SAP system, you have an additional security tool at your disposal: auditing. Auditing can log any actions that users perform in SAP, so long as those actions affect the database in some way. Audit trails, as they are called, can be highly granular and detailed, recording everything down to keystrokes. The tradeoff here is that these audit files can grow massive quickly and still need to be analyzed in order to yield valuable information. With a well-considered audit policy, you can spot users who have the wrong privileges, detect attempts at security breaches, and protect against allegations of data misuse. These trails will catch everything from data modifications and user authorization changes to changes in system configuration.
But because these trails can only see database actions (such as read or write), there are a few important things that they will not catch. For example, you won't be able to see who triggered database upgrades, as that occurs only when the database is offline.

Audit Policy
To get started auditing user actions, your server administrator will need to set up an audit policy. A policy defines what should trigger audit event logging. Your system administrator defines these through their associated SQL statements. They can also specify a user to audit for these actions. In some extreme situations, you may want to audit all actions performed by highly privileged users to ensure that they act within reasonable parameters. These sorts of users, if compromised, can cause incredible damage, so monitoring them may help spot problems early.

Each audit policy has several pieces that define how it monitors usage: an action, status, target object, user, and audit level. These pieces must be compatible in order to target individual actions. If you want to monitor multiple incompatible actions, you can create multiple policies.

The action refers to the specific SQL command(s) performed. These can be indicated very specifically, as in only DELETE actions on a single target object, or very broadly, like GRANT ANY, which covers GRANT PRIVILEGE, GRANT ROLE, GRANT STRUCTURED PRIVILEGE, and GRANT APPLICATION PRIVILEGE. As mentioned above, you can monitor all actions, but only when targeting a specific user. You can combine multiple (or all) data manipulation actions, like SELECT and INSERT, into a single policy so long as they have a specific target object.

The status refers to whether the action succeeded or not. You may want to monitor only successful data INSERTs, unsuccessful authorization attempts, or both successful and unsuccessful GRANT ROLE attempts. Some actions should be watched more closely on successes or failures, while others should be monitored regardless.
The target object is the specific schema, table, view, or procedure to audit. If you monitor a schema, the audit policy will trigger on any of the objects that it contains as well. In any single policy, the action has to be valid for all target objects. That mostly applies to the EXECUTE action and procedures, as they can only be specified together. You can even create audit policies on objects that do not yet exist, but such an object needs to be compatible with the associated action.
The user can be either those users who trigger the policy when they perform the action, or those who are exempt from auditing on that action. If you exempt a single user, then the policy will trigger when any other user performs the stated action. Like objects, the specified user does not need to exist at the time that the policy is created.

The audit level is an informational flag that indicates how severe the audited action is. There are five levels, from EMERGENCY down to INFO. You can search for these flags when reviewing audit trails in order to address the most severe issues first.

SAP contains several default audit policies that cannot be changed. These involve actions that directly affect audit policies, audit logs, or the audit settings. You can't remove these audit policies because, if an attacker changed them and no audit caught it, then subsequent audits probably wouldn't catch whatever mischief they get up to in your system.

SAP offers a few best practices to prevent auditing from causing too much of a performance strain during operations:

• Create as few policies as possible.
• Combine targeted actions when possible.
• Only audit data manipulation actions if absolutely necessary.
• Don't double up audit targets or target internal database tables, especially when they are covered by administrative actions.

Using Audit Trails
At some point, you'll have to review the files that your audit policies create. Whether you create them as .csv files or as additional database tables, you will probably want to use some sort of analytics tool to examine them. These files contain a great deal of information about each tracked action, but for cloud environments, you may want to pay specific attention to the client IP values. When we create audit policies, we tend to suggest the following reports:

• Unsuccessful login attempts.
• The EarlyWatch report, which indicates if users have SAP_ALL or SAP_NEW profiles, among other security concerns.
• Users with conflicting roles. For example, someone who can both enter an invoice and write a check. These accounts are ripe for abuse, as a malicious actor could use them to enter an invoice for themselves and write the check without any external approval.
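If your audit trails land in .csv files, a report like the unsuccessful-logon one above can be a few lines of scripting. The column names and values here are hypothetical, since actual audit trail layouts vary by SAP release and configuration:

```python
import csv
import io
from collections import Counter

# Hypothetical audit trail excerpt; real layouts differ by release and configuration.
SAMPLE = io.StringIO(
    "timestamp,user,action,status,client_ip,level\n"
    "2024-05-01T08:00:00,alice,CONNECT,UNSUCCESSFUL,203.0.113.7,WARNING\n"
    "2024-05-01T08:00:05,alice,CONNECT,UNSUCCESSFUL,203.0.113.7,WARNING\n"
    "2024-05-01T09:12:00,bob,CONNECT,SUCCESSFUL,198.51.100.2,INFO\n"
)

def failed_logons_by_ip(csv_file) -> Counter:
    """Count unsuccessful connection attempts per client IP, a quick brute-force indicator."""
    rows = csv.DictReader(csv_file)
    return Counter(r["client_ip"] for r in rows
                   if r["action"] == "CONNECT" and r["status"] == "UNSUCCESSFUL")

assert failed_logons_by_ip(SAMPLE) == Counter({"203.0.113.7": 2})
```

Grouping failures by client IP is exactly the kind of cloud-specific view the paragraph above suggests, since an unfamiliar address hammering the logon endpoint stands out immediately.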
Governance, Risk, and Compliance (GRC)
SAP has several tools that can make a lot of the identity, access, and audit process easier. These fall under the governance, risk, and compliance heading. This suite of products covers a lot of ground, including some areas that fall outside the scope of this book. However, we'd like to touch on a few of the tools that do apply to user access security and how they can make your system run smoother and more securely.

• Access Control. Allows you to automate access control compliance through regular checks and automatic reviews of user permissions and potential violations. Creates workflows to automate new access requests. Adds a detailed audit trail of all user and role actions.

• Cloud Identity Access Governance. Built on SAP HANA, this gives you comprehensive identity management tools so you can create and manage user identities across systems in a single interface. It includes plenty of analysis tools and preconfigured audit reports that help locate risky user access. In addition, this tool includes single sign-on capabilities for your user base.

• Identity Management. This tool lets you manage all users across all SAP environments in a central location. You can drive this management through automated rules, customized workflows, and self-service user maintenance. Included with this are powerful reporting and auditing tools that let you investigate individual user events.

• Audit Management. Enhances existing audit capabilities by processing all audit trail information with flexible analytics. It lets you automate audit reports and track any audit issues in a workflow. Audit trails contain a lot of information, but can be hard to process. With this tool, you can create reusable report templates that tell the story that internal and external auditors need to hear on a regular basis.
Conclusion
User access control is the final stretch of network access to your SAP system, and it operates within the system itself. By limiting how much power any individual user has, you limit the opportunity for abuse, whether by an external bad actor or an internal user. SAP includes strong, flexible controls that will ensure that the individuals accessing the system are supposed to be there while also making it seamless for them to log back on. And once users are in the system, you can flag specific actions to review later and make sure that their actions fall within acceptable behaviors.

This chapter is not meant to be a definitive SAP user access resource. SAP provides security education and documentation on all their releases. Here are some current links to some of their materials:

• https://help.sap.com/saphelp_nw75/helpdata/en/4a/af6fd65e233893e10000000a42189c/frameset.htm
• https://help.sap.com/saphelp_nw75/helpdata/en/4a/114fce13271018e10000000a42189b/content.htm
• https://help.sap.com/saphelp_nw75/helpdata/en/e9/8b24b42425455b9dbe9e7573bdbc7c/frameset.htm
• https://help.sap.com/saphelp_nw75/helpdata/en/49/bf6e8101755d5de10000000a421937/frameset.htm

Now that we're out of the access half of the book, we can start talking about the security controls that affect more policy areas. First up is patch and change management. A lot of the holes that security researchers and hackers find in SAP and the existing networking and hypervisor technologies it relies on receive fixes pretty quickly. Implementing those fixes requires deliberate management policies about their schedules, testing, and downtime. We'll cover what you should expect from a provider in these areas as well as what you can do on your own systems.
CHAPTER 8
Software Updates

In many of the previous chapters, we've talked about how flaws in software code can pose security risks, whether the software in question is your operating system, the hypervisor, or the firmware on a networking device. Fortunately, companies look for flaws in their own software and patch them regularly. Most software packages continue to grow and change through regular software updates. These updates can be bug fixes, which close security holes, or they can be feature updates, which add new functionality and potentially new security flaws and bugs. You want the positive changes as soon as possible, but you always have to worry about introducing new problems. New code can mean new coding errors. The upgrade process itself could have issues. In 2007, a survey of 50 IT administrators found that software upgrades failed about 8 percent of the time.1 This isn't always because the software upgrade is faulty; systems are a complex network of interdependencies, and failures can happen in several areas within linked systems.

You and your provider need to apply important fixes regularly and in a timely fashion. Your SAP cloud system may need nearly 100% uptime, as downtime can affect your organization's ability to perform business-critical processes. Your provider needs to balance continual availability with addressing the security issues that these patches fix. Your service level agreement (SLA) should cover exactly how much uptime you can expect and if and when planned downtime occurs.

The way providers handle this process is called change management. Change management aims to minimize the effects of any new piece of code inserted into a computing ecosystem. It's an integral part of the operations and maintenance portion of the software development life cycle (SDLC), which, if you have an actively developed SAP system, you and your cloud provider are part of. This chapter will cover what you should expect from your provider in terms of change management and how you can configure your VM landscape to support strong controls and systems availability around the software update process, including introducing new objects into your SAP system.

1 Crameri, O., et al.: Staged deployment in Mirage. In: Symposium on Operating Systems Principles, Stevenson, WA, October 2007, pp. 221-236 (2007)
Patches
Every piece of software, especially at the enterprise level, has regularly published patches. In a highly networked world, you only need to browse to a website to find new software updates. In some cases, the software itself will monitor for updates and inform you when one is ready to download. Others, like Windows OS products, may apply patches by default, which can cause problems.

Most frequently, a patch closes a vulnerability in software. In SAP, patches are delivered as Notes and include a fair bit of information about the specific issue that they address. Your operating system, database software, VPN software, and firewalls will also have patches that you need to apply. Your provider will have a wide range of software and hardware that they need to monitor, including the hypervisor, network appliances, and the firmware of all the hardware in their cloud environment.

When a patch is applied in a system, it usually affects an executable file or other execution library that loads into memory when that program runs. Because of this, the program often needs to be shut down before the patch can be applied. For system-critical software, including hypervisors and OS files, this usually means that the entire system may need to be shut down and restarted.

But no software stands alone, especially not in a cloud environment. Your SAP system links multiple pieces into a whole: your backend, your client server, your GUI, and more. You might even have different client applications to access your data. If your system draws data from or sends data to other applications, then that's another dependency that complicates your upgrades. There's a lot that can go wrong during an upgrade. Maybe some configuration change overwrites your custom settings. Maybe the
process of upgrading is flawed and breaks something. Maybe it introduces a new bug that only shows up with your particular set of data and settings. Or maybe there's a piece of hardware that you use that doesn't play nice with the new changes. In creating a change management program, you and your provider both, separately and together, need to follow three steps: identify, test, and implement.
Identify
Before you start downloading patches haphazardly, you need to take a full accounting of the possible items that could change. This includes all of your software and hardware configurations, both in your client connection and in the cloud environment. But that's not all; you will also need to monitor your licenses, encryption certificates, and supporting libraries like ODBC and SQL, and manage the life cycle of hardware and software.

First, create an accurate and up-to-date inventory of all the software and hardware that you use. Find the websites for each vendor and ensure the provider or your IT department monitors their support channels. Ideally, these companies will push changes to you, either within the software or as emails, RSS feeds, or social media communications. Your provider should be able to provide a list of all the software that they monitor and how they determine when something needs an upgrade. If software that you use is not on their list of supported items, you'll need to figure out how you can identify when patches need to be applied.

Note: IBM maintains an entire division around monitoring and testing any changes that occur with any of the software that we support. This division, IBM Security Service, includes a number of ethical hackers on the IBM X-Force Threat Intelligence team, who attempt to find new exploits in any updated software. In addition, they monitor security channels, from researchers to black hat backchannels, to get early warnings on potential software security flaws as they come to light. They provide guidance on when patches should be applied. X-Force has been doing this for years and is a well-oiled machine. You can read their yearly threat intelligence report on their website.
Next, maintain a list of your expirations. Most software licenses operate on yearly renewals, though some restrict you by usage numbers. Usage numbers can be anything from the number of actual users on a server to the number of transactions or the value of all assets in an accounting system. Centralize this information in your IT department, so when expirations approach, they can send renewals through before any disruptions occur. In many cases, your providers will be responsible for license management, so ensure you have a clear understanding of their management processes.

If your provider finds a patch that needs to be applied to your cloud environment, they should notify you that this change will be part of their next implementation (discussed later in this chapter). The exact methods of this notification should be worked out between you and the provider before signing any contracts. Some issues are so critical that it's worth bringing a server offline immediately and plugging the hole; those are rare.

The application itself may have some sort of automatic update function. We find that these functions can be incredibly disruptive in cloud environments, as they can apply patches without anyone in IT knowing about it. Sometimes, they can even trigger restarts and take computers offline at unplanned times.

Once you have identified a patch or other change, you need to identify how severe the issue it fixes is. Not all issues are created equal. Assess the vulnerabilities using a standard risk evaluation. What's the likelihood of this being exploited? Is it easy? Do attackers have to go through other security controls to get to it? How badly would this impact your business if exploited? More often than not, issues will be given a severity rating by the vendor. Either way, it's worth determining how severe the issue is to you and your organization. Based on that severity, it's time to push that change into the schedule.
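The risk evaluation described above can be reduced to a simple triage sketch. The weights and thresholds here are illustrative, not an official methodology such as CVSS:

```python
# Illustrative risk-scoring sketch for patch triage; weights are assumptions, not a standard.
LIKELIHOOD = {"low": 1, "medium": 2, "high": 3}   # how easily the flaw is exploited
IMPACT = {"low": 1, "medium": 2, "high": 3}       # business damage if it is exploited

def patch_priority(likelihood: str, impact: str, internet_facing: bool) -> str:
    """Combine likelihood and impact, bumping anything reachable from the internet."""
    score = LIKELIHOOD[likelihood] * IMPACT[impact] + (2 if internet_facing else 0)
    if score >= 8:
        return "apply immediately (consider emergency downtime)"
    if score >= 4:
        return "schedule for next maintenance window"
    return "bundle with routine updates"

assert patch_priority("high", "high", internet_facing=True).startswith("apply immediately")
assert patch_priority("low", "low", internet_facing=False) == "bundle with routine updates"
```

Whatever scale you adopt, the point is to make the "emergency downtime or next window?" decision explicit and repeatable rather than ad hoc.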
The first step in any change schedule is testing the new changes. You need to make sure that the cure isn’t worse than the disease.
Test
As we've mentioned before, software is written by humans, and humans are often well-intentioned, but never perfect. As a result, most software will have some flaws, and that's why we need patches. But patches and other updates are also written by humans, so they can introduce new problems into your system. That's why you need to test any change that you make to your cloud-based SAP systems before applying it to live, in-use servers. According to a report from The International Working Group on Cloud Computing Resiliency (IWGCR), unplanned outages cost cloud providers over $70 million between 2007 and 2012. These errors hurt, so you and your provider should be doing everything you can to prevent them.

When testing software, you may want to use multiple SAP systems. These additional systems provide fail-safes so no surprises hit your live data the moment you install a new patch. Many organizations use up to four different systems as part of their SDLC (shown in Figure 8.1):

• Development. The first-line server, where new SAP objects, roles, and views can be created and system changes first applied. Development (informally, dev) isn't a throwaway system that you can wipe anytime things break. It's a record of all changes applied to your SAP system, so you should attempt to maintain continuity. However, this server probably won't have the same security controls as your production server, especially when it comes to user access.

• Test. Once a change is released from dev, it moves over to the test server. This server can and should be broken during the course of testing. If something can go wrong with an update, you want it to go wrong here. You should be willing to wipe and restore this system regularly. Often, test systems are created as a copy of production. If you store regulated data, like sensitive personal health information, using real data can be problematic. You may want to create manufactured "dummy" data that resembles real data but has no connection to it. Either way, the data on this system should have the same format as real data so you can simulate how the system will be used, but not necessarily the entire set. Your security controls will probably be weaker on this server, same as on dev, though if you use production data, you may want to use the same user controls. You can use the Central User Administration (CUA) tool to manage users from a central location across different systems.

• Staging. This is the waiting room of servers. Anything that garners the approval of your QA team can come here to get a final test on real data. Staging and production systems are typically physically identical, with almost the same data, hardware, and security
CHAPTER 8
Software Updates | 8
8 | Software Updates
CHAPTER 8
controls. This server is an optional step that some users like; it provides an extra level of security and testing.

• Production. The last stop for any changes. This is the live, in-use data that your organization interacts with every day. You want to weed out any possible problems before any changes get to this server stage. Any problems that get to this stage can cause you downtime and delays.

All of these systems should be isolated on the network into their own security zones.

[Figure: the stages Development, Testing, Staging, and Delivery mapped onto an SAP dev server, a QA server, a clone of production, and production, with numbered export/import steps moving users and roles from development through test to production.]

Figure 8.1 An example of the VMs involved in a testing scheme.
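The four-system flow above can be sketched as a simple promotion gate: a change only moves to the next system once someone signs off on the current stage. This is an illustration of the idea, not SAP’s transport system; the stage names and sign-off rule are assumptions for the example.

```python
# Illustrative sketch of a four-system promotion pipeline (not an SAP
# transport API): a change moves forward only after the current stage
# signs off, and production is always the final stop.
STAGES = ["development", "test", "staging", "production"]

class Change:
    def __init__(self, description):
        self.description = description
        self.stage = 0  # index into STAGES; every change starts in development
        self.signoffs = []

    def current_stage(self):
        return STAGES[self.stage]

    def promote(self, signed_off_by):
        """Advance to the next system, recording who approved the move."""
        if self.stage >= len(STAGES) - 1:
            raise ValueError("already in production; nothing to promote to")
        self.signoffs.append((STAGES[self.stage], signed_off_by))
        self.stage += 1

change = Change("apply kernel patch")
change.promote("dev lead")      # development -> test
change.promote("QA team")       # test -> staging
change.promote("change board")  # staging -> production
```

The point of the sketch is the one-way ordering: there is no way to reach production without passing, and recording, every earlier stage.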
Your provider will have their own testing procedures for their upgrades. It may be difficult for them to simulate the entire environment, especially if the patch affects a network appliance or other hardware/software with a lot of dependencies. They can create a test VM on a server and let it operate in the larger cloud ecosystem, but that gives up some of the isolation controls one should have on any sort of test environment.
When creating your own SAP test VM, you’ll need to create a separate SAP installation, complete with all servers and modules that you use. You can use a subset or older backup of your data, though actually putting together a subset of data can be very difficult. SAP has a product, Test Data Migration Server (TDMS), that can create data subsets and obfuscate sensitive data, though it’s complicated, and we’ve yet to see anyone implement it successfully. Note: Be very careful with test data and the controls you have around it. Auditors pay close attention to test servers, as these are very easy to leave unsecured. Even if manufacturing test data is difficult for your system, you may need to do so in order to meet your regulatory and compliance obligations.
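One common way to manufacture test data that keeps the shape of production data without any usable link back to real people is deterministic masking. The field names and masking rule below are invented for illustration; real SAP tables, and any scrambling product you use, will differ. Note too that a one-way hash of a name is pseudonymization, not full anonymization, so regulated data may need stronger treatment.

```python
# Sketch of format-preserving masking for test data. Field names are
# invented, not real SAP tables. Identifying values are replaced with
# fakes derived from a one-way hash, so the data keeps a realistic
# shape but no longer names a real person.
import hashlib

def mask_name(real_name):
    digest = hashlib.sha256(real_name.encode()).hexdigest()
    return "Patient-" + digest[:8]

def mask_record(record):
    return {
        "patient": mask_name(record["patient"]),
        # keep non-identifying fields so tests exercise realistic data
        "diagnosis_code": record["diagnosis_code"],
        "visit_date": record["visit_date"],
    }

production_row = {"patient": "Jane Doe", "diagnosis_code": "J45",
                  "visit_date": "2016-11-02"}
test_row = mask_record(production_row)
```

Because the masking is deterministic, the same production value always maps to the same fake, so relationships between rows survive into the test system.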
Some of our customers like to have a staging server as a buffer between testing and production, just in case. This gives you a second chance to catch any problems that the change may cause. This system should be an exact copy of production: controls, data, all of it should be the same. If anything slipped past your testing server, it will probably be because of some difference in the setup of your test and production servers. Why not just have test be a direct copy of production instead of having a staging area? User controls. You don’t want to give everyone access to all of your live data. On your test server, QA users will probably need wide access. On a staging area, you can limit that while still watching for unexpected results. Your security controls on the test server will necessarily be different. Because there will be fewer users, roles will be more concentrated. You may have the SYSTEM user active or users with either the SAP_ALL or SAP_NEW profiles. As these are not live systems, these users won’t be able to abuse their powers, like entering an invoice and then writing a check for that invoice. However, depending on what the data is, you may have privacy/regulatory concerns on your hands. Testing should follow best practices. Your QA team should be able to build automated and manual testing that covers the three major test cases:

• Unit tests. Individual test cases that validate whether a single component works correctly.
• Integration tests. Make sure various system components work with each other.

• Functional tests. Run through real-life scenarios from start to finish to make sure that end users can work in ways that make sense to them.

In addition to these, you may consider asking actual users to perform actual tasks in the system to verify that everything works correctly when a real person runs the system through its paces. These steps should apply to all changes, not just patches and upgrades. When you make significant edits to your SAP system—things like adding roles or schemas—you should test those as well. The last step in any testing plan involves how QA certifies something as ready to go. You’ll need to define what passing looks like, as will your provider. Set up specific milestones that a test needs to pass—the unit, integration, and functional tests. On the other side of the coin, you’ll need a plan for how to handle a failed test.
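To make the three levels concrete, here is what they might look like for a hypothetical invoice module. The functions are stand-ins invented for this example, not real SAP APIs.

```python
# Hypothetical invoice module used to illustrate the three test levels;
# none of these functions are real SAP APIs.
def net_amount(gross, tax_rate):
    return round(gross / (1 + tax_rate), 2)

def post_invoice(ledger, gross, tax_rate):
    ledger.append({"net": net_amount(gross, tax_rate), "gross": gross})
    return ledger[-1]

# Unit test: one component, in isolation.
def test_net_amount():
    assert net_amount(119.0, 0.19) == 100.0

# Integration test: components working together.
def test_post_invoice_uses_net_amount():
    ledger = []
    entry = post_invoice(ledger, 119.0, 0.19)
    assert entry["net"] == 100.0 and len(ledger) == 1

# Functional test: a start-to-finish scenario, the way an end user
# would actually run the system.
def test_month_end_close_scenario():
    ledger = []
    for gross in (119.0, 238.0):
        post_invoice(ledger, gross, 0.19)
    assert sum(e["net"] for e in ledger) == 300.0

test_net_amount()
test_post_invoice_uses_net_amount()
test_month_end_close_scenario()
```

Each level catches a different class of failure: the unit test pins down arithmetic, the integration test catches wiring mistakes between components, and the functional test catches scenarios that only break end to end.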
Implement

When you’re satisfied with the tests you’ve run, it’s time to move the change over to the live servers. But implementing a change isn’t as simple as flipping a switch. Live servers could be in use at nearly any time, especially if your organization has a global presence. Depending on your service agreements, who’s responsible for implementing patches can differ. At IBM, we have three levels of patch responsibility, selectable per VM:

• Automatic. If we identify a patch, we schedule and apply it automatically. We schedule everything, and the customer doesn’t worry about it. If anything goes wrong, that’s on us.

• Manual. We let our customers know what patches are available. If they want to apply one of them, they submit a service request with the date to apply it. If that goes wrong, again, that’s our responsibility.

• Customer-initiated. The customer takes all responsibility for patches. If anything goes wrong, that’s their responsibility. We suspend our service level agreements (SLAs) in this case.
In any of these cases, if you as the customer delay, defer, or reject a patch beyond the date we recommend applying it, you accept all risks and responsibilities around that action. Your SLA will probably be suspended if that unapplied patch causes an outage. Any provider that you find will likely offer some set of these options. By default, we put all customers on the automatic patch plan. It’s the easiest option for customers and ensures that any security flaws get closed, which protects both their data and our servers. But this may not work for you. If you have lots of customizations or software that your provider doesn’t support, you may need to apply some of your own patches. Whether you apply patches or your provider does it for you, that change needs to be scheduled. You can’t just apply the patch whenever you want. Servers may need to restart or otherwise come offline in order to apply changes. You’ll need to schedule the patch in a maintenance window, when traffic is low and you can take servers offline. We require an eight-hour window for our maintenance and find that Saturday nights are the best time to do it. As the customer, you should be able to decide when the maintenance window occurs. Maybe you have especially high traffic on weekends, as a restaurant or hotel business would; your slow days will come at another point in the week. You should be able to gather this data based on internal SAP system usage. At IBM, we use a four-week rotating schedule based on the SDLC servers a customer has, as seen in Figure 8.2:

Week 1: Development
Week 2: Test
Week 3: Production 1
Week 4: Production 2

Figure 8.2 IBM’s four-week patch rollout schedule
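A rotating schedule like this is easy to compute: take the number of whole weeks since the cycle started, modulo the number of systems. The four-system order below follows Figure 8.2; the cycle start date is an arbitrary example, not an IBM value.

```python
# Sketch of a four-week rotating patch schedule in the style of
# Figure 8.2. The cycle start date is an arbitrary example.
from datetime import date

ROTATION = ["Development", "Test", "Production 1", "Production 2"]
CYCLE_START = date(2016, 1, 4)  # a Monday; example only

def system_patched_in_week(when):
    """Return which system gets patched in the week containing `when`."""
    weeks_elapsed = (when - CYCLE_START).days // 7
    return ROTATION[weeks_elapsed % len(ROTATION)]
```

Because the modulus is the list length, adding or removing a system from the rotation automatically stretches or shrinks the cycle.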
Some providers offer rolling upgrades, where servers do not go offline. Instead, the data and servers shift to duplicate systems while the upgrade occurs. The upgrades roll through a data center, server by server. When the upgrade finishes, a client’s VMs shift to the now-upgraded system. However, rolling upgrades can pose some problems. Because of the extra strain on the data center’s servers, clients may see some amount of performance degradation. Rolling upgrades only work when the upgrades maintain backwards compatibility, which is not always the case. This can be especially problematic on systems that maintain persistent data structures, as SAP does. Upgrades may change the way that data is accessed, leaving the new and old versions unable to talk to each other. If you have any sort of provider-implemented change management, tell your provider when you change something. If you think change is hard, wait until you see how hard unplanned changes are. If you implement a change on your VMs without telling your provider, you’ll throw off their testing environment. When they test a patch, it won’t have the same dependencies and environment, so issues could develop only when it hits the live server.
Conclusion

Managing change can be difficult, but not utilizing well-established change management processes poses greater risks. To make sure that your cloud environment changes in a way that minimizes disruptions to your cloud-based SAP system, you need a process that manages it. Researchers discover security flaws in software all the time, so you need to know when a system must be patched. Those patches can sometimes contain their own problems, so before you go live with them, you need to test them and make sure they work in your software ecosystem. Finally, you need to bring those changes into your production system in a way that minimizes their impact on your workflows. This chapter was all about managing software change. In the next chapter, we’ll cover what your provider can do about hardware changes; specifically, what it can do when storage media no longer holds a client’s data or is no longer in use. The ones and zeros inside a hard drive can be remarkably persistent, so your provider should do something to prevent access to the data that once resided on a drive after it outlives its usefulness.
CHAPTER 9

Data Destruction
Data protection is the most common reason companies use the bulk of the controls we’ve talked about thus far, including network security, virtualization protections, and encryption. But as we’ve mentioned before, that data lives on a physical device somewhere. What happens when you stop being a customer at a provider or its hardware fails? Does your data still live there? The short answer is yes. Note: At IBM, we treat all customer data as confidential. Your data holds such great value to your business that your provider should be as careful with it as possible, at every step of the way, including after you stop being its customer.
Your provider should have policies and procedures in place to protect your data from being stolen from deprovisioned or broken hardware. It’s a problem that is too often overlooked. In 2009, British researchers purchased 300 hard disks on eBay and at other auctions. Besides a pile of employee and non-employee personal information, they found plans detailing highly sensitive missile defense systems, including test plans and facility blueprints. Not something you want any regular Joes to get their hands on. The same goes for cloud-based storage. You don’t want the next virtual machine (VM) assigned to your old storage space to be able to read any bit of your stored data when you leave. To operate at an efficient scale that provides the value that you need, cloud providers have to reuse their old equipment and systems. But how can they reuse that equipment without exposing previous customer data? That previous data has to be destroyed to truly secure it once you no longer use the physical media that contained it. There are three levels
of data destruction, each an escalation of the previous step. In order of severity, these levels are disk erasure/overwrite, degaussing, and media destruction. In the remainder of this chapter, we’ll talk about each of these methods, what they entail, and how your provider should escalate from one level to the next. Data destruction processes are table stakes in cloud computing. We’ve found that most customers assume that this process is done for them and they don’t have to worry about it. That’s a mistake. If they do ask, they only ask if we have a process. But we think it’s important that you know the full security life cycle of your data, from write to delete.
Overwriting

When a customer leaves a cloud provider, the customer becomes responsible for its data. In all likelihood, it will take possession of the data—download it—and store it somewhere else. The cloud provider, meanwhile, needs to reuse the computing resources that the client vacated. But it still has a copy of that data on its storage devices. It can’t just keep it there; that could expose the data to new clients. The first step in destroying data is overwriting the data on the storage medium. Just deleting all the files won’t work; all that does is remove the pointer to that data. The data still exists and can be found using software tools. Overwriting data replaces the actual data on the disk with random data. Standard hard disk drives store data in a series of ones and zeroes on a literal disk, reading and writing areas on this spinning disk with a mechanical arm. In the hard drive’s case, one and zero translate to magnetized or not magnetized. Any given disk has billions of little areas that can be magnetically charged to create the information that a computer uses to store and read your data. To guarantee that the old data can’t be read, a provider needs to randomly set the magnetization for each and every one of these little areas. Think of the cloud provider as a landlord—it needs to wipe out any traces of the previous tenant. So besides a thorough clean, the landlord could paint the walls, rip up the carpet, and replace other things that show evidence of the previous tenant. The landlord won’t change everything; that’s not economical. It will keep most of the existing infrastructure. But it will remove the dirt, paint, and nail holes—the information—left by the previous tenant.
Besides providing a clean space for the next tenant, there may be regulatory reasons that data must be wiped out. The Health Insurance Portability and Accountability Act (HIPAA) mandates formal policies around media reuse to safeguard personal health information. Other regulations may not explicitly mandate media reuse policies, but require that sensitive data be secured. Regardless of regulations, if you have any sort of sensitive data, you should consider how the provider performs this function. In IBM Cloud Managed Services, we use U.S. Department of Defense-compliant overwrite methodology to erase and reuse disks. While this is an informal standard, by virtue of it being one of the first high-level documents to suggest a specific procedure for data erasure, it became something of an industry standard. The concepts in it have since been fleshed out in the NIST Special Publication 800-88, Guidelines for Media Sanitization. The original methodology used three passes: write zeroes in all areas, write ones in all areas, and finally write randomized data to all areas. However, research in the past 10 years has found that a single overwrite pass, using randomized data, is enough to erase the previous data. However, this overwrite pass needs to affect all disk areas, including unassigned and defective areas. Solid state drives (SSDs) work a little differently. They don’t store data in little magnetized areas. Instead, they capture electrons in gates, then amass those gates in memory pages, and those pages in blocks, and finally store those blocks on several separate memory chips. Solid state drives are faster because there are no moving parts. But while they can read and write at the page level, they can only delete at the block level. On top of that, reading and writing slightly damages SSDs on a physical level, so the drives are constantly moving pages around to ensure even wear on all blocks, a process called wear leveling. 
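The single randomized pass described above can be sketched in a few lines. For safety this example overwrites an ordinary file; a real sanitization tool works on the raw device, covers unassigned and defective areas, and verifies the write afterward. Use vetted tooling, not this sketch, on real media.

```python
# Sketch of a single-pass random overwrite, in the spirit of a
# NIST SP 800-88 "clear". It targets an ordinary file for safety;
# real tools target the raw device and verify full coverage.
import os

def overwrite_with_random(path, chunk_size=1024 * 1024):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        written = 0
        while written < size:
            # os.urandom supplies the randomized data for this pass
            chunk = os.urandom(min(chunk_size, size - written))
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the random data out to the medium
```

Note that on SSDs this approach is not sufficient on its own, for the wear-leveling reasons described above: the drive may satisfy the writes from fresh blocks while the old data lingers elsewhere.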
Unlike magnetic hard drives, where a provider can overwrite any given area at will, SSDs can’t be overwritten in place. Instead, SSDs have to apply a voltage spike to a given block to reset it to the original erased state. To perform a secure erasure on SSD storage, your provider needs software to apply this voltage spike to all blocks at the same time so the wear leveling algorithm doesn’t move data to a freshly erased block. Most manufacturers supply this software, but protect it from general use, as it irrecoverably destroys all data on the SSD media. A simpler method may be to encrypt the entire drive, then destroy the key.
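The crypto-erase idea can be illustrated with a toy keystream cipher. This is emphatically not production cryptography (self-encrypting drives use hardware AES); it only shows why destroying the key makes the bits left on the media unrecoverable.

```python
# Toy illustration of crypto-erase: encrypt everything under one key,
# then destroy the key. The keystream cipher below is NOT production
# cryptography -- real drives use hardware AES -- it only demonstrates
# that without the key, the stored bytes are just noise.
import hashlib, os

def keystream_xor(key, data):
    """XOR data with a keystream derived from the key (toy cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

key = os.urandom(32)  # held separately from the storage media
stored = keystream_xor(key, b"customer ledger data")

# Normal operation: the key recovers the plaintext.
assert keystream_xor(key, stored) == b"customer ledger data"

# Crypto-erase: destroy the key; the bytes on the media stay put,
# but nothing can turn them back into the plaintext.
key = None
```

The attraction for providers is speed: destroying a 32-byte key is instant, whereas overwriting terabytes of flash is not.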
As with most computing concepts, clean disk erasure becomes more complicated with virtualization, especially when multiple VMs share space on the same storage medium. The hypervisor manages all storage mapping on the physical drive, so any erasure software needs to be installed and have access to the hypervisor level and below. With this, it can locate and, on magnetic hard disks, overwrite the ones and zeros. On SSD storage, it can prepare and target specific blocks for reset. When a client is deprovisioned, your provider should immediately isolate that client’s existing computing resources. Once the storage has been secured, it can perform a disk erasure on that portion of the storage media. Finally, before reprovisioning those sectors, the provider should wipe the media again. These disk erasures guarantee all previous data is unreadable and unrecoverable. The physical media can be reused for the next customer without exposing old data. This method generally works for storage media that the provider no longer has a use for and wants to donate or resell. However, if disk overwrites/erasures fail—and they do sometimes—there are stronger measures.
Degaussing

If a magnetic hard drive cannot be overwritten due to a logical error in the data, or as a condition of policy for things such as a sale or donation, the provider’s next step may be to degauss the drive. In simple terms, to degauss a hard drive, you run a strong enough magnet over it so the stored ones and zeros are scrambled. Depending on the drive design, the provider may be able to reuse the drive afterwards. Some drives, however, store servo data that controls how the drive positions its heads to locate data. Degaussing destroys this data as well, rendering the drive inoperable. If a provider uses degaussing, generally it will outsource that function. You can’t just break open an old stereo speaker and rub its magnet over your drive; this process needs stronger magnets, like those made from neodymium. As part of the outsourced vendor’s security services, it should provide full chain-of-custody documentation, so while the data is still on the disk and in transit, your provider need not worry about someone copying the drive.
SSD storage, as we discussed above, does not use magnetic charges to store data, so degaussing does not work. The only absolutely certain way to destroy the data on a broken SSD device is to physically destroy it. Degaussing often occurs before selling, donating, or physically destroying the media. While physical destruction usually prevents any remaining data from being accessed, degaussing the drive first ensures that any recovery methods will fail. It may be possible to recover data from shredded hard drives using microscopes, so if you plan on putting sensitive data in your cloud environment, you may want to insist on degaussing before shredding for any media at the end of its life.
Destroying equipment

As we know, computer equipment doesn’t last forever. It either outlives its usefulness, becomes outdated, or breaks after the warranty expires. Hard drives and SSD media can still hold data even if they don’t work within a computer. Sometimes software can recover it. Other times, it’s a broken head controller or shorted diode that needs to be replaced. In other words, dead hardware can still reveal your secrets. That’s when it’s time to securely destroy the drive. You probably shouldn’t just take a hammer to the drive or run over it with a truck a few times. Plenty of professional companies will shred or crush your hardware for you, ensuring a secure chain of custody the whole way, as shown in Figure 9.1. Your provider, if it processes a lot of old hardware, could shred or crush hardware on site if it has the machinery.
Figure 9.1 The remains of a shredded hard drive
In IBM’s cloud environments, we use a process we call “redrum,” after the famous word from The Shining. After performing a secure overwrite and degaussing on a drive, we send it offsite for destruction (Figure 9.2). Our offsite destroyers punch a hole in the drive so nothing can read it.
Figure 9.2 The view of the underside of a drive after a “redrum”
So why don’t we crush or shred? Simple: if our offsite company shredded or crushed the drive, we would not be able to verify that the drive was actually destroyed. One mangled pile of electronics looks like the next. By maintaining some of the physical integrity of the device while rendering the inner mechanics unusable, we can track the drive by serial number. With SSD storage, though, punching a hole in the middle isn’t enough. You have to shred the device finely enough to destroy each and every memory chip that is part of the drive. Because SSDs store data in individual chips, those chips can be recovered and read. Granted, that is a very difficult process that requires overcoming SSD-native encryption, but it is theoretically possible.
Backup Media

Backup media is a special case. Most backups are stored on digital tapes. Tape stores data magnetically in a linear-access fashion, whereas hard drives and SSDs are direct access. Tapes are cheap, high capacity, and have no inherent code or processing capabilities. They work great for storing data that doesn’t need to be in active use. It’s ephemeral storage, not intended for long-term use, so tape is replaced fairly regularly. Because of this, IBM shreds tape as the first step in data destruction. You could degauss the tape to remove data, but the media is relatively fragile and very inexpensive. Shredding is easier, as the tape is just a thin strip of plastic film. Some providers do reuse tapes in various schemes. The simplest is to have one tape for each day, overwriting as the week repeats. If your provider wants to keep older data, it can keep the daily tape rotation while also including weekly and monthly backups. That way, the worst that happens is the loss of a month’s worth of data. Backup tapes generally stay in use until the end of their useful life. That means when you or another client leaves, your provider doesn’t shred your old data. It can’t; backup tapes are usually not limited to a single customer. In 2014, Sony and IBM were able to record 148 gigabits on a single square inch of tape, which meant that a whole tape cartridge could hold 185 TB. Almost no customer will have that much data to back up. Of course, if all customers associated with the data on a backup have left, sure, that’s going in the shredder.
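A daily/weekly/monthly rotation of the kind described can be sketched as one small scheduling rule. The retention choices below (monthly fulls on the 1st, weekly fulls on Sundays, dailies otherwise) are examples, not a provider’s actual policy.

```python
# Sketch of a daily/weekly/monthly tape rotation. The retention
# choices are examples: daily tapes are overwritten each week,
# weekly tapes each month, and monthly tapes are kept the longest.
from datetime import date

def tape_for(day):
    """Return the label of the tape that receives this day's backup."""
    if day.day == 1:
        return "monthly-%04d-%02d" % (day.year, day.month)
    if day.weekday() == 6:  # Sunday: weekly full backup
        return "weekly-%d" % ((day.day - 1) // 7 + 1)
    return "daily-%s" % day.strftime("%A")
```

Because daily and weekly tapes are reused on a fixed cycle, the scheme matches the worst case described above: losing at most a month of history.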
Conclusion

Your provider needs a plan for how to securely reuse or dispose of old hardware. The data it held can come back to haunt you, so secure controls have to extend throughout the entire life cycle of that media. For storage media, there are three options: overwrite, degauss, or destroy. These all take place as part of a chain of escalation; if one doesn’t do the job, try the next one up the chain, with destruction being the last and most secure option. In the next chapter, we’ll move from discussing the actual controls and methods you and your provider can use to secure your SAP cloud environment to talking about the policies you’ll need to manage those controls. Security isn’t a set-it-and-forget-it sort of thing; it requires constant vigilance to stay up to date with a changing threat landscape.
CHAPTER 10

Information Security Management
Up to this point, the vast majority of this book has covered the physical and virtual controls that help secure your cloud environment: ensuring that a data center has backup generators, blocking ports to prevent unwanted network traffic, destroying old hard disks when they’ve reached the end of the road, that sort of thing. In this chapter, we’ll start talking about the policies and procedures that work hand in hand with your IT, HR, and administrative controls. This chapter will primarily discuss the elements that go into an information security management system (ISMS). This is a framework of policies and procedures established to protect the confidentiality, integrity, and availability of an organization’s data. It covers nearly every part of an organization, from the equipment to the people who use it. No one person or piece of equipment can secure your organization, so it’s important to have the policies and procedures in place to ensure all employees at your organization know how they can maintain good security controls. When we talk about you or your organization in this chapter, we could be referring to either your organization as the cloud services client or your provider’s organization. Many of the controls discussed here as part of a proper ISMS apply to both parties individually. Some of them apply to both of you together.
Information Security Management Systems

Although we will examine risk management processes in more detail in the next chapter, we can’t completely separate an ISMS from risk. The primary goal of any ISMS is to reduce your risk from security threats. You and your cloud services provider should establish and maintain an
ISMS based on both of your business goals and objectives, jointly and individually, in order to minimize your collective risks efficiently. You’ll both have different needs, depending on what your business does, the size and breadth of your organization, and how sensitive the data that you seek to protect is. Your provider will want to maintain its reputation and minimize its costs. Creating and implementing an effective ISMS is a complex task. The complexity comes from applying a risk management process to the organization’s information assets and managing the expectations of the asset owners. These assets, systems, and the organization itself are the target of threats, and protective policies, processes, and people have inherent vulnerabilities. You may have to consider various regulations that further complicate the process. Of course, nothing sits still in cloud computing: these elements can change and interact differently over time, which means your risk levels shift and your ISMS needs to evolve with them. As a result, an ISMS must be updated regularly. An ISMS is only as effective as the controls over processes, organizational structures, software, and hardware functions that it contains. These controls will only be effective if management fully supports both the individual controls and the ISMS as a whole. An ISMS generally defines controls based on the category of risks that they mitigate. These could include:

• Physical and environmental security
• Access control
• Change management
• Incident management
• Vulnerability management
• Communications or network security
• Encryption
• Asset management
• System maintenance and development
• Human resource security
• Compliance
• Third-party vendor relationships
• Mobile security
• Disaster recovery and business continuity management
• Security organization
• Security policy management

How important each of these categories is depends on your organization and its business needs. No organization is the same; if they were, we could all use the same ISMS and not worry about creating our own. Not coincidentally, many of the control categories above correspond to chapters you’ve read earlier in this book. For the remainder of this chapter, we’ll talk about the controls that could be part of your ISMS and how they can be employed based on an organization’s risk level and risk tolerance. More information on these controls, along with example implementations, can be obtained through the International Organization for Standardization (ISO) publications related to the ISO 27001 standard, available for purchase at http://www.iso.org.
Physical and Environmental Security

A solid ISMS builds on a foundational triad—people, process, and technology. The first line of defense for your technology is physical security. Around your information technology assets, your provider should define a perimeter to protect and control areas that contain information and information processing facilities. Anyone who wants in through these perimeters must be authorized. Even then, the provider should add measures that enhance physical security for offices, rooms, and facilities that contain more sensitive information or technology assets. All access points need to be secured. Legitimate entry points, such as delivery and loading areas, maintenance access points, or front entryways, should have proper controls. Other potential entrances, such as emergency exits, need to be secured as well. The facilities should have measures that isolate information processing facilities in the data center from any persons entering, either authorized or unauthorized. Multiple perimeters help guard against malicious entries. Your provider should choose data center locations that minimize threats from natural disasters, such as floods and earthquakes. While location can’t guarantee against accidents or malicious attacks, your provider can implement measures
to protect equipment and reduce the risks from fire or power failures and other disruptions caused by failures in supporting utilities. Network and power cables that feed the data center need to be protected from interception, interference, or damage. Equipment needs to be maintained to ensure that it functions reliably. While there may be legitimate reasons to take assets or information offsite, a provider should establish processes and procedures so that assets or information don’t leave the building without prior authorization. Maintain agreements with vendors, partners, and suppliers who may be authorized to take assets offsite to ensure security and take into account the different risks of working outside the premises. The measures specified in these agreements must be monitored for compliance. Before deprovisioning a customer or removing media from use, all storage media must have sensitive data and licensed software cleared using secure data destruction procedures. It’s not enough to just wipe the drive and move on; your provider needs policies that verify the secure reuse or disposal of media. In your own premises, you may want to implement clean desk and clear screen policies for all facilities, offices, and equipment. Any information left visible in a facility makes it easier for unauthorized entrants to steal information. Controls in this area correspond to the issues and controls we talked about in Chapter 3. Responsibility for these protections will fall mostly on your provider, though you should absolutely create policies that protect your own facilities.
Access Control

Besides protecting the physical equipment behind your technology, you need to protect access to it, whether physical or virtual. While people are another of the three foundations of a good ISMS, they can also be a liability. Security measures to control access to information resources, devices, and networks begin with the establishment of a formally documented access control policy. Business and information security requirements will change, so you and your provider need to regularly review these policies to reflect that.
The policy should be based on the principle of least access—providing users with the minimum set of privileges needed to perform their job duties. Asset owners and management must authorize and review users’ access rights to assets at regular intervals to make sure they still need that access. This process of assigning, changing, and removing access should be formally documented for all users, systems, and services. Access is a sensitive business, and those who grant and use these privileges must be subject to some security controls. This could include extra approvals along with logging and monitoring of when individuals use access rights. When employees no longer need access to a resource, whether through termination or transfer, your organization needs a formal deprovisioning or access removal process. As many access systems rely on passwords, your organization needs to govern those passwords with a documented process that ensures all passwords are strong and regularly changed. The access control policy can require secure log-on procedures, such as two-factor or biometric authentication mechanisms, to protect more sensitive data and systems. Your security policy may need to cover software or utilities that might be capable of overriding access controls, and to limit developer access to program source code. These programs sidestep the access controls your policy covers, and as such they pose an additional security risk. These policies correspond to what we discussed in some of Chapter 3 and most of Chapter 7. User access applies to both how you physically allow people to enter the building and who gets to use the SAP system. Determining access to your SAP system will be almost entirely your responsibility, while access to facilities, whether the data centers or your premises, falls on both you and your provider.
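A periodic access review can be as simple as flagging every grant that has gone unused past a threshold. The record layout and the 90-day window below are illustrative choices, not a prescribed format.

```python
# Sketch of a least-access review: flag grants that haven't been
# exercised within a review window. The record layout and the
# 90-day default are examples only.
from datetime import date, timedelta

def stale_grants(grants, today, max_idle_days=90):
    """Return grants whose privilege hasn't been used recently (or ever)."""
    cutoff = today - timedelta(days=max_idle_days)
    return [g for g in grants
            if g["last_used"] is None or g["last_used"] < cutoff]

grants = [
    {"user": "jdoe",   "privilege": "invoice_post", "last_used": date(2016, 11, 20)},
    {"user": "asmith", "privilege": "role_admin",   "last_used": date(2016, 1, 3)},
    {"user": "temp01", "privilege": "table_export", "last_used": None},
]
flagged = stale_grants(grants, today=date(2016, 12, 1))
# asmith and temp01 come back for review; jdoe's grant is still in use
```

Anything the review flags goes to the asset owner for a keep-or-revoke decision, which is exactly the regular-interval review the policy calls for.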
Change Management

Change happens regularly in technology-centered environments, but that change needs to be part of your ISMS. Your provider should have a plan to monitor how much of its overall resources you and other clients consume over time, and a process for expanding to meet your performance requirements. Based on the results of this monitoring, it should be able to project what future capacity it needs to add in order to meet the needs of its clients and prevent costly outages.
CHAPTER 10
Information Security Management | 10
Your security policies regarding backups should clearly specify their content and frequency. You'll need to specify what a cloud service provider backs up as part of its default service and what you need to purchase or provide in addition to that. Data on backup media should be protected against unauthorized access, misuse, or corruption at all times. The division of responsibility for security in transit and storage should be specified in service contracts. If you use a third party to transport backup media, hire only bonded and insured carriers, and consider auditing their transport procedures to ensure that they comply with your security policies.

Before you execute changes to your production servers, back up your data, software, and system, and test the backups to ensure they contain what you want to save. Any changes, whether through upgrades or normal activity, should be logged. Typically, enterprises use a form-based Incident, Problem, and Change Management System (IPCMS) to track the systems, targets, versions, contingencies, and approvals of change events.

Additionally, event logs that record user activities and security events must be kept for a minimum time and reviewed at a specified frequency and level of detail to ensure adequate control over the system. Logs themselves need to be protected against tampering and unauthorized access; otherwise, attackers could cover their tracks by changing the log to remove evidence of their breach. In an investigation, the timeline that you create out of multiple logs would be inaccurate if the clocks of all relevant information systems weren't synchronized.

The more you segregate the duties of the IT departments that maintain your cloud environment, the less you risk unauthorized changes in your environment. Even authorized changes can cause problems in complex systems, so ensure that you have proper testing procedures in place.
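One common way to make logs tamper-evident, sketched below with Python's standard library, is to chain a keyed hash (HMAC) through the entries so that altering any earlier entry invalidates every later authenticator. The key, entries, and key-handling comments are illustrative assumptions, not a prescription.

```python
import hmac
import hashlib

def chain_logs(entries, key):
    """Compute a tamper-evident HMAC chain: each entry's MAC also covers
    the previous MAC, so editing any entry breaks all MACs after it."""
    macs, prev = [], b""
    for entry in entries:
        mac = hmac.new(key, prev + entry.encode(), hashlib.sha256).hexdigest()
        macs.append(mac)
        prev = mac.encode()
    return macs

def verify_chain(entries, macs, key):
    """Recompute the chain and compare it with the stored MACs."""
    return macs == chain_logs(entries, key)

# In practice the key would live on a separate, tightly controlled log host.
KEY = b"example-log-signing-key"
entries = [
    "2016-12-01T10:00Z login user=jsmith",
    "2016-12-01T10:05Z role_change user=mlee",
]
macs = chain_logs(entries, KEY)
```

Because verification recomputes the whole chain, an attacker who can edit log files but does not hold the key cannot rewrite history without detection.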
The environments in which you develop new changes, the systems where you test them, and your live SAP servers should be logically separated on your network to help manage your change process. We covered change management in depth in Chapter 8. Your provider may offer some amount of change management as part of its default offering, or some level of integration with your enterprise systems. Even if it does, you should internally determine what your security policies are around changes, then audit your provider to see how it stacks up.
Incident Management

When security incidents happen, you should have information security policies in place that make your response effective and orderly. When something goes wrong, your people can panic. With a documented security incident process, everyone in your organization will know who should report information security events, and how and when to report them through management channels. The personnel who use information systems and services must understand how to recognize and report any observed or suspected information security weaknesses in systems or services. The more eyes you have on potential problems, the better. At the same time, your policy needs to specify how you assess those weaknesses to determine whether they are actually information security issues and, potentially, future incidents.

Many of our customers ask us what our time frame is to report security incidents when we find them. That notification process has to be documented as part of your security policies. The triage and classification system described above helps weed out false alarms and less critical events, which reduces incident clutter and maintains a quick response time. In the event of a major incident, policies and procedures help minimize the impact. To facilitate a quick response, you and your provider should establish the appropriate contacts at the relevant authorities before incidents occur. Maintain contacts from specialist forums and professional associations so you can rely on them in an incident if you need outside expertise or objective analysis.

After an incident occurs, you and your provider should analyze your process so that you can gain knowledge from the resolution. That knowledge can be constructively applied to reduce the likelihood or impact of future incidents, drive changes in systems or processes to enhance efficiency, and increase future security.

Your ISMS should specify how this knowledge is to be preserved and what the processes for identifying, collecting, and acquiring incident information look like. Detailed information like this is more valuable as forensic evidence. Because incident management operates both during and after a breach or other security incident, we haven't covered it in our discussion of controls. It's an organizational control defined by policy and is purely managerial.
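A triage step like the one discussed in this section can be reduced to explicit rules so that every reporter applies the same logic. The function below is a deliberately simplified sketch; the field names and classification tiers are invented for illustration, and any real policy would be far richer.

```python
def triage(event):
    """Classify a reported security event with simple illustrative rules.

    'investigate' - suspected weakness, assess before escalating
    'critical'    - confirmed incident touching production or exposed data
    'standard'    - confirmed but routine, handled by the normal process
    """
    if not event.get("confirmed"):
        return "investigate"
    if event.get("data_exposed") or event.get("system") == "production":
        return "critical"
    return "standard"
```

Encoding the rules this way also documents them: auditors and new staff can read exactly which conditions trigger escalation and external notification.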
Vulnerability Management

When we talk about vulnerability management from the perspective of an ISMS, we're mostly talking about patch management and anti-malware controls. You should have a process to constantly gather information about the known vulnerabilities in the software and hardware that you use. You can contract with third-party vulnerability scanning and penetration testing vendors—discussed more in Chapter 12—to locate the existing vulnerabilities in your system. With constant vigilance, you can minimize the time between an exploit's entry into the wild and when you've closed the vulnerability it takes advantage of. Zero-day exploits are almost impossible to stop, and it takes great skill to discover them. Week-old exploits are available to nearly everyone, but are preventable.

Once you know about your vulnerabilities, you'll need a process that defines how to evaluate the severity of each vulnerability, then how to determine what you need to do to close it. The countermeasures that you take should be based on an understanding of the risk that the vulnerability poses and should be implemented within a timespan commensurate with that risk. Your implementation process needs to take into account system uptime considerations; it's neither feasible nor desirable to interrupt production server availability on a nearly daily basis to close vulnerabilities, which are discovered regularly. You're better off grouping vulnerability management measures together during scheduled maintenance times to minimize business disruption. For malware, you'll need to implement a process to detect, prevent, and recover from attacks.
This process should cover:

• Consistent use of agreed-upon malware detection software
• Regular updates to that software at predetermined intervals
• A process to supplement automated updates with manual updates
• Reaction to confirmed malware attacks as part of the security incident management process

Your organization can strengthen these measures by making system users aware of threats on an ongoing basis. Your ISMS (often via policy or training) should prohibit your system users from installing untested or unlicensed software, either on their individual workstations or on cloud environments. This helps prevent new malware from infecting your network and causing downtime. Where approved and necessary, your ISMS should define how users can request and install authorized software.

We covered patch management in Chapter 8 and malware prevention in Chapter 4. Both are integral to preventing malware and other network threats from getting a foothold in your cloud environment and corporate network. Both you and your provider must have policies in place to handle preventative measures (patching) as well as incident response (anti-malware).
Communications or Network Security

The information that flows through your networks, both in your LAN and in the cloud environment, has to be managed and controlled to protect systems and applications. The controls that can do that include firewalls, switches, and routers—virtual, physical, or both. They segregate networks into protected groups of information services, users, and systems. These devices must be configured appropriately according to the device management requirements in your security policies. For those devices in your cloud environment, the protections and services that the provider offers should be defined within your network services agreements. Your ISMS should also define the protections in place around any connections your facilities have with the cloud environment and other third-party networks.

Once your data leaves customer or cloud service provider environments, it can be subjected to more potential mischief during transit. If you plan on transmitting data over public networks, that data should be encrypted with network security protocols to reduce the risks of unauthorized disclosure and modification. These protections can help prevent incomplete transmissions, misrouting, diversion of information to unintended recipients, unauthorized information alteration, unauthorized information duplication, and replay attacks.

Besides network communications, your formal policies, procedures, and controls should define protections on the transfer of information in all types of communication, including between individuals in the same location. Imagine talking with a co-worker over lunch at a restaurant: you could be overheard by anyone. Any electronic messaging needs to be protected, whether it's email, Short Message Service (SMS), or web-based intranet forums. Anyone who comes in contact with sensitive information that could damage your company should sign confidentiality or non-disclosure agreements to underscore the seriousness of their responsibility. On a regular basis, your organization should review and update these agreements to reflect changes in legislation or communication methods, such as social media.

We covered network security controls in Chapter 4, as well as in-transit encryption such as Transport Layer Security (TLS) in Chapter 6. The controls involved in these areas are everyone's responsibility, and they include your internal LAN. Network security covers a huge domain, so work with your provider and your internal experts to determine the best policies and procedures to install in your organization.
Encryption

Encryption protects information at rest or in transit from being observed, modified, or exposed. A strong ISMS needs a policy that addresses what cryptographic controls your organization will develop and implement. This policy should address the type of encryption to use, when to use it, and what sort of information should be encrypted. Your organization should develop these policies based on how sensitive the data is and what exposure risks it faces. Encryption can consume additional financial and computing resources, so your policy should address the organizational checks that can and should limit how you implement encryption.

Along with the type and strength of encryption, the policy must speak to how to use, store, and protect cryptographic keys throughout their life cycle. If keys are not continually available, encrypted data could be lost. Attackers who gain control of your keys could hold your data for ransom, a scenario that is becoming more prevalent. Finally, the policy should require that cryptographic controls are used in accordance with relevant contractual and international agreements, legislation, and regulations. Some countries regulate the strength of encryption that can be exported.

Chapter 6 covered encryption controls in depth. While SAP has a lot of fairly strong encryption available within its products, you may want to consider additional policy guidelines above and beyond what SAP offers. Ensure that your policies cover all sensitive data, including data stored in backups with your cloud provider.
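Key life-cycle management begins with simple bookkeeping: when a key was created and when policy says it must be rotated. The class below is a minimal illustration of that bookkeeping only; the 365-day lifetime is an assumed policy value, and real deployments would keep key material in an HSM or a key management service rather than in application memory.

```python
import secrets
from datetime import date, timedelta

class ManagedKey:
    """Minimal sketch of key life-cycle bookkeeping: creation date plus a
    rotation policy. Key storage and distribution are out of scope here."""

    def __init__(self, created, lifetime_days=365):
        self.key = secrets.token_bytes(32)   # 256 bits of key material
        self.created = created
        self.lifetime = timedelta(days=lifetime_days)

    def rotation_due(self, today):
        """True once the key has been in service longer than its lifetime."""
        return today - self.created >= self.lifetime

old_key = ManagedKey(created=date(2015, 6, 1))
fresh_key = ManagedKey(created=date(2016, 11, 1))
```

Tracking age explicitly lets an automated job flag overdue keys, which supports both the availability concern (no data stranded under a lost key) and the rotation requirements common in compliance regimes.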
Asset Management

The flexibility with which systems are created, purchased, deleted, and destroyed with regular frequency is one of the biggest advantages of the cloud. However, this flexibility poses a challenge to asset managers. To fully understand what you need to protect, your organization needs to take a full inventory of all assets associated with information and information processing. Because those assets change regularly in a cloud environment, your ISMS needs to specify how you'll maintain an accurate inventory. Your organization needs a combination of near real-time scanning and clear communication with your cloud service provider to understand which systems are available and which are changing.

You can make the job of identifying assets simpler by assigning clear owners to each, either by individual or by job role. Your policies should specify how that responsibility is maintained and how asset changes are reported. And when asset owners—employees or external users—terminate their employment, contract, or agreement, the ISMS should have provisions that require them to return all assets in their possession and describe how asset ownership transfers.

Just as all incidents are not equal in severity, all data is not of equal value to you as a cloud customer. Your organization will need to classify information in terms of legal requirements, criticality, and sensitivity to unauthorized disclosure or change. This classification process will be your responsibility, not your provider's, as you and your organization know your data best. Then, you can create and implement a standard classification labeling scheme so service providers and users alike can take appropriate measures to protect the data corresponding to its sensitivity.

Before you hire a cloud service provider, you should ascertain whether the provider has a process to dispose of removable media, software licenses, and hard disks securely when they are no longer needed. As we discussed in Chapter 9, sensitive information can be recovered from old hard drives, sometimes even after they become unusable. The process should be documented and may be based on government or industry benchmarks and guidelines.

Asset management, though not touched on specifically in any chapter, is a thread that runs through them all. Your information assets are the reason that you need security controls in your cloud environment, so any discussion of controls builds on an understanding of the assets you're protecting.
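An inventory that records each asset's owner and classification label can drive other controls automatically. The sketch below is illustrative only: the four-tier labeling scheme, asset names, and "encrypt the top two tiers" rule are hypothetical policy choices standing in for whatever scheme your organization agrees on.

```python
from dataclasses import dataclass

# A hypothetical four-tier classification scheme, least to most sensitive.
CLASSIFICATIONS = ("public", "internal", "confidential", "restricted")

@dataclass
class Asset:
    name: str
    owner: str            # accountable individual or job role
    classification: str   # label from the agreed scheme

    def __post_init__(self):
        # Reject labels outside the scheme so the inventory stays consistent.
        if self.classification not in CLASSIFICATIONS:
            raise ValueError("unknown classification label: " + self.classification)

inventory = [
    Asset("sap-prod-db", "DBA team lead", "restricted"),
    Asset("marketing-site", "web admin", "public"),
]

def assets_needing_encryption(assets):
    """Illustrative policy: the two most sensitive tiers must be encrypted."""
    return [a.name for a in assets
            if a.classification in ("confidential", "restricted")]
```

Because every record names an owner, the review and ownership-transfer provisions discussed above have a concrete person or role to attach to.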
System Maintenance and Development

Our fellow security practitioners like to say that the best security is "baked in" rather than "bolted on" after a system is developed. That means that security controls are developed as part of the system, instead of being treated as a supplement to the system after it is developed. To require software developers to consider security during, rather than after, the development process, you'll need to establish rules for software and system development that include it as part of the process. If the software and systems are developed securely from the beginning, you can limit modifications to software packages to only necessary changes. That way, you can apply your change management process hand in hand with secure development.

Any new software, software upgrades, and new software versions must go through acceptance testing programs that include testing of security features and functions. The test data must be created or selected to comply with your information security policies to protect against violations of data privacy regulations, which can arise if you use live production data. The methods that you use to anonymize or randomize production data for development and test use must themselves be tested to verify that they do, in fact, mask true information. If system enhancements require operating system changes, your organization needs to assess the impact of those changes on business-critical applications through thorough testing procedures. Whether this testing is the responsibility of customers, cloud service providers, or both is usually spelled out in the service agreement between the two.

The environment in which you develop software and systems demands security protections just as much as the development process itself. Development, test, and staging systems still need access controls, well-documented procedures for code changes, and testing as part of the system development life cycle.
We touched on some system maintenance and development concerns in Chapter 8. This area works hand in hand with a good change management policy, but it deserves some thought as to how you create a development process that enhances security intrinsically, instead of as an afterthought.
Human Resource Security

We talked about the foundational triad of security earlier in this chapter: people, process, and technology. Within any ISMS, ensuring that your people enhance the security management process starts with background verification checks on all candidates for employment. The laws and regulations in each employment location, along with ethical considerations, may limit the scope of these checks, but they are key to making sure your people work well with your security policies. Any background checks must be proportional to your business requirements and perceived risks.

Once candidates are hired, an organization's ISMS should establish and enforce contractual agreements with employees and contractors that detail everyone's responsibilities for information security. Your policies have no effect if the people behind them don't enforce them. All employees and contractors must apply information security in accordance with the established policies and procedures of the organization as specified by these agreements.

Your organization should maintain a security awareness education program that trains all employees on the responsibilities relevant to their job function. It's not enough just to provide this training as part of a new employee's onboarding program; every employee should regularly go through a refresher course. If an employee is found responsible for a security breach due to negligence in implementing security measures, your ISMS needs to lay out a formal disciplinary process for taking action against that person. The disciplinary process needs to be communicated as part of the security awareness education to encourage employees to take security responsibilities seriously at all times.

These contractual agreements should also outline information security responsibilities and duties that remain valid after employees leave the company. This includes non-disclosure agreements, non-compete clauses, and other provisions that prevent data exposure. These responsibilities need to be communicated as part of the security awareness training, as they must be enforced if former employees or contractors breach their stipulations after departure.

While we briefly covered employee onboarding and termination controls in Chapter 3, human resources security falls into more of a policies and procedures area. Regular training sessions are absolutely key to the success of most of the policies described in this chapter. At IBM, we have security training refreshers yearly, as well as targeted information security training for roles throughout the year.
Compliance

When we talk about compliance in terms of information security management, we mean both internal and external compliance. Your security policies need to consider how your organization will comply with legal and regulatory requirements as well as with internal security policies and procedures.

A cloud service provider must clearly identify and document evidence of compliance with all relevant statutory and regulatory requirements, both for its own records and for its customers. This covers audit reports based on globally accepted criteria, such as Service Organization Controls (SOC) 1 and 2 audits and ISO certifications. The compliance records, certifications, and audit reports must be kept up to date for the organization and each service offering to which they apply. Any cloud provider should explicitly identify which elements of its service have been tested and found to be compliant.

You should expect all providers to have procedures to ensure compliance with legal and regulatory standards. These include protecting the privacy of personal information, respecting intellectual property rights, and licensing and using proprietary and open source software. Not all these areas may have associated certifications, but your provider should still be able to assure you that it complies with all relevant regulations and policies. You should review your cloud service provider's approach to managing and implementing information security compliance at planned intervals or when significant changes occur. Managers should be able to review compliance of the information systems within their area of responsibility on a regular basis.

Compliance is a pure policy control, so we haven't talked much about it in the previous chapters. That doesn't make it less important; in fact, compliance controls can help ensure that all the other security controls do their job and fit within the regulatory and policy landscape that affects your business.
Third-Party Vendor Relationships

Your information security policies should cover how to mitigate the risks associated with a supplier's access to customer or cloud service provider assets. In the normal course of business, your vendors and your cloud provider's vendors may gain access to sensitive data. These suppliers must agree to comply with their business partner's security policies and requirements to protect assets to which they may have access. These agreements should be documented in contracts between the two parties, so any violations of security policy, especially those that cause financial harm, can be properly rectified. Supplier contracts and supplier compliance with contract provisions should be monitored and reviewed periodically.

We haven't yet talked about third-party relationships; that is the subject of Chapter 12. Companies thrive when they can outsource business functions that lie outside of their expertise, so your organization may have several dozen—or more—vendors that have access to sensitive data. Your ISMS needs to manage their place in your security ecosystem.
Mobile Security

In recent years, more organizations have developed and maintained a policy and supporting security measures specifically intended to manage the risks introduced by mobile devices. These policies and procedures can describe:

• The configuration of employee mobile devices
• How these devices can comply with organizational security standards
• The controls to prevent unauthorized mobile devices from accessing an organization's network
• Whether employees can install software for personal use on devices used for work-related tasks
• How to handle the loss of a mobile device or its compromise by unauthorized persons

The controls that we have discussed in most of the other chapters can often apply to mobile devices as well. Mobile devices offer less flexibility, so they can be easier to secure, but they are becoming more popular as targets. As we increase how much work we do on mobile devices, we need to increase our security focus on these devices.
Disaster Recovery and Business Continuity Management

Sometimes no amount of controls can stop a crisis or disaster from affecting your environments. No matter how much risk mitigation your ISMS contains, you may find your access to information assets impacted. While you may not be able to stop a disaster, you can plan for the worst.

Before disaster strikes, your organization must determine its requirements for the continuity and recovery—how daily operations will continue in the event of serious damage—of its businesses, related information and applications, and information security management. Once you determine and understand these requirements, your organization can develop a plan to provide a corresponding level of continuity.

The safety of your personnel has to come first. Equipment can be replaced, but your people are what build your business. Ensure that you have proper emergency procedures and responsibilities in place. You can limit the damage a disaster causes to your equipment by providing redundant processing systems and facilities. You and your cloud service provider should verify the established and implemented business continuity controls at regular intervals to ensure that they are valid and effective.

We talked a little about disaster recovery in Chapter 3. It's not something most people find comfortable, as it plans for existential threats, but in the event that disaster strikes, it can mean the difference between organizational life and death.
Security Organization

How your organization structures its information security department will vary based on the needs of your business and the available resources. Regardless, this department needs to define and assign information security responsibilities to maximize accountability and eliminate unnecessary work. A well-defined security organization segregates duties so that conflicting areas of responsibility become more apparent. Limiting the privileges and overlap of privileges among security personnel reduces opportunities for unauthorized processing or other misuse of assets.
Security Policy Management

Security procedures and processes are best rooted in an information security policy established by a customer or cloud service provider. The overall policy may cover a wide range of individual sub-policies that apply in different circumstances. This set of policies for information security must be clearly defined, approved by management, published, and communicated to employees and relevant external parties. At planned intervals or during significant business condition changes, the policies should be reviewed and updated, if necessary, to maintain their continued adequacy and effectiveness.
Conclusion

Your ISMS should be thorough and complete, covering the procedures and policies around the controls discussed in nearly every other chapter in this book. It's easy to overlook the policy portion of security; that's a mistake. Policy aligns every person in an organization around shared security goals based on job roles and business needs. Coordinated security is stronger security.

In the next chapter, we'll continue our discussion of policy to cover risk management. We gave a basic primer on risk in Chapter 2; in the next chapter, we'll expand those ideas to include how they work in a cloud environment and how you can use an understanding of risk to create a secure cloud-based SAP system.
CHAPTER 11
Risk Management

In today's world, any networked environment faces an overabundance of threats and potential vulnerabilities. To defend against them all perfectly would bankrupt nearly any business or cloud provider. Success and security in this environment demand a comprehensive plan to manage risk.

We define risk as a possible event that could cause harm or otherwise affect one's ability to achieve objectives. And we measure it by three metrics: the probability that a threat will occur, the vulnerability of an asset to that threat, and the impact the threat would have if it succeeded. Risk management is the process of identifying, assessing, and reducing risk to an acceptable level. Once an organization has applied risk management techniques, the leftover, unmitigated risk is what it is willing to accept after all possible and reasonable steps have been taken. Deciding what steps are possible and reasonable based on your business needs and environment is the focus of this chapter.

Some of the information provided here we've already mentioned in Chapter 2 of this book. In this chapter, we will dig a little deeper into that information, as well as introduce some new ideas to consider.
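The three metrics above lend themselves to a simple qualitative scheme. As one illustrative sketch (the 1-to-5 ratings echo Chapter 2, but multiplying them and the level thresholds below are assumed conventions, not the only valid ones):

```python
def risk_score(probability, vulnerability, impact):
    """Combine the three risk metrics, each rated 1 (low) to 5 (high).

    Multiplying the ratings is one common qualitative scheme; the product
    ranges from 1 to 125. Ratings outside 1-5 are rejected.
    """
    for metric in (probability, vulnerability, impact):
        if not 1 <= metric <= 5:
            raise ValueError("each metric must be rated 1-5")
    return probability * vulnerability * impact

def risk_level(score):
    """Bucket a score into the high/medium/low terms used in the text.
    The cut-offs here are illustrative policy choices."""
    if score >= 60:
        return "high"
    if score >= 20:
        return "medium"
    return "low"
```

Whatever scale you adopt, writing it down as an explicit formula means two assessors rating the same threat will reach the same level, which is the point of a qualitative scheme.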
Identifying Risk

While the first step in managing risk requires identifying vulnerabilities in the environment, it is equally important, especially later in the risk management process, to first take an inventory of the assets owned by the business. For the purposes of our discussion, we'll focus on technology and information assets as well as intangible assets that derive from them, such as company reputation. If you don't accurately know your organization's full asset inventory, you can't correctly gauge the threats and vulnerabilities that can create risk.

Technology assets include: devices (such as servers), workstations, network systems, and removable media; software (both licensed software and applications created by internal developers); and the tools and facilities that protect those assets. Information assets are the business data generated through daily operation and the intellectual property related to the functions of the business. This includes internally developed software code, customer data, financial artifacts—accounting data and financial statements—and product information. The intangible assets that derive from these assets include the reputation of a business with customers, the competitive position of the business within its industry, and the compliance of a business with legal, regulatory, or industry standards.

Once your business has an inventory of all the potentially at-risk assets, you can begin identifying vulnerabilities. A vulnerability is a weakness or gap in the security measures of a business. Some common vulnerabilities include the following:

• Exploitable deficiencies in hardware and software, including newly discovered security vulnerabilities that have not been patched or remediated
• Misconfigured or unconfigured systems that create openings that attackers can exploit
• Inconsistent process execution when building systems
• Weak or insufficient security controls
• Lack of diligence in following security procedures and guidelines However, a vulnerability cannot lead to a risk unless a threat exists that can exploit it. By threat, we mean a circumstance or event that could possibly harm an IT resource either by destroying it, disclosing it, modifying the data, and/or denying service with a denial-of-service (DoS) attack. Threats can be individuals, like hackers, who exploit a vulnerability for their own gain. Beyond individuals, threats can be the common strategies that hackers use that require consideration in and of themselves. These strategies include phishing attacks, social engineering, viruses, spyware or other malware, and, more recently, advanced persistent threats (APT). In an APT (also known colloquially as a “low and slow” attack), hackers try to gain access to a network or device so they can steal data undetected for a long period of time rather than cause immediate damage. Identifying threats is not as easy as we make it sound above. The threat landscape changes rapidly. To keep up requires an extensive amount of knowledge of the current environment and the conditions that could 170
Risk Management | 11
indicate a threat. Third-party companies and some cloud service providers such as IBM offer threat intelligence services to monitor and inform businesses of changes and developments in the threat landscape, which can give an organization a head start in defending against them. There’s a secondary benefit in having a thorough asset inventory: Not only can you identify the most pressing threats, you can identify the threats that you don’t need to worry about. For example, threats that target a type of software that your business does not own pose very little risk to you. Other examples of threats you can ignore include those that target businesses or sectors in which your company is not involved. Even if the threat does target software your organization uses, maybe it exploits a vulnerability that your business has quickly and successfully patched before the threat could be exploited. Eliminating potential threats frees you to concentrate on the ones that can actually damage your assets.
Assessing Risk

In its most elemental sense, risk is the intersection of assets, threats, and vulnerabilities. But that basic definition doesn’t give us the means to measure and compare risks so that we can address the most pressing challenges. In Chapter 2, we talked about how you can measure risk qualitatively on some scale. We used numeric values of 1 through 5 to represent risk levels, but you could use any number scale or terms such as high, medium, or low. You can also measure the level of risk quantitatively, with a more precise number that corresponds to the risk being calculated. Using either qualitative or quantitative methods, you can measure risk with a unit of measurement called the Single Loss Expectancy, or SLE. It represents the amount of loss that an organization would sustain should a specific threat exploit a vulnerability and compromise or damage a specific group of assets. The loss may not only be monetary, as noted above; it can also be reputational. The monetary losses can be immediate, subsequent, and longer term. The costs can reflect the value of the asset lost, the cost to repair a damaged asset, or the amount of a court judgment against a business after the loss occurs. Whether you use qualitative or quantitative risk assessment methods really depends on the level of information required by the person or group who decides where to allocate resources to mitigate risk. If the decision makers need justifiable and nearly exact dollar amounts of loss,
you should use the quantitative method, even though it demands much more effort to complete. Otherwise, if the decision makers just need to understand relative risks, a qualitative approach works. Once you choose a qualitative or quantitative approach, you’ll need to use that method throughout the risk assessment process. Estimating the SLE qualitatively involves classifying the amount of loss sustained as a high, medium, or low dollar amount based on the impact that compromising or otherwise harming an asset can cause. Make sure whoever makes this calculation understands how this specific asset operates in the context of the larger business. Technical personnel have a great sense of how to protect an information asset, but they may not know its full value to the organization. In Table 11.1, we’ve provided some examples of high, medium, and low SLE conditions. Note that different conditions could apply to many business sectors, and these are only a few possible examples.

High SLE conditions:
• Assets that control business processes that must be available and secure for the business to function
• Assets that must be fully recovered within four hours of loss
• Assets that contain sensitive personal information or personal health information as defined by applicable regulations

Medium SLE conditions:
• Assets that, when unavailable, impact important business objectives
• Assets that must be fully recovered within 24 hours of loss
• Assets that contain personal information

Low SLE conditions:
• Assets that must be fully recovered within seven days of loss
• Assets used for development or test purposes only

Table 11.1 Examples of SLE conditions
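To make the qualitative classification concrete, here is a minimal Python sketch of how a few of the conditions from Table 11.1 might be encoded. The function name, parameters, and thresholds are our illustrative choices, not part of any SAP or IBM tool.

```python
# Illustrative sketch only: assign a qualitative SLE level from a few of the
# conditions in Table 11.1. Thresholds mirror the table; the structure is ours.

def sle_level(recovery_hours=None, has_regulated_data=False,
              has_personal_data=False, dev_or_test_only=False):
    """Return 'High', 'Medium', or 'Low' for a single asset."""
    if dev_or_test_only:
        return "Low"
    if has_regulated_data:            # sensitive personal or health data
        return "High"
    if recovery_hours is not None:
        if recovery_hours <= 4:       # must be recovered within four hours
            return "High"
        if recovery_hours <= 24:      # within 24 hours
            return "Medium"
        return "Low"                  # e.g., within seven days
    if has_personal_data:
        return "Medium"
    return "Low"

print(sle_level(recovery_hours=4))        # High
print(sle_level(has_personal_data=True))  # Medium
```

A real classifier would cover far more conditions, but even a simple rule like this forces the evaluator to document which condition drove each rating.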
For the quantitative approach, someone more familiar with the financial aspect of business operations should calculate SLE levels. Here, for each type of asset loss, assign a dollar amount based on a percentage of company revenue, lost sales, litigation costs, or other financial measures. With this approach, make sure to document the assumptions you use to make the loss calculations. Without a general consensus on these assumptions, the values that you assign to risks will be less credible.
Once you know the SLE values for different loss conditions, return to the vulnerabilities and threats outlined above and estimate how likely it is that a given threat can exploit existing vulnerabilities. To help quantify this likelihood, you can use a metric called the Annual Rate of Occurrence (ARO). Determining the ARO of an asset requires deeper knowledge about the threats and vulnerabilities that it faces from a technological perspective. That’s not to exclude the more business-focused asset owners; the ARO is likely to be more accurate when you include that business knowledge. As a result, this process will probably involve more than one person. You can measure the ARO for a vulnerability using the following considerations:
1. How much skill is needed to exploit it?
2. How motivated might an attacker be to exploit this vulnerability?
3. How effective and easily available are the mitigating measures for the vulnerability?
These considerations are easier to evaluate using qualitative measures. For each ARO, you can assign a value of high, medium, or low depending on whether (1), (2), and (3) above are high, medium, or low. Pick the most severe level; if one is high and the others are medium, then your ARO is high for this vulnerability. In the quantitative method of calculating ARO, you can still use the three considerations above, but you’ll need to draw on the expertise of evaluators and actual numerical threat data collected from multiple sources to derive a numerical ARO. Although this method is quantitative in that a number is the end product, there is much more subjectivity in assigning an ARO than an SLE. Experts can and will disagree on the three threat measurements above, and the numerical threat data can vary quite a bit depending on its source. Yet you need a number for the ARO because, quantitatively, it is difficult to make a calculation combining relative values such as high, medium, or low with dollar amounts. When you combine these two measurements—SLE and ARO—you’ll get a metric called the Annualized Loss Expectancy (ALE). This measures the expected cost of the losses attributed to this asset over the course of a year. If you use a qualitative approach, you can work out the ALE value according to the matrix in Table 11.2, which shows the results of combined SLE and ARO measurements in highs, mediums, and lows.
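The “pick the most severe level” rule for a qualitative ARO can be sketched in a few lines of Python. The function name and the numeric severity encoding are our illustrative assumptions, not a standard tool.

```python
# Sketch of the "pick the most severe level" rule for a qualitative ARO.
# Each rating is the High/Medium/Low likelihood contribution of one of the
# three considerations above; the ARO is the worst of them.

SEVERITY = {"Low": 1, "Medium": 2, "High": 3}

def qualitative_aro(*ratings):
    """Return the most severe of the supplied High/Medium/Low ratings."""
    return max(ratings, key=SEVERITY.get)

# One High among Mediums makes the ARO High, as described in the text:
print(qualitative_aro("High", "Medium", "Medium"))  # High
```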
             ARO High    ARO Medium    ARO Low
SLE High     High
SLE Medium               Medium
SLE Low                                Low

Table 11.2 Results of SLE and ARO measurements
The blanks in the table are intentional. To determine the qualitative value, a risk evaluator would have to decide subjectively whether, for example, the combination of a medium and a high results in a medium or a high overall, whether a medium and a low produces a medium or a low, and so on. If you used the quantitative method to assess risk, calculating the ALE is simpler: just multiply the SLE by the ARO. You can use the product of these two numbers to evaluate this risk in comparison with others. When determining where to allocate security resources, you only need to see which risk has the greatest ALE value.
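As a sketch of the quantitative calculation, the following Python fragment computes ALE = SLE × ARO for a few risks and ranks them. All the risk names and dollar figures are invented for illustration.

```python
# Quantitative ALE sketch: ALE = SLE * ARO, then rank risks by ALE.
# Every name and figure below is hypothetical.

risks = [
    {"name": "unpatched web front end",       "sle": 250_000, "aro": 0.5},
    {"name": "lost backup tape",              "sle": 900_000, "aro": 0.05},
    {"name": "phishing-led credential theft", "sle": 120_000, "aro": 2.0},
]

for r in risks:
    r["ale"] = r["sle"] * r["aro"]   # expected annual loss for this risk

# The risk with the greatest ALE gets security resources first.
for r in sorted(risks, key=lambda r: r["ale"], reverse=True):
    print(f'{r["name"]}: ALE = ${r["ale"]:,.0f}')
```

Note how a modest loss with a high annual rate (the phishing risk) can outrank a catastrophic but rare one, which is exactly the prioritization signal the ALE is meant to provide.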
Managing Risk

Now that we have assessed our risks, we can begin the work of managing them. At the beginning of this chapter, we stated that the finite resources available to any business limit the amount of risk mitigation that is possible. So in this step, your organization needs to decide whether each risk is accepted, mitigated, or transferred. Let’s review how we defined these terms in Chapter 2:
• Accepting a risk means you take the full brunt of the loss if a vulnerability is exploited
• Mitigating a risk means closing vulnerabilities, creating countermeasures, and minimizing exposure
• Transferring a risk means finding someone else to pay for it; in general, that means insuring your assets against loss

A myriad of factors go into managing risk. Not all your choices as to whether to accept, mitigate, or transfer risk will be directly related to SLE, ARO, or ALE values, regardless of whether you use quantitative or
qualitative methods to assess your risks. We’ll discuss some of the more abstract factors that go into these choices for the remainder of this chapter. The single biggest factor that determines how you manage an assessed risk is called risk appetite: how much risk does management feel comfortable leaving unmitigated and untransferred? Colloquially, how much of that risk are you willing to eat? Sometimes this risk is what remains after mitigation measures have been taken. It’s very hard to eliminate risk entirely, so at some point, your organization will have to accept some of it. Plenty of organizational factors determine risk appetite: the personalities of the people involved, the nature of the business being managed, the company’s position within its sector versus competitors, the value of the business assets at stake, and more. One of the most complicated factors operating in risk management decisions is change. Any countermeasure taken to mitigate risk requires change, and change is in itself a risk factor. In a volatile environment in which you have little certainty regarding business conditions, the threat landscape, or financial stability, your risk appetite normally decreases. In some cases, though, change can increase risk appetite, but only if your organization considers the change positive. For example, if an organization phases out an outdated technology, the threats and risks related to it may be considered less important. Insightful organizations look to the future. If they have a good record of predicting performance, they can influence risk managers to increase or decrease risk appetite based on what may happen in technology or the business years from now. Change is an important consideration, but sometimes hard to take into account. Related to change, your risk management strategy needs to consider legal and regulatory environments and audit or certification standards.
In a heavily regulated industry with complex laws and punitive enforcement, organizations may need to mitigate or transfer more risk. Failure to do so could affect the organization’s very survival or future success. Examples of industries with risky regulatory environments include healthcare-related businesses, financial institutions, and governmental agencies. Within these realms, the value of asset loss is so high, and the threats so correspondingly menacing, that almost any vulnerability must be addressed.
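As an illustration only, the accept/mitigate/transfer choice can be sketched as a simple cost comparison. Real decisions also weigh risk appetite, regulation, and change, as discussed in this chapter; the decision rule below and its inputs are our invented simplification.

```python
# Hypothetical decision rule: pick the cheapest treatment whose residual
# exposure fits the organization's risk appetite. Real-world choices also
# factor in regulation, vendor requirements, and credibility, per the text.

def treat_risk(ale, mitigation_cost, insurance_premium, risk_appetite):
    """Return 'accept', 'mitigate', or 'transfer' for one assessed risk."""
    if ale <= risk_appetite:
        return "accept"                   # the loss is one we can eat
    if mitigation_cost <= insurance_premium:
        return "mitigate"                 # cheaper to close the vulnerability
    return "transfer"                     # cheaper to insure against the loss

print(treat_risk(ale=40_000, mitigation_cost=15_000,
                 insurance_premium=25_000, risk_appetite=10_000))  # mitigate
```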
Some certifications or audit standards, such as International Organization for Standardization (ISO) certifications or unqualified Service Organization Controls (SOC) 1 and 2 audit reports, require that you mitigate certain vulnerabilities and the risks they produce. For cloud providers, these certifications are table stakes, so the mitigation countermeasures they describe must be met. If an organization loses these certifications, it may suffer a loss of reputation. In the case of industry standards, such as the Payment Card Industry Data Security Standard (PCI DSS), not qualifying could prevent an organization from conducting business or receiving payment for its products and services. You need to strongly consider mitigating the risks that come under the jurisdiction of these standards when deciding how to manage them. There are also technical risk management factors to consider. Some software or hardware vendors require that you patch specific vulnerabilities or use particular software versions in order to receive support for their software. Some changes may violate the license terms and cost you support, or worse. Technical assets are too valuable to businesses today to ignore these vendor requirements when managing risk. A subtler consideration is how credible the risk assessments, their supporting evaluations and data, and the personnel who produced them are in the eyes of management. We cannot overstate the importance of the relationships security and compliance personnel need to have with executive management. Executives make the risk management decisions and allocate precious resources. Are the security and compliance personnel seen as enabling business objectives and knowledgeable of priorities? Or are they the ones seen as getting in the way of change? Does management think they cry wolf and claim threats where there are none? Do they exaggerate vulnerabilities and threats? Past performance will determine the answers to these questions over time.
Whenever security and compliance staff interact with executive management, they should keep in mind how they are seen in the context of the organization as they make risk management recommendations.
Conclusion
While we’ve discussed all the elements above separately and distinctly, in reality they all interact and commingle. This makes risk management one of the most complex topics in security. Still, many security practitioners consider risk management the foundation of any security management system. This chapter’s intent has not been to provide a comprehensive compendium of every factor or combination of factors, but rather to start you on the road to understanding risk and creating a secure SAP cloud environment. In the next chapter, we’ll cover the optional security measures that you could add to your cloud environment. Most of these require contracting with third-party vendors, though some may be purchased from your cloud provider. These services range from vulnerability testing to overall security governance for your vendors. They are optional, but depending on your risks, they may give you the peace of mind that your assets are secure.
CHAPTER 12

Complementary Services

At this point, you should have a pretty good idea of the security challenges that your SAP system will face in a cloud environment. Cloud-based SAP systems offer a lot of benefits, such as scalable resources, flexible pricing, and speed to market. But the cloud is a complex network of virtual servers, and securing that complexity can pose new and unfamiliar challenges. When you consider putting business-critical data in the cloud, you have to have a good strategy around it. The discussions of threats and countermeasures, along with the business-focused overview of risk management, should give you the start of that plan. But you may find that you want to add controls beyond those that your potential cloud provider offers. You may find that you don’t have the appetite for the risks to which you may still be exposed, or the regulatory environment may require something beyond what we’ve discussed. That’s where complementary services come in. Complementary services are security and management functions provided as an optional service by your provider or by a third-party vendor. Cloud services are most commonly a la carte—you pay for what you need. Providers sometimes offer optional services because they address the risk profiles of specific customers and would not be cost effective or needed as base services. Sometimes what you need must be acquired from vendors other than your provider. These vendors can add additional controls to your environment to defend against attacks, like transparent encryption or data leak prevention, or they can take an active, preemptive role in securing your environment, like penetration testing or threat intelligence services. We’ve touched on some of the possible complementary services throughout this book; in the remainder of this chapter, we’ll detail a few of the possible services you could contract
for, either with software or hardware vendors or with consultants. This is not intended to be a complete list of what’s possible; instead, view it as an introduction to what’s available. We’ll also talk about the risks associated with third-party vendors. Any new person, company, or software that you introduce into your cloud environment also introduces risk. That’s not an argument against third parties; it’s a reminder that managing risk is always a balancing act between business needs, costs, and the potential losses any breach could incur.
Provider Optional/Third-Party Solutions

SAP has a long history of incorporating and certifying third-party solutions. Sometimes that means it acquires a solution and rebrands it as an integrated SAP product. Sometimes that means a third party passes SAP certification, which gets the company listed in the SAP-certified solutions directory. SAP does the same with hardware, testing specific hardware configurations to determine how SAP software would perform on them. SAP offers another step beyond this for solutions that support, but don’t necessarily extend, the SAP system itself. For these solutions, SAP provides technical reviews to make sure that those products work well within SAP environments. An SAP system can be a resource hog and won’t necessarily play well with others, so ensuring that a third-party solution can run in a co-environment—that is, on the same physical or virtual server as an SAP product—can remove one more worry. At IBM, our CMS cloud-managed service manages SAP environments according to SAP standards. We have had a strong partnership with SAP for many years. Last year, we announced a plan to co-innovate solutions with SAP, so we take care to ensure that any additional services we include with an SAP cloud customer follow SAP’s guidelines. Third-party solutions could affect our certification, and as these solutions constantly change to reflect their customers and the computing landscape, those changes could affect our certification as well. Our provider optional offerings often come from internal IBM divisions. These range from completely managed and preemptive security services to more a la carte offerings, such as penetration testing and threat intelligence. We’ll talk about what those services entail individually in this section, as well as cover some of the third-party protections that you might want to include that we don’t provide.
Some of these services you could set up yourself. But we’ve found that it’s not worth our customers’ time to maintain home-grown tools and processes for activities that lie outside their core competencies. That’s what contracting with a cloud provider is all about: outsourcing your IT infrastructure to experts and taking economic advantage of that scalable infrastructure. So it makes sense to increase the value of that environment by outsourcing additional security services, especially if you don’t have the in-house expertise. In a cloud environment, it’s easy to add additional services to the application layer on virtual machines (VMs) that you control. These services operate on top of all the other cloud security controls, including the network and hypervisor. However, if you do add any of the services that we discuss below in a co-environment situation, let your provider know so it can maintain accurate test and support plans.

Preemptive Security

Many of the controls we’ve discussed in this book offer reactive or defensive measures that prevent an attacker from exploiting a vulnerability. But some third-party providers offer services and products that take a more proactive approach, attempting to discover and act on threats and vulnerabilities before they hit your cloud environment. The networked world is a complex and ever-shifting landscape of innovative threats and newly discovered vulnerabilities, so the more lead time you get on them, the better. Your provider is responsible for providing you with standard security on its infrastructure, but you are responsible for protecting your data. The provider can’t protect against vulnerabilities in software that it doesn’t manage, nor can it defend against threats it doesn’t know about. Threat intelligence services report on newly discovered exploits, vulnerabilities, and attack vectors. With this knowledge, you can allocate security resources according to how current threats develop. The best of these services employ security researchers who probe existing software and hardware for vulnerabilities in the hope of discovering the dreaded zero-day exploit before the bad guys do. In addition, they’ll monitor the work of other researchers and black hat chatter in the darker corners of the web. These people are experts in the security world and provide valuable and usable data to security professionals worldwide. The next step up from knowing about vulnerabilities is discovering whether they affect your environment. Vulnerability scanning services and software will examine your applications and/or your platforms (such as the operating system and database) to find and classify known vulnerabilities. These vendors often take their information from threat intelligence services, so not only will you have the best information about the threats that could specifically affect you, you can get recommendations as to how to close these security holes. Vulnerability scans can provide early warnings on misconfigured software, critical patches, and potential weaknesses in your control scheme. If you want to go even further and subject your security infrastructure to real-world scenarios, you can hire penetration testing consultants. These white hat hackers will attempt to bypass your existing security controls to see how far into your network they can get. By attempting to exploit and subvert your system from the mindset of a malicious attacker, they can demonstrate exactly how your system could be compromised. Once they identify these security gaps, they can give you the tools to plug them. Nobody knows how to close a hole better than the person who just jumped through it. Note that some regulations require either vulnerability scanning or penetration testing. Many of the regulations and standards that specify individual security controls include vulnerability assessment requirements, which you can meet through either vulnerability scans or penetration tests.

Authentication

If you find the existing SAP authentication methods insufficient or too complex, you may want to seek an outside authentication service or provider. These can provide additional log-in capabilities or enable security on customized interface applications. Through trusted authentication methods, you can add a layer of protection that passes credentials to the servers. Third parties can add multi-factor authentication methods outside of the SAP two-factor mobile application.
Your cloud provider will likely not be able to add this functionality, so if it is something you’re interested in, a third-party vendor will likely be your only choice. Most of these vendors use mobile-based one-time code methods, though others can connect with radio-frequency identification (RFID) readers or biometric scanners.
Data Traffic Controls

Within the cloud-based network on which your SAP system resides, you may want to place additional controls on the flow of data. A good cloud provider will secure traffic through firewalls and network segregation. On your end, you should always use Secure Sockets Layer (SSL) or Transport Layer Security (TLS) to transfer data, preferably over VPN-secured lines. But there are other traffic controls you can add to secure data going into or out of your environment. You can, as we mentioned in Chapter 6, add encryption beyond what SAP provides. Most of the time, this will be transparent data encryption, which runs in the background, automatically encrypting anything that passes through its protocols. This can be an additional safety net on top of TLS for applications installed on the SAP environment, or it can be a wall that protects every single piece of data going in or out. You may find this useful if your SAP system includes additional objects or integrations that transfer information—for example, an application that sends emails automatically whenever it processes a customer order. Some data is so sensitive as to merit additional measures to prevent it from leaking out, either inadvertently or intentionally. Data leak prevention tools go beyond just blocking outside threats; they can monitor sensitive data in use, in transit, and at rest to make sure that both outsiders and insiders aren’t sending your data to points unknown. Insider data leaks are difficult to prevent with standard controls. A well-configured data leak prevention tool can spot suspicious activity, often across all network endpoints and applications. The downside of these tools is that they often require quite a bit of configuration and customization to work effectively.
If your SAP system processes credit card information, you may want to consider a tokenization service to increase the security of cardholder information. Tokenization replaces sensitive credit card information with a token string that can be used to retrieve the card information from a central database, so the card data itself never has to reside in your system. It helps you meet the Payment Card Industry Data Security Standard (PCI DSS) and secure sensitive information. Because an SAP system is business critical for so many companies, many of these tokenization solutions are integrated into SAP products.
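To illustrate the idea, here is a toy token vault in Python. A real tokenization service adds encryption, access controls, and PCI-scoped infrastructure; this sketch only shows the core mapping between tokens and card numbers, and every name in it is our own.

```python
# Toy token vault: the SAP system would keep only the token, while the real
# card number lives in a separate, hardened store. Illustration only.
import secrets

class TokenVault:
    def __init__(self):
        self._store = {}                       # token -> card number

    def tokenize(self, card_number):
        token = "tok_" + secrets.token_hex(8)  # random token, not derivable
        self._store[token] = card_number       # from the card number itself
        return token

    def detokenize(self, token):
        return self._store[token]              # only the vault can map back

vault = TokenVault()
token = vault.tokenize("4111 1111 1111 1111")
print(token)                                   # e.g., tok_9f2c4a1b7e3d5c60
```

Because the token is random rather than derived from the card number, a stolen token is useless without access to the vault, which is what takes the surrounding systems out of PCI scope.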
In Chapter 4, we talked about distributed denial-of-service (DDoS) attacks, which are sudden spikes of traffic designed to prevent access to a networked asset. While firewalls and other standard network defenses can sometimes stop these, at other times the traffic may overwhelm the firewall itself and bring the networked asset offline. You can add managed DDoS defenses: software or cloud-based services that automatically prevent these attacks from affecting your system’s performance or availability. Some of these are pure software solutions, while others are backed by live specialists who respond to attacks and shift harmful traffic to safe zones.

Event Detection and Response

Even with the best security controls, some threats may get into your network. If and when that happens, you need some way to detect, identify, and respond. Several third-party applications and services can handle this, ranging from signature-based monitoring to after-the-fact forensic investigation of security events. The simplest of these we mentioned in Chapter 4: network intrusion detection and network intrusion prevention systems (NIDS and NIPS, most commonly). These should be part of any cloud-based security toolkit, but are not always included in the standard security package. Basic versions of these include anti-virus and anti-malware software. High-end versions can be network-based appliances that look for known attack signatures across the entire network. Both high- and low-end versions should catch known intrusion methods when they hit the host or network on which they are based. Existing network infrastructure can generate a lot of log information—every hit on a firewall, every packet routed, and every NIDS signature match. Security Information and Event Management (SIEM) software collects and aggregates all these logs in a single location and combines the information that they produce.
It’s not quite intrusion detection, but it will help you match events across disparate infrastructures and systems to recognize a single security event. These applications help your IT support personnel reconstruct security events after the fact to determine what happened. This can reveal whether you’ve experienced a breach that needs to be disclosed or just have a nosy search-indexing robot sniffing the wrong ports.
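A minimal sketch of the kind of correlation a SIEM performs: grouping log entries from different sources by origin IP within a short time window, so that separate firewall and NIDS records surface as one candidate incident. The log format, the window, and the addresses below are invented for illustration.

```python
# Toy SIEM-style correlation: cluster events per origin IP when they fall
# within WINDOW seconds of each other. All data here is hypothetical.
from collections import defaultdict

events = [  # (epoch seconds, source system, origin IP, message)
    (100, "firewall", "203.0.113.7",  "blocked port scan"),
    (130, "nids",     "203.0.113.7",  "signature match: SQL injection probe"),
    (500, "firewall", "198.51.100.2", "blocked single packet"),
]

WINDOW = 60  # seconds between related events

by_ip = defaultdict(list)
for event in sorted(events):          # sort by timestamp
    by_ip[event[2]].append(event)

correlated = []
for ip, evts in by_ip.items():
    cluster = [evts[0]]
    for e in evts[1:]:
        if e[0] - cluster[-1][0] <= WINDOW:
            cluster.append(e)         # close enough in time: same incident
        else:
            correlated.append((ip, cluster))
            cluster = [e]
    correlated.append((ip, cluster))

for ip, cluster in correlated:
    if len(cluster) > 1:              # multiple systems saw the same origin
        print(ip, "->", [e[1] for e in cluster])
```

Running this flags 203.0.113.7, which appears in both the firewall and NIDS logs within the window, while the isolated single-packet block is left alone. Production SIEMs apply far richer rules, but the grouping step is the same in spirit.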
These consolidated log management tools are more useful than you’d expect. Plenty of regulations require that you keep an archive of logs and audit trail information. By storing them in a single secure area, you can access them faster for archiving or analysis. We recommend that all our customers use these services; they cause very little impact on system performance and can simplify your support team’s job tremendously. More comprehensive SIEM services include personnel who monitor events as they happen in your system. They then correlate the events in process in nearly real time to spot intrusions. A SIEM with real people behind it helps locate advanced persistent threats before they have a chance to cause serious damage. These products and services help you discover and react to security incidents on your own. However, if you get hit by a significant security event, you may need additional expertise to react and recover in a timely manner. You may consider contracting with emergency response services, which provide end-to-end incident management and recovery services. They can often combine forensic investigation, coordinated recovery efforts to get you back online fast, and security analysis to learn from the attack and prevent it from happening in the future. This service requires a very specialized skill set; at IBM, our CMS team has worked with our own internal teams to handle any potential incidents. These teams are incident experts and can effectively coordinate delivery teams across the spectrum of services to solve issues.

Disaster Recovery Services

Disaster recovery services cover anything that helps you get back in business after a catastrophe. It could be as simple as weekly tape backups or as complex and involved as business continuity consultants who help you build an entire plan to survive whatever unexpected calamity strikes. Your provider may offer basic recovery services in the form of backup protection for your cloud environment. But the default version of that may only include backups of the operating system and configuration, not the database. You’ll almost certainly want to have some backup copies of your database in case of catastrophic failure. In this case, you may be able to contract with your cloud provider for data backups as an additional service.
However, you may find that you need to store backup databases for longer than the provider allows. Or you may want to take advantage of specialized backup services through another cloud-based provider. In this case, you can contract with these providers to create secure channels through which you can transfer your data to their data warehouses for longer storage. If you need to archive data—that is, keep copies of it for potential long-term reference, not as a short-term failsafe—you can find providers who will store this data for you at a lower price point than backups. Backup storage has to be available for reading and writing; archive data may be read only, stored on cheaper, high-capacity media with lower I/O transfer rates. It may even be stored as offline object-based storage—a box of tapes in a storage warehouse. Some regulations require that you keep archived data for a specific amount of time. The Sarbanes-Oxley Act requires that publicly traded companies store all business records and communications for five years to prove that they have not been falsified. Companies that deal with personal health information, and any parties that may handle this data, including cloud providers, must keep that data securely for six years. The PCI DSS mandates that you don’t store cardholder data unless absolutely necessary, but requires that anyone transmitting or storing card data maintain audit trails for at least one year. If you want additional security and control of your backup media, you can hire secure transport services that will physically pick up your backup tapes from their origin and deliver them to your storage facilities. These services should provide full chain-of-custody documentation so you know exactly who handled the media and when. You can engage them to store the media for you as well. For data that must be stored for a longer period of time, but not accessed, this can be an affordable alternative to always-on cloud storage.
You can combine these services into a complete disaster recovery program or fold them into a business continuity plan. Both of these can involve some amount of consultant-led planning, scenario testing, and gap analysis.
Disaster recovery services tend to focus on storing and recovering data in the event of total failures, while business continuity consultants go beyond data preservation to craft policies and plans that ensure the business can continue to operate under drastically degraded conditions. Both provide best practice recommendations and customized resiliency services that help your organization maintain operations after a worst-case scenario.
Third-Party Risks
Third-party services are not without their downsides. Every new product or service you add to your cloud environment increases costs and overall complexity and can decrease system performance. Every new piece of software needs computing resources, every service needs additional user access points, and both need management attention. When considering new third-party vendors to complement your cloud security, weigh the benefits against the risks to ensure you stay within business objectives in a cost-effective manner. Because all these complementary services go above and beyond the standard cloud provider service—often, they are provided by entirely separate vendors—they incur additional costs. Consider the value of the assets that these providers may protect. Are you spending more money than they are worth? Is this an efficient addition to your security controls, or does it add only slight benefits? If you find that the cost of the security controls exceeds the value of your asset and still feel uncomfortable about leaving it unsecured, consider that you may be undervaluing your asset, especially in terms of its intangible value. An SAP system can be testy, performance-wise, as we’ve discussed. Additional co-environment software can draw resources away from your SAP products. This could hamper performance, causing delays and costly work slowdowns. You could instead put these additional security measures on separate VMs, depending on their function, but that, too, will increase your cloud hosting costs. Both of these potential cost increases must be factored into the risk countermeasure calculation. The more services and vendors you add to your cloud environment, the more complex your security landscape becomes. Each piece of software that you add offers attackers a new potential attack
surface, as it may have unpatched vulnerabilities. Those software products need to integrate with your change management plan and may increase the amount of testing necessary. Each new vendor or service you provide with access creates new user entry points, which means new ways that malicious individuals or code could piggyback on their access. Both of these need to be added to your security audits and checklists. These services also pose more complex management problems. Your vendors may have different security levels than your organization, or offer inconsistent service levels in their contracts. Unless they agree to follow your policies, they could jeopardize your certifications and regulatory compliance. If something goes wrong, they make determining the cause and resolution more difficult—as we say, “You no longer have one throat to choke when things go wrong.” As a consultant, one of the authors of this book worked as a security service vendor for a large auto manufacturer. This company had around 20 third-party vendors providing security, which can be a nightmare to coordinate. Its solution—and yours, if your environment grows complicated—was to hire our intrepid author’s company to additionally provide security policy governance and oversee the security policies and procedures of all vendors. In assembling a team of complementary services, you may wonder whether to hire individual companies for each service or a larger, overarching company to provide all security. Both have their benefits and drawbacks. Individual vendors may be specialists, able to provide deep expertise in an area that requires it, while increasing your system complexity. A single or umbrella provider will simplify your governance and may provide bundling discounts, while possibly moving more slowly due to increased corporate bureaucracy. If you have strict regulations that apply to your security requirements, you may want to hire a compliance specialist to handle that. 
For us, if we had to comply with PCI DSS standards, we’d hire someone to do that for us. Compliance with regulations is always the customer’s responsibility. Your provider may be able to help, but it’s not its responsibility. Failing to comply can be costly, so weigh the risk of non-compliance against the cost of having a specialized vendor cover it for you.
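One common way to put numbers on this kind of weighing—whether a control or a specialist vendor is worth its price—is annualized loss expectancy (ALE): multiply the expected cost of a single incident by how often it is expected to occur per year, then compare the risk a control removes with what the control costs. A minimal sketch, with invented figures:

```python
# Classic risk-analysis arithmetic: ALE = SLE x ARO.
# All dollar amounts and probabilities below are invented for illustration.
def ale(single_loss_expectancy, annual_rate_of_occurrence):
    """Annualized loss expectancy for one risk scenario."""
    return single_loss_expectancy * annual_rate_of_occurrence

def control_is_justified(ale_before, ale_after, annual_control_cost):
    """A control pays for itself if the risk it removes exceeds its cost."""
    return (ale_before - ale_after) > annual_control_cost

ale_before = ale(200_000, 0.10)  # $200k incident expected once a decade
ale_after = ale(200_000, 0.02)   # control cuts likelihood to once in 50 years
print(control_is_justified(ale_before, ale_after, 10_000))  # True
```

If the same control cost $20,000 a year, the calculation would flip to False—which is exactly the situation where you should double-check whether you are undervaluing the asset before declining the control.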
Conclusion
You have a lot of options as to what extras you add to your cloud environment’s security portfolio. These can deliver expert controls and knowledge or simple convenience. But it’s always a tradeoff: the more people and software with privileged access, the more complicated security becomes. Those are the tradeoffs you need to consider when managing potential security risks. As this is the last chapter, we hope you gained a good understanding of the technologies, challenges, and countermeasures that surround cloud-based SAP systems. If you are still shopping for a provider, you should be able to ask the right questions to see how its service stacks up against the competition. Cloud computing always has risks, but armed with the knowledge in this book, you should be able to manage them well enough to take advantage of the economic benefits that the platform offers. Happy computing!
GLOSSARY
Above the hypervisor
Part of the division of a server in a cloud environment. This portion includes everything that runs in a virtualized environment, including the guest OS, your SAP system, and some virtualized networking servers
Active Directory
A Microsoft directory service for Windows deployments. Along with other capabilities, it acts as an authentication and authorization utility for Windows users and servers collected into a construct of domains. Active Directory can also provide a broad set of security relevant administrative tools such as certificate management, federated services, and domain name service
AD
See Active Directory
Advanced Encryption Standard
A popular encryption algorithm used by the U.S. government and other countries
AES
See Advanced Encryption Standard
Anomalous flows
Patterns in data transfers and interactions that are outside of the normal flow of operations
Asymmetric keys
Used in public key encryption. Indicates that encryption and decryption use different keys
Attack surface
The total number of potential entry points where an attacker could exploit vulnerabilities
Audit trail
A log of actions performed in a system and the user who performed them. In the event of problematic data states, the audit trail allows administrators to trace the actions that led up to the event and potentially identify compromised users.
Bare metal server
A computing environment where the virtual machine operates directly on the server hardware, instead of being managed by a hypervisor. In cloud computing services, contracting for a bare metal server will provide you with the resources of the entire server, which you manage yourself
Bring your own device
An IT policy that allows users to connect their personal mobile devices and laptops to the organization's network
Brute force attack
An attack method that attempts to guess passwords or keys by trying every possible value
Bursting
When an SAP process exceeds its roll area memory and takes up its extended memory allotment
Business Continuity Plan
A fully realized strategy, which includes a disaster recovery plan, to ensure that the people, assets, and business functions can continue in the event of some level of loss
BYOD
See Bring Your Own Device
Chain of custody
Documentation that details the complete list of people who handled physical media that may contain sensitive data or data which may be included in a legal proceeding or forensic investigation
Ciphertext
Encrypted data, which cannot be read without passing it through a decryption function using the proper key
Cloud
A virtualized computing environment where individual clients can buy processing, storage, and memory in variably sized increments and access them through a network connection
Cloud-Managed Services for SAP
IBM's managed cloud offering, with special focus on hosted SAP solutions
CMS
See Cloud-Managed Services
Complementary services
Software or services above and beyond the standard offerings from a cloud provider; these can be offered by either the cloud provider or a third-party vendor
Controls
Information security countermeasures designed to reduce overall risk
CRAC Unit
Stands for computer room air conditioning; a high-powered cooling unit used in data centers
Data at rest
Data assets held in storage media
Data bleed
When data is accidentally exposed to other virtual machines or network users
Data center
The location in which a provider or customer houses the physical hardware that composes a cloud environment
Data in motion
Data being transferred between one endpoint to another over network infrastructure
Data in use
Data being actively used, transmitted, and modified by an application
DDoS
Short for distributed denial of service. A denial of service attack launched from multiple origin computers, often bots under the control of malware
Degauss
The process of scrambling hard disk drives using a powerful magnet
Denial of Service
When an attacker attempts to flood a networked endpoint with traffic in order to prevent others from using it or take it offline
Dictionary attack
A brute force attack that uses a list of known strings in an attempt to access a password
Disaster Recovery Plan
A documented process and policies to help an organization recover data and IT assets from disasters quickly and efficiently with minimal loss. See also Business Continuity Plan.
DNS
See Domain name services
Domain name services
Translates URL names into IP addresses according to the most recently propagated information
Encryption
A process for encoding data or communications channels through the use of mathematical algorithms to ensure that only authorized persons or resources can view and/or manipulate the data. Encryption is a primary utility used in ensuring confidentiality and integrity on computing systems
Encryption key
A piece of data used as part of an encryption algorithm to uniquely scramble the target text
Extended memory
An extra memory allotment which SAP processes can use if needed, a process called bursting
Firewall
A network or host boundary control mechanism. Firewalls provide a granular level of control on network and data flows to a specified destination. These controls can be established based on destination, port, protocol, flow characteristics, authentication, and authorization and can often be implemented to perform some degree of threat identification and prevention
Frames
Data packets transported over OSI layer 2 to computers on the same local network
Handshake
A negotiation process between two data transfer points where they determine what algorithms they will use to connect and secure the data transmission
Hard disk drive
Storage media that uses a magnetized metal plate to read and record data using a moving arm
Hardened
Describes a computing environment that has been protected against attacks
Hash
An algorithm that encodes data of any length into a fixed-length value. Unlike encryption, hashing is one-way and cannot be reversed with a key
HDD
See Hard Disk Drive
HIDS
Stands for Host-based Intrusion Detection Systems, commonly deployed as a specific software function on a specific host computer endpoint. HIDS analyze, monitor, and detect specific connections and network flows to an endpoint and alert administrators based on specified conditions
HIPAA
The Health Insurance Portability and Accountability Act, which mandates security requirements around sensitive personal health information
HIPS
Stands for Host Intrusion Prevention Systems. Similar to HIDS, this is a software-based functionality deployed on target computers that analyzes, monitors, detects, and prevents target attacks based on policy. It can alert administrators of threats; however, the key aspect of HIPS is the policy enablement to provide the active prevention of inappropriate flows or interactions with the endpoint
HITECH
The Health Information Technology for Economic and Clinical Health Act, which mandates security requirements around sensitive personal health information
Host
The physical computing resources on which virtual machines run.
Hybrid Cloud
A cloud environment that is made up of two or more other IT environments, such as public or private clouds or non-distributed on-premise computers.
Hypercalls
Calls that a guest operating system makes to the hypervisor, analogous to system calls, used with certain types of hypervisors.
Hypervisor
A software or hardware virtualization technology that allows a specified physical computing resource to be partitioned into many virtual computing resources. The hypervisor isolates each VM from the others in both computing resources and processing cycles and allocates resources among them.
Hypervisor and below
Part of the division of a server in a cloud environment. This portion includes everything that enables virtualization, including the hardware and networking infrastructure.
I/O interrupt
The mechanism by which an installed device, like an Ethernet card, signals the processor that it needs attention.
IaaS
See Infrastructure as a service.
Information security management system
A framework of policies and procedures an organization creates and uses to understand its risks and protect the confidentiality, integrity, and availability of its data.
InfoSec
Short for Information security.
Infrastructure as a service
A cloud computing layer that provides increments of computing power without the customer having to purchase any hardware.
International Organization for Standardization
An international body that creates standards documents for a wide range of industries and processes, including several about information security. See also ISO.
Intrusion Detection System
Any software, hardware, or service that attempts to identify intrusion attempts by matching them with known signatures and then alerting administrators.
Intrusion Prevention System
Any software, hardware, or service that attempts to identify and prevent intrusion attempts by matching them with known signatures.
IP address
The unique string of numbers that identifies a computer connected to a network using the Internet Protocol (TCP/IP). This value will be assigned by the network controller that the computer connects with.
IPSec
A transparent encryption method that secures TCP/IP packets in their entirety. It's often used to enable VPN connections.
ISMS
See Information security management system.
ISO
International Organization for Standardization, an independent, non-governmental, international organization whose members bring together experts to share knowledge and develop voluntary, consensus-based, market-relevant international standards that support innovation and provide a benchmark for businesses engaged in international trade. There are approximately 21,000 standards, of which the most relevant for this discussion is the ISO 27000 series pertaining to information security management systems.
Kerberos
A single sign-on authentication protocol that verifies identities using secret keys.
Kernel
The central portion of the operating system that controls the operation of all other parts.
Latency
The amount of time delay between a computing or network request and its receipt.
LDAP
Lightweight Directory Access Protocol, an open standard application protocol for maintaining and accessing directory information. It is commonly used for user and system authentication and authorization activities.
Logging
A software feature that makes notes when certain activities occur in that software. Can help identify security events.
MAC address
The media access control address, a unique identifier assigned to the network interface for a single computer or virtual machine. Unlike the IP address, this identifier will not change unless a virtual machine is deprovisioned or the physical networking hardware is replaced.
Malware
Any program whose intent is to cause some sort of harm to the system on which it is installed, either by stealing data, configuring botnets, or other actions. These programs often propagate themselves as viruses and use subterfuge to install on target computers.
Man-in-the-middle attack
When an attacker inserts themselves in between you and your data’s destination, intercepting anything sent.
Managed infrastructure
A type of cloud computing service where all physical resource allocation and maintenance is handled by the provider.
Managing infrastructure
The process of allocating resources to virtual machines.
MPLS
See Multiprotocol label switching
Multiprotocol label switching
A private network circuit between two endpoints set up by a telecommunications provider. MPLS for short.
Multitenancy
The condition where the virtual machines of more than one cloud customer use resources located on a single piece of hardware.
NDA
See Non-disclosure agreement.
Network segmentation
The process of isolating networked servers and other appliances from each other into distinct subnets.
NIDS
Stands for Network Intrusion Detection System. A software-based or hardware appliance that monitors an entire network infrastructure for known attack signatures, then alerts an administrator. See also NIPS, HIDS.
NIPS
Stands for Network Intrusion Prevention System. A software-based or hardware appliance that attempts to identify and stop malicious attacks over an entire network infrastructure.
NIST
National Institute of Standards and Technology. A governmental agency within the U.S. Department of Commerce. The agency was established to create and maintain standards to measure the effectiveness of various scientific and business processes. NIST has published 970 cybersecurity articles and standards to date and has established a national Computer Security Resource Center. Its cybersecurity framework is the foundation for standards required for securing government data and is a best practices guide for many businesses and industries placing a high priority on security of technology assets and data.
Non-disclosure agreement
A contract between two entities that ensures that one party will not disclose sensitive information given during the course of employment or business.
Notes
What patches, upgrades, and feature releases are called in SAP. These come with a short article that explains them and their effects.
On-premise
A computing environment located in your facilities. This environment can be virtualized (an on-premise cloud) or a traditional physical server infrastructure without virtualization.
OSI Network Model
Short for Open Systems Interconnection model. A model that describes the layers on which information travels on telecommunication networks.
Overcommitting
When a hypervisor allocates more resources to the VMs under its control than is physically available.
OWASP
Stands for The Open Web Application Security Project, an online community that researches web application security.
PaaS
See Platform as a service.
PCI DSS
The Payment Card Industry Data Security Standard, which an organization must comply with if it handles sensitive cardholder data.
Penetration testing
A service that tests your security controls by attempting to bypass them in live situations. They can demonstrate exactly how your system could be exploited
PHI
Short for personal health information. Security around this information is strongly regulated by laws like HIPAA and HITECH in the US.
Phishing
An attack vector that attempts to fool its targets into installing viruses or other malware by sending emails that look legitimate.
Ping
To query a URL or IP address to determine if you can connect to it.
PKI
See Public key infrastructure.
Platform as a service
A cloud computing layer that provides an operating system environment without needing the computing hardware to run it.
Private Cloud
A cloud environment where computing resources are reserved for a single customer and isolated to prevent multitenancy issues.
Production server
The live SAP system that users run workloads in on a daily basis.
Public Cloud
A cloud environment where computing resources are shared between customers.
Public key encryption
A cryptographic method that uses asymmetric keys in which the encryption key is widely available, while the decryption key is closely guarded.
Public key infrastructure
The roles and storage areas used to manage shared keys.
QA
Stands for quality assurance. This is the process of testing software and system changes before they go into live production servers.
Rainbow table attack
A brute force attack that attempts to guess passwords by matching precomputed hash values against the hashed passwords stored on the target server.
Redrum
Data Destruction
Redundancy
Disaster protection controls that use duplicate computing resources, utility cables, and/or data centers to ensure continued operation in the event of catastrophic failure.
Ring 0
The most privileged layer within an operating system, which has direct access to hardware memory and CPU processing threads.
Roll area
The initial amount of memory an SAP system assigns to a process.
Rootkit
Malware that attempts to silently gain access to privileges that it otherwise wouldn't, for example, bypassing user authorization or administrative rights protections.
Router
A networking device that directs data in motion traffic to the correct endpoint. These devices can also provide firewall, port forwarding, or logging capabilities.
RSA algorithm
A common asymmetric key algorithm.
SaaS
See Software as a service.
Salt
Arbitrary data added to an encryption process to frustrate attackers.
SAML
A single sign-on authentication protocol that uses XML-based data exchanges to validate identities.
SAP basis
The technical administration layer of an SAP system, which links the SAP applications to the underlying operating system and database.
SAPS
SAP standard application benchmarks. 100 SAPS are equivalent to 2,000 fully processed order line items per hour.
Sarbanes-Oxley
A US law that specifies security requirements for publicly traded companies.
Script kiddie
A low-skill hacker who uses repeatable sets of commands, called scripts, to search for and exploit weaknesses on Internet-connected computers
SDLC
The software development lifecycle
Secure Network Communications
An SAP protocol for remote function calls that integrates cryptographic capabilities
Secure store in the file system
An encrypted location in your SAP implementation that stores keys and other sensitive data
Security Information and Event Management
Software and/or services that collect and aggregate logs files in a single location and combine the information that they produce into a single database. Some of these services include personnel to monitor event activity in real time
Service-level agreement
A contract between a cloud provider and customer that defines services, responsibilities, and uptime expectations
SIEM
See Security Information and Event Management
Single sign-on
An authentication method that stores some sort of validation token or uses trusted identities to allow users to sign on automatically to their SAP system after an initial sign on using user name and password
SLA
See Service-level agreement
SNC
See Secure Network Communications
SOC 1
Audit report that assesses an organization’s compliance with controls contained in the Statement on Standards for Attestation Engagements (SSAE) No. 16, developed by the American Institute of Certified Public Accountants (AICPA). It measures the strength of security controls on financial information assets
SOC 2
An audit report on controls at an organization relevant to the AICPA Trust Principles of security, availability, processing integrity, confidentiality and/or privacy. The report measures the strength of security controls on non-financial information.
Social engineering
An attack method that manipulates the people involved in information security instead of the technology.
SoftLayer
A flexible and powerful cloud hosting provider acquired by IBM in 2013.
Software as a service
A cloud computing layer that provides functionality, data access, and GUI interfaces entirely through a network application, like a web browser. All software-related maintenance, like installation and updates are handled behind the scenes by the vendor.
Solid state drive
Storage media that records data in integrated circuits. Compared to hard disk drives, they are faster, quieter, and have no moving parts.
SQL
See Structured Query Language.
SSAE 16
Short for Statement on Standards for Attestation Engagements 16. An auditing standard used to produce SOC 1 audit reports.
SSD
See Solid state drive
SSFS
See Secure store in the file system
SSL
Short for Secure socket layer. This can either refer to the web encryption method replaced by TLS or to both SSL and TLS as web encryption methods.
SSO
See Single sign-on.
Structured Query Language
A database query language that is the standard for relational databases. Commonly known as SQL (pronounced "sequel").
Symmetric keys
Indicates that encryption and decryption use the same key.
T-shirt sizing
An SAP VM sizing method that bases resources on standard sets of specifications.
TCP/IP
Short for Transmission Control Protocol/Internet Protocol. The primary point to point transmission protocol used in Internet traffic. It can also refer to a networking model that uses a simpler set of layers than the OSI model.
TLS
Short for Transport Layer Security. This web encryption method replaced SSL as the primary secure Internet transport protocol
Tokenization
A data sanitization method that replaces any sensitive information with a token string that can be used to retrieve the information from a central database
Two Factor Authentication
An authentication method that relies on two factors: something the user knows, usually a password, and something the user has, which could be a specific mobile phone, biometrics, or identification card.
Uptime
The amount of time that a server is online and available for use
Virtual Local Area Networks
A networking domain created by isolating a group of virtual machines using firewalls and other network controls. These segregated virtual machines broadcast frames to each other and see each other on the network as if they existed on the same LAN
Virtual Machine (VM)
A virtualized computer running entirely on a hypervisor
Virtual private networks
An encrypted communication method that allows private networks to extend over public communication lines
Virtualization
A computing process that puts a software or hardware intermediary, the hypervisor, between a guest operating system and the computer hardware to prevent direct hardware access and to allow multiple virtual machines to share the same hardware
VLAN
A virtualized local-area network. This helps isolate virtual machines from each other while running in the same cloud infrastructure
VMM
A virtual machine manager. Used interchangeably with hypervisor
VPN
See Virtual private networks
Vulnerability scanning
Services and software that examine your applications and/or your platforms to find and classify known vulnerabilities
Wear leveling
The process solid state drives use to move data around their storage areas in order to cause even wear across all physical storage locations
Wide-area network
A computing network where not all of the computers are located within close proximity.
Wrapping encryption
Encryption protections on the keys used for other encryption methods
XML
Short for eXtensible Markup Language. A tag-based language designed to classify data in a machine and human readable format
Zero-day exploit
An exploitable vulnerability that has been recently discovered and has not been patched or mitigated