Everything You Need to Know About Oracle Exadata Backup and Recovery: Best Practices
Andrew Babb, Consulting Member of Technical Staff, Oracle
Donna Cooksey, Principal Product Manager, Oracle
Harpreet Singh, Vice President, Database Management, Fidelity Investments
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
Program Agenda
• Evolving IT Infrastructure
• Recovery, Recovery, Recovery
• Architecting Your Backup Infrastructure
• Customer Case Study – Fidelity Investments
• New Modern Cloud Paradigm
• Summary and Q & A
Evolution of Data Protection
Business Requirements Meeting IT Head-on
• IT consumers are increasingly involved in technology decisions
  – The flexible, fast-moving opportunities of the “3rd Platform” translate to more IT initiatives being driven by Lines of Business (LOB)
  – Applications, storage, servers … even data protection?
• Technology in stealth mode makes sound data protection even more important!
Greater Complexity Causing More Data Center Downtime: http://www.datacenterdynamics.com/focus/archive/2012/09/greater-complexity-causing-more-data-center-downtime-0
Critical Databases Get Poor Protection Today

What Business Wants                  What Business Gets
Never lose business data             Data loss on restore, typically a full day
Keep critical apps available         End-user slowdown during backup

What IT Wants                        What IT Gets
Private and public cloud solution    Sprawl of non-scalable solutions
Ensured end-to-end protection        Uncertain protection, poor visibility
Primary Causes of Downtime
2012 IOUG Survey – Enterprise Data and The Cost of Downtime*
[Chart: causes of unplanned downtime]
• Recovery plan / Training / Oversight
• Human Error
• Storage Failure
• Interoperability / Scalability / Performance
• Server Failure
• Failover / Fallback capabilities
• Network Outages
• Application Errors
• System Monitoring
*http://www.oracle.com/us/products/database/2012-ioug-db-survey-1695554.pdf
Bad News Travels Faster Than Good
What is The Cost of Downtime?
NASDAQ HALTS TRADING FOR THREE HOURS: http://www.businessinsider.com/nasdaq-options-market-halted-2013-8
Recovery, Recovery, Recovery
What are Your Recovery Requirements?
Four Key Points to Define
• Recovery Point Objective (RPO)
• Retention Period
• Recovery Time Objective (RTO)
• Disaster Recovery (onsite/offsite)
Group Databases Into Protection Tiers
Basic Grouping Strategy - Example

Category                            Gold                      Silver                        Bronze
RTO                                 Seconds                   < 6 hours                     Up to 24 hours
RPO                                 Current                   Up to 3 hours*                Up to 6 hours*
Critical Restores                   Up to one week            One day                       One day norm / not critical
Retention                           7 years                   6 months                      1 month
DR / Long-term Backup Retention     Two sites for one week    Offsite copy within 3 days    No specific DR requirement

*Stay tuned to the new paradigm.
Exadata Environments
Common Restore Scenarios / Planning
The criticality and workloads of typical Exadata databases make recovery strategies especially important:
  – Batch load / NOLOGGING operation went south
  – Long-term, periodic archival backups (keep forever / until)
  – Application patches and upgrades
  – Backing out a bad transaction
Oracle Recovery Strategies
Complementary and Integrated Technologies

Category                    Technology / Solution                                   Recovery Time Objective (RTO)    Recovery Point Objective (RPO)
Physical Data Protection    Recovery Manager (RMAN), Oracle Secure Backup (OSB)     Days/Hours                       As of last backup
Physical Data Protection    Data Guard or Active Data Guard                         Minutes/Seconds                  Current
Logical Data Protection     Flashback Technologies                                  Hours/Minutes                    Minutes
Recovery Analysis           Data Recovery Advisor (DRA)                             Optimized                        Optimized

• Data Recovery Advisor minimizes time for problem identification & recovery planning
Oracle Logical Data Protection Technologies
Complements Physical Data Protection Strategy
• Flashback Technologies are a suite of logical error investigation and correction capabilities built into the Oracle database:
  – Error investigation: Flashback query, version query and transaction query
  – Error correction: Flashback database, table, drop and transaction
• Flashback Database operates on physical data blocks and is similar in effect to point-in-time recovery; the other Flashback features operate at the logical level
  – It is the only Flashback feature that must be explicitly enabled by the user, because it generates its own logs
• In applicable scenarios, Flashback features are more efficient than media recovery
Flashback Technologies should be part of ALL recovery plans!
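Since Flashback Database has to be switched on before it can help, a minimal sketch of enabling it from an RMAN session might look like the following. It assumes a Fast Recovery Area is already configured, the database runs in ARCHIVELOG mode, and the release allows Flashback to be enabled while the database is open (otherwise the same statements would be run in MOUNT state). The 2880-minute target is only an illustrative value equal to 48 hours, and it is a target rather than a guarantee.

```rman
# Assumption: the Fast Recovery Area is configured and sized to hold flashback logs.
# The retention target is expressed in minutes: 2880 minutes = 48 hours (illustrative).
SQL "ALTER SYSTEM SET DB_FLASHBACK_RETENTION_TARGET=2880 SCOPE=BOTH SID=''*''";
# Start generating flashback logs.
SQL "ALTER DATABASE FLASHBACK ON";
```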
Restore Points
What They Are and Why Use Them
• A restore point is a user-defined name assigned to an SCN or specific point in time – a user-friendly “bookmark”
    FLASHBACK DATABASE TO RESTORE POINT 'before_upgrade';
• There are two types of restore points – Normal and Guaranteed
  – Guaranteed restore points must be explicitly deleted by the user
  – Normal restore points age out of the control file
  – For archival backups, use the PRESERVE keyword to retain the restore point until backup expiration
• User-defined restore point names may be used as aliases for an SCN with the following supported commands:
  – RECOVER DATABASE and FLASHBACK DATABASE commands in RMAN
  – FLASHBACK TABLE in SQL
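The restore point life cycle described above could be sketched in an RMAN session roughly as follows. The name before_upgrade is taken from the slide's example; everything else is an assumption for illustration: a single instance is shown (on a RAC database the other instances would need to be shut down before the flashback), and the Fast Recovery Area is assumed to have room for the flashback logs the guarantee retains.

```rman
# Create a guaranteed restore point before the upgrade or deployment.
SQL "CREATE RESTORE POINT before_upgrade GUARANTEE FLASHBACK DATABASE";
LIST RESTORE POINT ALL;

# If the change has to be backed out, the database must be mounted, not open.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
FLASHBACK DATABASE TO RESTORE POINT 'before_upgrade';
ALTER DATABASE OPEN RESETLOGS;

# Guaranteed restore points never age out, so drop it once it is no longer needed.
SQL "DROP RESTORE POINT before_upgrade";
```

Dropping the restore point matters because a forgotten guaranteed restore point keeps flashback logs indefinitely and can eventually fill the recovery area.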
Flashback Database vs. Point-in-Time Recovery
Different Approaches and Multiple Use Cases

Flashback Database – rewinds the database to SCN
Advantages
• Significantly faster than point-in-time recovery - no restore and only limited redo needed
• Useful during database upgrades and application deployments, and an efficient alternative to rebuilding a failed primary database after a Data Guard failover
• Provides continuous data protection
• Compatible with restore points
Disadvantages
• Requires Flashback logs and associated storage
• Works at whole database level only
• Flashback logging has some (minimal) overhead on database server

Traditional Point-in-Time Recovery – restores then recovers the database to SCN
Advantages
• Works at the database or tablespace level
• No additional logs necessary beyond redo
• Compatible with restore points
Disadvantages
• Time consuming, especially for larger databases
• Database is down until fully recovered
Data Recovery Advisor (DRA)
Reduces Downtime by Eliminating Confusion!
• Oracle Database tool that automatically diagnoses data failures, presents repair options, and executes repairs at the user's request
• Determines failures based on symptoms
  – Failure information is recorded in the Automatic Diagnostic Repository (ADR)
  – Flags problems before the user discovers them, via automated health monitoring
• Intelligently determines recovery strategies
  – Aggregates failures for efficient recovery, presents only feasible recovery options and indicates any data loss for each option
• Can automatically perform selected recovery steps
• Accessed via RMAN or EM
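From the RMAN side, the diagnose, advise, and repair flow described above maps onto a short command sequence. This is a sketch rather than a prescribed procedure; some releases document the advisor as supporting single-instance databases only, so its availability on a given Exadata (RAC) configuration is an assumption to verify against the documentation for your version.

```rman
# Show failures currently recorded in the Automatic Diagnostic Repository.
LIST FAILURE;
# Ask the advisor for manual and automated repair options (same session as the repair).
ADVISE FAILURE;
# Display the generated repair script without running it.
REPAIR FAILURE PREVIEW;
# Run the repair only after reviewing the preview.
REPAIR FAILURE;
```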
How Good is Your Backup Infrastructure?
You Never Know – Unless You Periodically Test It!
1. Document a recovery plan for database and object level recovery
2. Perform periodic recovery tests for various recovery scenarios:
   1. Full database
   2. Objects
   3. Control file
3. Refresh test environments with RMAN
4. If hardware isn’t available to perform full database recovery tests, use RMAN RESTORE VALIDATE
Job Security Tip # 1 – Successful recovery is all that matters!
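Where no spare hardware exists, the RESTORE VALIDATE check in step 4 could look like the sketch below. These commands read the backup pieces and report whether a restore would succeed, without writing any datafiles; they do not replace a real recovery test, and the archived log check assumes backups exist for every log still known to the repository.

```rman
# Read the backups needed to restore the whole database; no datafiles are written.
RESTORE DATABASE VALIDATE;
# Check that the control file can be restored from backup.
RESTORE CONTROLFILE VALIDATE;
# Check the archived log backups that recovery would need.
RESTORE ARCHIVELOG ALL VALIDATE;
```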
Architecting Your Backup Infrastructure
RMAN Traditional Backup Strategies

Full Backup
• Two types of RMAN full backups:
  – Image copy – Disk only; same size as the database less temp files
  – Backupset – Disk or tape; smaller than an image copy full; can be compressed and/or encrypted by RMAN
• A full backup consumes more overhead on the production server and takes more time than an incremental backup
• Restoration may be faster than an incremental

Full / Incremental Schedule
• Backupset backups – Disk or tape
• Typical schedule – Weekly full with daily incremental backups
• Typical retention:
  – Days to weeks – On disk
  – Weeks to years – On tape
  – Full and corresponding incremental backups should be treated as a group
• Reduces backup window and overhead on servers
• Ideal with low-medium change rate, e.g. <20%
• Database must be in archived log mode
RMAN Incremental Forever Strategy
Incrementally Updated Backups
• Oracle Database 10g Release 2 Enterprise Edition and later
• Incremental forever after initial full image copy
• Full image copy is rolled forward on a user-defined schedule
  – Roll-forward / merge does incur overhead on server
  – Offers SWITCH TO COPY capability
• Typical retention – One to seven days
• Backup full or incremental to tape
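The incrementally updated strategy is usually expressed as a two-command RMAN job run on each cycle. The sketch below assumes a hypothetical tag, incr_update, and default disk channels writing to the Fast Recovery Area; the roll-forward lag (how many days the copy trails the database) is left at its simplest setting.

```rman
RUN {
  # Roll the existing image copy forward using the previous level 1 backup.
  # On the first run there is no copy yet, so this step does nothing.
  RECOVER COPY OF DATABASE WITH TAG 'incr_update';
  # Take a new level 1 backup; on the first run RMAN creates the full image copy instead.
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'incr_update' DATABASE;
}
```

Because the image copy stays close to current, a damaged datafile can be pointed at its copy with SWITCH ... TO COPY and then recovered, instead of being restored first.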
Processing Offloaded From Database Nodes
Incremental Backup Scans Occur on Exadata Storage Cells
• Block Change Tracking (BCT) enables fast incremental backups
  – RMAN tracks 32k data file sections which include changed block(s)
  – During an incremental backup, RMAN scans these 32k file sections to determine which block(s) have changed
• Only these changed blocks are included in the incremental backup
• Database Server: scan of blocks occurs on the database server
• Exadata: scan of blocks is offloaded to the Exadata Storage Cells
Note: Incremental backup without Block Change Tracking (BCT) enabled – all database blocks are scanned to determine what has changed
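Enabling Block Change Tracking is a one-time step; a minimal sketch from an RMAN session follows. It assumes Oracle Managed Files is in use so that no tracking file name has to be supplied (otherwise a USING FILE clause with a site-specific location would be needed). Tracking typically starts to pay off for incrementals taken after the next level 0 backup, and its status can be checked in V$BLOCK_CHANGE_TRACKING.

```rman
# Assumption: Oracle Managed Files is configured, so no tracking file name is given.
SQL "ALTER DATABASE ENABLE BLOCK CHANGE TRACKING";
# Subsequent level 1 backups consult the tracking file automatically.
BACKUP INCREMENTAL LEVEL 1 DATABASE;
```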
Backup of Compressed Data
Effects on Sizing and Processes
• Compressed data remains compressed in the backup, for example:
  – HCC data
  – OLTP compressed tables
  – SecureFiles compressed/deduplicated
• This data will not benefit from further compression during the backup (e.g. RMAN backup or tape drive compression)
• Deduplication software cannot deduplicate compressed data
• RMAN backup compression is effective on non-compressed database files
• Avoid using RMAN backup compression on HCC tablespaces by separating the backups as shown below; restore is no different than if the backups had not been separated
    # Leave the HCC tablespace out of whole-database backups
    CONFIGURE EXCLUDE FOR TABLESPACE historical_data;
    CONFIGURE COMPRESSION ALGORITHM 'LOW';
    # Back up the HCC tablespace explicitly, without RMAN compression
    BACKUP TABLESPACE historical_data;
    # Back up the rest of the database with RMAN compression
    BACKUP AS COMPRESSED BACKUPSET DATABASE;
Protecting Exadata Operating System Files
• On the Exadata Storage Cells, the internal USB stick provides the backup
• On the Exadata database nodes, back up the operating system (OS) files in the same manner as with any other database server
Please refer to the documentation for more information: http://wd0338.oracle.com/archive/cd_ns/E13877_01/doc/doc.112/e13874/maintenance.htm#CHDIDGAI
Exadata Backup Targets
Considerations - Performance and Cost Trade-offs

Highest Performance
• Exadata Storage
  – 20 – 25 TB / hour
  – All Exadata smart features
• Exadata Storage Expansion Rack
  – 27 TB / hour
  – Fastest Backup and Restore
  – ILM Historical Archive
  – Second DATA2 Disk Group

High Performance and Added Flexibility
• ZFS Storage Appliance (ZFSSA)
  – 13 TB / hour
  – Backups of database & non-database files
  – Snapshots
  – Clones
• StorageTek Tape Library
  – 9 TB / hour*
  – Backup of database and non-database files
  – Offsite Backups
  – Vaulting

Cost – Varies with hardware configuration
*Note: Backup rate limited by number of tape drives – 8 x T10000C Drives
Oracle-Integrated Backup to Disk and/or Tape
Multi-media Strategy: Disk-to-Disk-to-Tape (D2D2T)
[Diagram: Exadata with Fast Recovery Area, RMAN disk backup on ZFS Storage Appliance (ZFSSA), and StorageTek Tape Library]
• Backup to tape: BACKUP RECOVERY AREA; or BACKUP BACKUPSET;
• Fast Recovery Area should reside on Exadata storage – slower storage could degrade production database performance
• Fast Recovery Area contents: online redo, archived logs, Flashback logs, controlfile
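The tape leg of the D2D2T flow could be scripted along these lines. The sketch assumes a media manager (for example Oracle Secure Backup) is already configured for the SBT interface; the channel name t1 is hypothetical and vendor-specific PARMS settings are omitted.

```rman
RUN {
  # Assumption: the media management layer is installed and reachable from this node.
  ALLOCATE CHANNEL t1 DEVICE TYPE sbt;
  # Copy Fast Recovery Area contents (backups, archived logs) that are not yet on tape.
  BACKUP RECOVERY AREA;
  # Alternatively, copy disk backup sets created elsewhere (for example on the ZFSSA) to tape.
  BACKUP BACKUPSET ALL;
}
```

Either command moves already-created disk backups to tape, so the production datafiles are not read a second time.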
Expanding Exadata Environments
Connectivity Considerations
[Diagram: Exadata systems, each with its own FRA, connected to an RMAN disk backup target on a ZFS Storage Appliance (ZFSSA) over InfiniBand and 10Gigabit Ethernet]
• What happens when a 2nd Exadata is added? The two Exadatas MUST be configured with different InfiniBand subnets.
• What about a 3rd Exadata? The 3rd Exadata would be connected via 10Gigabit Ethernet.
Refer to the MAA white paper: http://www.oracle.com/technetwork/database/features/availability/maa-wp-dbm-zfs-backup-1593252.pdf
Customer Case Study – Fidelity Investments
Oracle OpenWorld
Exadata Backups
Harpreet Singh, Vice President, Database Management, Fidelity Investments
September 24, 2013
Transition To Exadata – A Huge Success!

Challenges with traditional infrastructure
• 300TB of storage with over 60% annual growth rate
• Performance challenges
• Cost reduction pressures
• Need to make failover/recovery more robust

Benefits gained with Exadata
• 42x performance gains for reporting & 40% for OLTP
• Reduced storage by 30% using compression
• Consolidated physical servers from 10 to 4
• Reduced direct/indirect chargebacks by 30%
• Significantly improved failover, backup & recovery strategy
Exadata Architecture
Pre-Exadata Backup Challenges
• Over 60% annual data growth rate
• Business needs growing and becoming more complex
• Backups hurting database performance
• Expensive software/hardware licenses
• Complicated recovery with “no-logging”
• Costly to keep backups on disk
• Concerns around non-logical DR software
Fundamental Data Protection Strategy
• 1st Line of Defense – Flashback: 48 hours
  – Data deletion
  – Logical corruption
  – User errors
• 2nd Line of Defense – Disk Backup: 24 hours
  – Application
  – System
• 3rd Line of Defense – Standby Database (DR)
  – Building/site, region
  – HW failure
• Last Line of Defense – Tape: 35 days
  – Offsite
  – Multi-site failures
Flashback
• Oracle Flashback Database
• Primary and Standby Sites
Retention Period: 48 Hours
Restore Time: < 1 Hour
Space Used: 300GB
Pros
• Faster recovery
• Data recovery from tables, schema, or entire database
• Roll database back and forth repeatedly within the flashback window for complex data restore
Cons
• Same location as production – no protection from storage failure
• No protection from physical corruption
Disk Backup
• Exadata Fast Recovery Area
• Incrementally Updated
Retention Period: 24 Hours
Backup Rate: 1.2 TB/hour
Restore Rate: 1 TB/hour
Type: RMAN Online Daily Normal Redundancy
Pros
• Protect against physical/logical database corruption
• Faster backup and restore
• Minimal overhead to the production database
Cons
• Shorter protection window (24 hours)
• Same location as production, so no protection in a disaster or catastrophic storage failure
Standby Database
• Data Guard
• Asynchronous
• No Delay Apply
• 48 Hour Flashback Database setup
• 700 miles between Primary and Standby sites
Pros
• Great for any data recovery when combined with Flashback Database
• Complete data protection if primary site is lost
• Protection from physical corruption
• Can be turned into a snapshot standby database temporarily and used for QA/Dev database refreshes through RMAN
Cons
• Resources (another set of servers/storage)
Tape Backup
Retention Period: 35 Days (Offsite)
Channels: 2-4
Nodes: 1
Backup Rate: 1TB/hour (2 channels)
Restore Rate: 800GB/hour (2 channels)
RTO: 3 Days
Type: RMAN, CommVault
Archived Redo Logs Retention: 3 Days on disk
Archived Redo Logs Backup: Every 30 minutes
Pros
• Longer term offsite retention than disk and standby
• Media is relatively cheap
Cons
• Slower backup and restore than disk
• Media is less reliable
Planning a Comprehensive Backup Strategy
• Determine disk backup strategy
  – Implement Oracle suggested RMAN backup strategy as it is great protection against data loss
• Develop tape backup process
  – Consider full backups once a week with daily incremental
• Test different restore processes
  – At least annually
• Consolidate tape backup system
  – Should be centrally managed
Implementation Recommendations

Optimal performance
• Configure Exadata backup over InfiniBand for better throughput
• Configure number of channels based on database size and SLAs
• Use one RMAN channel per tape drive for better throughput
• Enable block change tracking for fast RMAN incremental backups

Data protection and disaster recovery
• Back up archived logs every 30 minutes for better data protection
• Encrypt the data before writing to tape for data security
• Set up Flashback on both primary and standby databases
• Utilize Data Guard broker

Monitoring
• Use Oracle Enterprise Manager to monitor:
  – Disk backup
  – Tape backup
  – Data Guard
  – Flashback
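Several of the recommendations above translate into persistent RMAN settings plus a small recurring job. The sketch below is illustrative only: it assumes two tape drives (hence two SBT channels), an open keystore for transparent-mode encryption, and an external scheduler that launches the archived log backup every 30 minutes. Whether RMAN encryption to tape is available can depend on licensing and on the media manager in use.

```rman
# Assumption: two tape drives are available, so two SBT channels are used (one per drive).
CONFIGURE DEVICE TYPE sbt PARALLELISM 2;
# Assumption: the keystore (wallet) is open so that RMAN can encrypt backups transparently.
CONFIGURE ENCRYPTION FOR DATABASE ON;
# Run from a scheduler every 30 minutes: back up only archived logs with no tape copy yet.
BACKUP DEVICE TYPE sbt ARCHIVELOG ALL NOT BACKED UP 1 TIMES;
```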
Summary
• Have clear and well communicated recovery SLAs
• Build your strategy around the business needs
• Revisit a well-documented, multi-level strategy periodically
• Be conservative and prepare for the worst
• Test
• Practice
The New Modern Cloud Paradigm
Oracle Database Backup Logging Recovery Appliance
Announced at Oracle OpenWorld 2013
Please refer to Oracle.com for additional information: http://www.oracle.com/us/corporate/features/database-backup-loggingrecovery-appliance/index.html
Summary and Q&A
Oracle Technologies Mitigate Downtime
Complexities Are Inherent in IT – Know IT and PLAN for IT!
• RMAN – Validated, reliable backup you know can be recovered
• Flashback Technologies – Quickly review and/or correct user errors
• Active Data Guard – Failover, fallback and/or disaster recovery
• Oracle Secure Backup – Policy-based data protection management
• Enterprise Manager – System monitoring
• Oracle Engineered Solutions eliminate interoperability, patching and upgrade risks
Key Takeaways
Exadata Backup and Recovery
• RMAN backup / recovery on Exadata is the same as other platforms – just faster!
• Oracle data protection technologies meet diverse RTO / RPO and budget requirements
• Database consolidation and data protection are ideally suited to the Exadata platform
Who Better to Back Up Oracle Than Oracle?
Resources
• OTN HA Portal: http://www.oracle.com/goto/availability
• Maximum Availability Architecture (MAA): http://www.oracle.com/goto/maa
• MAA Blogs: http://blogs.oracle.com/maa
• Exadata on OTN: http://www.oracle.com/technetwork/database/exadata/index.html
• Oracle HA Customer Success Stories on OTN: http://www.oracle.com/technetwork/database/features/ha-casestudies098033.html