DMM103 New and Best Practices for for Data Data Modeling with SAP HANA –
Christoph Morgen / SAP HANA Product Management, SAP SE
Disclaimer This presentation outlines our general our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any or any other agreement with SAP. SAP has no obligation to pursue any course of business of business outlined in this presentation or to or to develop or release or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any for any reason without notice. This document is provided without a warranty of any of any kind, either express or implied, or implied, including but not limited to, the implied warranties of merchantability, of merchantability, fitness for a for a particular purpose, particular purpose, or non-infringement. or non-infringement. SAP assumes no responsibility for errors for errors or omissions or omissions in this document, except if such if such damages were caused by SAP intentionally or grossly or grossly negligent.
Agenda Data
Modeling in SAP HANA New Approaches New Approaches Best Practices Future Look
Data Modeling with SAP HANA Overview SAP HANA Information Views
SAP HANA Information Views Virtual Data Modeling approach as a key SAP HANA concept
Virtual Data Flow Models
Operational Reporting | Applications | Analytics
SAP HANA PLATFORM HANA Views •
Physical Tables Database Layer
• • •
No Aggregatio No Aggregations ns | single atomic copy of detail of detail ata In-Memory Engines | Performance Multidimensional Reporting Models Enterprise Application Enterprise Application Virtualize Data Models
SAP HANA Information Views Virtual Data Models for Multidimensional Scenarios Reporting Tools can usually directly consume HANA Calculation Views or Analytic Views. Multidimensional Tools support Hierarchies support Hierarchies for Navigation, Filtering and Aggregation and HANA Prompts (Variables & Input Parameters) for efficient Pre-Filtering of Data.
Calculation View
Calculation Views are usually build upon Analytic-, Attribute Analytic-, AttributeViews, and Column Tables
Analytical View
Attribute View
Column Table
SAP HANA Information Views Virtual Data Models for Normalized Data Model Scenarios
SAP HANA Calculation Views provide the means to model sophisticate views based on normalized data structures. Complex Calculation Views demand a more explicit intent and control of the modeled set-based data flow, i.e. slicing, aggregation and filtering of sets as input to joins, unions etc.
See SAP Note 1857202
SAP HANA Calculation Views typically feed data to Business Applications, like SAP HANA XS build Applications
SAP HANA Information Views Flavors of View Modeling Approaches
Attribute Views Analytic Views Calculation View • Compose a dimensional view • Combines Fact-Tables with • Great flexibility for advanced for advanced use Attribute-Views to Star-Schema- • Approach to model custom with a series of attributes of attributes derived Attribute-Views from a collection of tables of tables or OLAP or OLAP Cube-like objects for scenarios like e.g. Master Data Views multidimensional reporting. Combined use of Multiple-Fact of Multiple-Fact Highly re-used and shared in • Stores no aggregates and mass- Tables/Analytics Views Analytic- and Calculation Views aggregates on the fly Build Models on Normalized Data Used to build Hierarchies • Hierarchies are key for multifor multi Re-Use and stack views Hierarchies are key elements in dimensional access (navigation, Make use of custom of custom scripted views use with Analytic with Analytic View for multifor multifiltering, slicing and aggregation)
SAP HANA Attribute Views Modeling Attribute Information in Data Models What is an Attribute View? • ... are the reusable dimensions objects adding context to data in the analysis or data or data flow.
Semantic Attribute Semantic Attribute Information
• Can be regarded as Master Data-Views
Build a semantic attribute information collection from various source tables (e.g. join ‘Plant’ to ‘Material’) Measures cannot be modeled Re-used as dimensions in multidimensional scenarios (Analytic Views) or re-used to model complex master in master in normalized data model approach and master data master data reporting scenarios Cannot be directly consumed by multidimensional reporting clients
Attribute Data Foundation
SAP HANA Attribute Views Modeling Attribute Information in Data Models What are the capabilities of of Attribute Attribute Views? • Attribute Views support Foundation join model (various joins (various joins types) Data Foundation join Calculated Attributes (static or dynamic or dynamic calculations) Calculated Attributes master data tables Description Mapping & Text-Join Lookup master data e.g. used for handling for handling of multi-language of multi-language master data master data Dimension Attribute Views Time Dimension Attribute
Sematic-Type specification Hierarchies (Level, Parent-Child), hierarchy-use behavior filter values on attributes & columns Define filter values incl. use of Input of Input Parameters Embedded search properties
SAP HANA Analytic Views Modeling Facts and Dimensions as a Multidimensional Data Model What is an Analytic View? • Can be regarded as Cube-/OLAP Star Schema-like data model Without storing aggregated data consumed using multi-dimensional clients or
1
re-used in complex data flows
Fact Table
N
• Fact data from the ‘Data Foundation’ is joined is joined
against modeled Dimensions (Attribute Views) fact table contain the key figures
1
Left Outer N
1
N N
‘Measures’
Dimensions describe the key figures and enrich the data star schemas is generally N:1 fact to dimension, Left Outer Joins Outer Joins Cardinality in star schemas complex join operations OLAP models are not designed to handle complex join
1
Analytical View
HANA Analytical Views are highly optimized for for aggregating aggregating mass data
N
1 © SAP AG SAP AG 2011
11
SAP HANA Analytic Views Modeling Facts and Dimensions as a Multidimensional Data Model What are the capabilities of of Analytic Analytic Views? • Data Foundation Model for Star for Star Schema Fact data Measure facts derive from only one data foundation table • Measure definition from Data Foundation Facts Distinct Count Measures, Calculated & Restricted Measures, Currency- and Unit Conversion-Measures Default Aggregation (sum, max, count, ...) Default Aggregation • Re-use of slowly of slowly changing dimension data Temporal join btw Facts and dimension data Temporal join • Variables and Input Parameters Dynamic filtering and parameter-driven calculations UI-prompts in use with multidimensional reporting clients
SAP HANA Calculation Views Modeling custom Data Flow Scenarios What is a Calculation View? • Calculation Views are composite views and can be used to combine other views Analytical-, Attribute-, Attribute-, other Calculation other Calculation Views & tables Can consume other Analytical-, • Approach to model custom scenarios like of Multiple-Fact Tables/Analytics Views Combined use of Multiple-Fact Build Models on Normalized Data Re-Use and stack views of custom scripted views Make use of custom • Great flexibility for advanced for advanced use • Modeling Approaches: Approaches: Graphical Modeler or
SQLScript-based Editor
Calculation View
Application UI
SAP HANA Calculation Views Modeling custom Data Flow Scenarios What are the capabilities of of graphical graphical modeled Calculation Views? • Graphical Calculation Views Views as client-consumable multidimensional query objects or
Calculation View
internal re-use objects or views or views without measures Builds a composite view with facts from multiple sources Embeds Analytical-, Attribute-, Attribute-, Calculation Views & tables Embeds Analytical-, for Union, Join, Projection & Aggregation Data operation nodes for Union, (No SQL or SQL or SQL Script knowledge required) Great flexibility to build complex Virtual Data Models* and unique capabilities like dynamic joins, dynamic joins, …. • Calculation View-data flows
Will get optimized and exploit underlying engines (i.e. OLAP Engine) where possible, prune data flow, push-down filters … *SAP HANA Live
Attribute View
Analytical Analytical View
Calculation View
Column Table
SAP HANA Calculation Views Scripting custom Data Flow Scenarios SQL Scripts-based Calculation Views • Use of custom of custom SQLScript coding to build Calculation Views or standard SQL-statements (do not mix) Utilize SQLScript CE-functions or standard Side-effect free structures / READ-ONLY functions Consume data from raw tables, modeled views, stored procedures, decision tables, … Semantic output structure is graphically modeled • Graphical modeled View versus CE Functions
Result in equal performance (e.g. field pruning, parallelization, join omission, ….)
Scripted
New Approaches SAP HANA Information Models
New approaches for for SAP SAP HANA Data Modeling
Value Help Views
–
Variable and Input Parameter
Referencing data structures for value help definitions
Variable and Input Parameter support Parameter support “external” views or tables or tables to generate value help lists-of-values – Supported with Analyticwith Analytic- and Calculation Views
View Variables and Input Parameter can Parameter can be mapped to variables and input parameters from external views or tables or tables – Allows filtering and customizing value help LOVs from external views
e.g. by passing to values of initially of initially selected variables, a LOV for a for a dependent variable is generated as a dynamically filtered LOV
Out-source Value Help Information to dedicated views
Benefit from faster faster value value help dialogs
Provide consistent LOVs across consuming views
*was introduced with a revision previous to SPS07
New approaches for for SAP SAP HANA Data Modeling
Modeling Dimension-Calculation Views Dimension-Type Calculation View
Similar capabilities Similar capabilities to Attribute to Attribute Views – cannot be directly consumed by multidimensional reporting clients
Composing Dimension-Attributes and -Hierarchies based on a custom CalcView-data flow Hierarchies supporting dynamic input structures – Attribute columns used within hierarchies can ne input parameter-driven – Hierarchy properties like Root Node (e.g. Parent-Child Hierarchies) can be
input parameter-driven
Usage Scenario
Dynamic hierarchy structures and properties are required Star-Join Calculation Views (details see following slides) – Dynamically mapped attributes
New Practices for for SAP SAP HANA Data Modeling
Modeling Star Star Join-Calculation Join-Calculation Views Calculation Views with Star Join Star Join capabilities
Joining fact input data flows with multiple dimension views in a single node – Fact Input flow can be any custom calculation flow of nodes of nodes – Other, only Calculation Views of Data of Data Category-Dimension are allowed as input
Dimension Properties – All DIM-View Attributes DIM-View Attributes are automatically part of the of the StarJoin (incl. Hierarchies) – DIM-Views are added as shared (referenced) dimensions, changes to dimension
views are immediately available – Local hierarchies can also be defined
Measures, like Counters use the dimension reference, hence can reference to attributes hidden from the output
UseCase
Make use of of special special Calculation View Dimension capabilities (e.g. parameterized Hierarchies) or or multiple multiple fact-table input
Star Join-Views currently cannot regarded a replacement for Analytic Analytic View Important Note: Star Join-Views capability, especially as they have not been yet fully optimized for aggregation for aggregation performance. Further, Star Join-Views Star Join-Views cannot be consumed by other Calculation other Calculation Views.
New approaches for for SAP SAP HANA Data Modeling
Modeling with Virtual Tables Consuming remote data in SAP HANA Views
Smart Data Access Data Access allows – To access remote data like “local” data – Specific HANA query optimization and execution handles functional SQL compensation, automatic data type translations, filter push-down, filter push-down, … in
order to order to push query processing to the underlying data source system
To enable consumption – Remote data structures are registered and referenced as virtual tables – Virtual tables can also be consumed as data sources within HANA
Calculation Views.
Supported external systems and restrictions are documented in SAP note 1868209
New Practices for for SAP SAP HANA Data Modeling
Introducing performance analysis capabilities Performance Analysis Mode in Modeling Environment
Introduction of performance of performance analysis hints and indicators inside the HANA Model Editor or defaulted switched on – Manually switched on or defaulted – Hints and indicators about table partitioning and number of number of rows rows (threshold as preference)
Scenario indicators for partitioned tables (icon) and exceeded row thresholds
Switching on performance analysis mode
View details pane: indication about partitioning type by icon (hash, range, …) Performance analysis: more partitioning and row information.
New Practices for for SAP SAP HANA Data Modeling
for detailed detailed analysis Enhanced Plan Visualization for SQL Editor > Editor > Context Menu
Explain Plan
Time Line View – root cause analysis New Operator List Operator List View Summarized & Filter Visualize Plan
New Practices for for SAP SAP HANA Data Modeling
Modeling Productivity
–
Error Handling/HANA Error Handling/HANA Answers Integration
Extended Error Handling Error Handling with SAP HANA Answers (http://answers.saphana.com/)
In extension to documentation and help, the SAP HANA Answers-plugin HANA Answers-plugin to SAP HANA Studio enables crawling external sources of information of information e.g. adds information from SCN and others Displays crowed-sourced information – embedded in HANA Studio or outside or outside – Integrated with HANA Studio views (job log, …),
editors, wizards. Called via key from selected text or feature. or feature.
Independent feature-plugin to install F10 as your new friend
Best Practices SAP HANA Information Models
SAP HANA Modeling Best Practices
General Modeling Performance Guidelines • Avoid transfer ring transfer ring large result sets between the HANA
Database and client application transfer between views – follow the volcano • Reduce data transfer between approach Procedures
Calculation View
after aggregation & avoid calculations before • Do calculation after aggregation aggregation on line item level
Analytical View
Attribute View
BY, Keep flags, • Aggregate data records (e.g using GROUP BY, reducing columns, ….) Filter data amount as early as possible in the lower layers .. • Filter data
Analytical Privileges
use – Constraint filters
Column Store Tables
– WHERE clause / Parameters – Analytical Privileges
SAP HANA Modeling Best Practices
Modeling Scenario
–
Multidimensional Model or or not?! not?!
Analytical View (including Attribute (including Attribute views)
Calculation View STAR SCHEMA MODEL AGGREGATION AGGREGATION WITH OR WITHOUT WHERE CLAUSES
Multidimensional, Star Schema Star Schema Attribute View (independent)
Combine multiple Analytical multiple Analytical Models using Unions (Similar Dimensions) (Similar Dimensions) Calculation View
MULTIPLE QUERY OJECTS NORMALZED MODEL COMPLEX JOINS WITH WHERE CLAUSES
Complex Joins
Complex Joins including facts
SAP HANA Modeling Best Practices
Things to watch our our for for
… –
Data Movement
Select…Where Matnr Matnr = = DPC1017 Group By Spart
Refrain from moving large datasets between views & to the front-end
Filters pushed down
Aggregation
SELECT SPART, MATNR, WERKS, SUM(KWMENG) FROM A FROM AV V WHERE WERKS = ? AND MATNR = ? GROUP BY…
While re-using Analytic re-using Analytic Views in Calculation Views, carefully select, prune, filter and slice data from a wide and multidimensional detaillevel data structure
SAP HANA Modeling Best Practices
Things to watch our our for for
… –
Implicit Filter Push Down
SELECT WERKS, SUM(KWMENGA), SUM(KWMENGB), SUM(TOTAL) FROM CV WHERE MATNR = DPC1017 GROUP BY WERKS
Review the Visualize Plan to make sure filters are pushed down Filter applied late
Select Werks, Matnr, Sum(KwMeng) From Sales_Av
Filter applied early
Select … From Sales_Av Where Werks = 100 & Matnr Matnr = = DPC1017
Select … From Sales_Av Where Werks = 1000 & Matnr Matnr = = DPC1017
SAP HANA Modeling Best Practices
Things to watch our our for for
Review the Explain Plan! Limit the number of number of records records used by ROW store operators (through Aggregation & Filters)
… –
Column & Row Store operators
SAP HANA Modeling Best Practices
Things to watch our our for for
… –
Scripted Calculation Views
SELECT FIELDA FIELDA,, SUM(COUNT SUM(COUNT)) FROM MODEL GROUP BY FIELDA WHERE FIELDA IN Apple, IN Apple, Orange, Banana
SQL is static and is always executed as defined and intermediate result sets will be materialized FIELDA
COUNT
Apple
3000000
Banana
4000000
Orange
9000000
Filter applied Filter applied & columns pruned late
CE Functions will try to exploit underlying database engines and will push filters down, prune columns and omit joins omit joins where possible FIELDA
COUNT
Apple
3000000
Banana
4000000
Orange
9000000
Filters applied & columns pruned early. Aggregation performed by OLAP engine
SAP HANA Modeling Best Practices
Tips and Tricks
… –
Input Parameters (Explicit filter push down)
SELECT CATEGORY, SUM(YEAR1), SUM(YEAR2) FROM MODEL GROUP BY CATEGORY WITH PARAMETERS ( 'PLACEHOLDER' = ('$$YEAR_1$$', '2011'), 'PLACEHOLDER' = ('$$YEAR_2$$', '2012'))
Define Input Parameters in the
Data Foundation to explicit and compulsory Filter the Filter the data
Constraint Filter $$YEAR_1$$
Constraint Filter $$YEAR_2$$
SAP HANA Modeling Best Practices
Tips and Tricks
… –
Input Parameters (Explicit filter push down)
SELECT CATEGORY, SUM(YEAR1), SUM(YEAR2) FROM MODEL GROUP BY CATEGORY WITH PARAMETERS ( 'PLACEHOLDER' = ('$$YEAR_1$$', '2011'), 'PLACEHOLDER' = ('$$YEAR_2$$', '2012'))
Define Input Parameters in the
Data Foundation to explicit and compulsory Filter the Filter the data
Constraint Filter $$YEAR_1$$
Constraint Filter $$YEAR_2$$
SAP HANA Modeling Best Practices
Tips and Tricks
… –
Union with Constants for for Input Input Source Pruning
SELECT ORDER SALESORG DIVISION AMOUNT QUANTITY GROUP_CONSTANT
Constant Column
FROM VIEW WHERE (GROUP_CONSTANT = A AND ORDER IN 1, 2,3) OR (GROUP_CONSTANT = B AND ORDER IN 6,7,8)
Filtering on Constant Column will prune input sources. (i.e. Model C, D, E, F are NOT executed)
SAP HANA Modeling Best Practices
Tips and Tricks
… –
Union with Constant Values (Pivot Data)
Standard Union
Pivot table using Union with Constant values
SAP HANA Modeling Best Practices
Tips and Tricks
… –
Join Analytic Models with caution. • Use WHERE clauses/filters to minimize the amount of records used in the join. the join. • Slicing / pruning of columns of columns •…
if joining joining Analytics Models is required Variables or Input or Input Parameters for explicit for explicit filtering exposed as UI prompts
Optimized join Optimized join Where clause(s) filters data-set before Join occurs
Variable (Created On On)) pushed down using WHERE clause
Variable (Delivery Date Date)) pushed down using WHERE clauses
SAP HANA Modeling Best Practices
Modeling Scenario
–
Normalized Data Models / Virtual Data Models
Building complex relational Model Scenarios
Identify reproducible pattern
Split big models into smaller parts smaller parts
Do not build monolithic models
of the modeled set• Demands a more explicit intent and control of the based data flow, i.e. slicing, aggregation and filtering of sets of sets as input to joins, to joins, unions etc.
SAP HANA Modeling Best Practices
SAP HANA Live!
–
Virtual Data Models for for SAP SAP Business Suite
Open Interfaces https | oData | SQL | MDX
• Completely build on
SAP HANA Calculation Views SAPdelivered Query Views
Personalized Views
Personalized Views
Personalized Views
Customer Extensions Customer Extensions Reuse Views Private Views Physical Tables
SAP HANA Live (VDM)
Query Views – Top of the of the SAP HANA Live view
hierarchy (VDM) and provide consumable output fields – Query Views are consumed by reports or analytical or analytical applications
Reuse Views Re-use of Real-Time Views
– are for use for use in Query Views, not for
direct consumption by reports or analytical applications – Reuse Views represent the actual data model by exposing or “translating” original SAP Business Suite source tables into Views
Private Views – encapsulate SQL transformations on
single or multiple or multiple data base tables or Reuse views – Private Views are comparable to subroutines
SAP HANA Modeling Best Practices
Modeling Scenario
–
Normalized Data Models / Virtual Data Models
Optimized execution for for Queries Queries in complex relational Model Scenarios For stacked For stacked Calculation View data models, the Calculation Engine can generate an optimized SQL statement at runtime if SQL Engine execution-flag is used
for considerations and constraints • See SAP Note 1857202 for considerations – i.e. Only Tables and Calculation views are supported … etc. – Improved Join ordering
• Note, the optimized execution of SQL of SQL
Engine- flagged models does allow for implicit OLAP Engine push-down in certain scenarios.
Outlook SAP HANA Information Models
Future Directions for for SAP SAP HANA Data Modeling
Planned Innovations and Future Direction Planned Innovations
Editor usability Editor usability Value help entity reference Calculation Views
Future Direction
analysis, join analysis, join cardinality – Logging and tracing for modeler for modeler plugins (preferences)
– Table function as data sources
Script-based CV enhancements – Table function support
History Views support (time travel support) Spatial support: spatial joins, spatial expressions in filters Harmonize Development- and Modeler- Studio perspectives
Productivity / object re-usability
Further unification Further unification of Attribute, Attribute, Analytics, Calculation View into a unified model Consumption framework for Application Function Library Procedures Continuous Usability enhancements
– Replace node / replace node with a
– Output column structure import
– Debugging Views with drill-down
– Rank node
Harmonize object naming Supportability
data source / .. – Propagate semantics from data sources / extract semantics
WebIDE Calculation View editor – Analytic Privilege editor –
Drive easiness, enhance modeler productivity, reduce complexity, separate modeling intent from model optimization
Further Information
SAP Public Web scn.sap.com http://scn.sap.com/community/developer-center/hana www.sap.com www.saphana.com
SAP Education and Certification Opportunities www.sap.com/education
Watch SAP TechEd Online www.sapteched.com/online
Related Workshops/Lectures at SAP TechEd 2014 DMM161 - Introduction to Data Modeling in SAP HANA, Hands-On Workshop DMM270 Advanced Data Modeling in SAP HANA, Hands-On Workshop –
Feedback Please complete your session evaluation for
DMM103.
Thanks for for attending attending this d-code session.
SAP d-code Virtual Hands-on Workshops and SAP d-code Online Continue your your SAP SAP d-code education after after the the event! SAP d-code Virtual Hands-on Workshops Access hands-on workshops post-event Starting January 2015 Complementary with your SAP d-code registration http://sapdcodehandson.sap.com
SAP d-code Online Access replays of keynotes, of keynotes, Demo Jam, SAP d-code live interviews, select lecture sessions, and more! Hands-on replays http://sapdcode.com/online
Thank you Contact information: Christoph Morgen SAP HANA Product Management SAP SE | Dietmar-Hopp-Allee 16 | 69190 Walldorf | Walldorf | Germany
[email protected] | www.sap.com
© 2014 © 2014 SAP SE or or an an SAP affiliate company. All rights reserved. No part of this of this publication may be reproduced or transmitted or transmitted in any form or for or for any any purpose without the express permission of SAP SE or an or an SAP affiliate company. SAP and other SAP other SAP products and services mentioned herein as well as their respective their respective logos are trademarks or registered or registered trademarks of SAP of SAP SE (or an (or an SAP affiliate company) in Germany and other countries. other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx http://global12.sap.com/corporate-en/legal/copyright/index.epxfor for additional additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other of other software software vendors. National product specifications may vary. These materials are provided by SAP SE or an or an SAP affiliate company for informational for informational purposes only, without representation or warranty of any of any kind, and SAP SE or its or its affiliated companies shall not be liable for errors for errors or omissions or omissions with respect to the materials. The only warranties for SAP for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its or its affiliated companies have no obligation to pursue any course of business of business outlined in this document or any or any related presentation, or to or to develop or release any functionality mentioned therein. This document, or any or any related presentation, and SAP SE’s or its or its affiliated companies’ strategy and possible future developments, products, and/or platform and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal or legal obligation to deliver any deliver any material, code, or functionality. or functionality. All All forwardlooking statements are subject to various risks and uncertainties that could cause actual results to differ materially differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their of their dates, dates, and they should not be relied upon in making purchasing decisions.
Source: http://www.saphanacentral.com/