Testing of a Reporting Application Data Centric Testing Approach By Puneet Maheshwari
Topics • Data Warehouse • Data Flow • BI • Report types • DW BI Testing • Report Testing • ETL Testing
Data Warehouse • Data warehouse is a repository of an organization's electronically stored data. • DW contains Subject oriented, Non volatile, Integrated and Time variant Data. • Data warehouses are designed to facilitate reporting and analysis. • Data mart is a miniature data warehouse designed to meet Reporting needs of specific department of Business Function.
A generalised DW System
Generalized Data Flow Reports ETL
csv
DW
ETL ETL
D B
DM
ETL ETL Source
Staging
Cub e
Data Warehouse Data model • DW Database is modeled on Star Schema or Snow flaked Schema. • Schema contain Fact and Dimension Tables. • Dimension Table contain the attributes of measures or facts. This table contain primary (or surrogate keys). • Fact Tables contain Numeric measures and foreign keys to dimension Tables.
BI Tool Structure
SQL
Request Semantic Layer
Tables
Data
Reports Formatted Data
Semantic layer – This is a layer through which a user accesses database. – Hides the complexity of underlying data sources. – Allows data access using common business terms. – Creates Complex and efficient SQL statements. – It contains Metadata, Tables & Columns information, Joins, Cardinality, Database connection, Users … – Semantic layer is provided by BI tool.
Types of Reports • Static Reports: – Data is displayed in form of tables or charts. – Data – The – The data can be summarized or detailed. – These – These reports have fixed layout and precisely defined functionality. – User – User can only select filter values and prompt values.
• Dash Boards: – Single interface of many data. – Single – Data – Data is generally displayed as graphical manner like pie chart, bar graph, dials etc. – Dash – Dash boards are also used to display KPI of the system.
Types of Reports • KPI/Score Card/Metric Reports: – KPI- Key performance indicators are High level aggregations of data existing in the system. with various various colors based on – These reports show the KPI/Metric values with performance. e.g.: Heat chart report.
• Adhoc reports: – These reports are made on the fly by users. – These reports are made to analyze scenarios not covered in static reports. – Reports are made in Query Studio. – Cognos packages are exposed for users in query studio.
Data Warehouse and BI Testing • Grey Box testing (Black Box + White Box) • Focused on correct data flow from source till final reports. • Need of Data Maps between each data holding stage. • Data validation between each data holding step. • SQL Queries as Testing Tool.
BI/Report Testing
BI/Report Testing • Format tests: Header , Footer, Look and feel, Fonts, Static Prompts & Dynamic Prompts, Conditional formatting, Paging. • Functional Tests: Checking all Functionalities of report, Checking filters and their options, Requirement conformance .
BI/Report Testing • Data Validation: Most critical test in BI testing.
Fixed Reports: Generate the Report. SQL query is made as per functionality of report and fired at each data step. (For cross tab reports if SQL gets very complicated, report can be broken from one dimension into many sub reports.) Results of SQL query and Report are compared at record level. If result set is huge sampling can be performed.
BI/Report Testing Dash Bords/KPIs Generate the report. As per KPI Definition SQL Query is made and fired at database. KPI values generated by SQL and Report are compared. Color displays for Matrices are are checked with standards. standards. Adhoc Reports (Cognos Model Testing): Static check of links between Query objects and Query Items with database tables and columns respectively. Static Check of joins created between query objects. Static checks are performed at report studio. Checking calculations and aggregations by running adhoc reports and corresponding SQL Query in Database.
BI/Report Testing • Performance Testing
Report Generation and Refresh duration. • Authentication and Security Testing
Only authorized users should get access to reports. Scheduled reports should be delivered to correct inboxes.
Pre Requisites for Report testing
• Access rights. • Report requirements i.e. Functionality. • Data models of reporting database. • KPI and Metric definitions.
• Mapping between reports and database tables. • Knowledge of SQL.
ETL Testing
ETL Testing • Data Completeness – To Test all jobs have run successfully and data transfer is complete. – Counts between two data holding stages should be equal. – Aggregations on numeric columns should be equal.
ETL Testing • Data Correctness – Also called as Transformation Testing – Validation of business rules. – Business rules are documented as mapping documents. – A sample Mapping Source
Target
DB
Table
Coulmn
DataType
DB
Table
Coulmn
DataType
Transformation
Scr
TblX
A
int
Tgt
TblX
P
int
A + 100
Scr
TblX
B
CHAR
Tgt
TblX
Q
Date
Date(B)
Scr
TblY
C
int
Tgt
TblX
R
i nt
Sum( C )
TblX
S
i t
D f
lt 100
ETL Testing • Data Correctness Testing – Attribute level transformation validation process. • A select SQL Query, containing key columns + attribute columns, is fired at target table. • ETL is executed. • A SQL query , containing source columns + transformations(source column), is fired on source table. • The results are compared. • This comparison is performed for small subset of data. • All attributes should be validated in one or more queries.
ETL Testing • Data Correctness Testing – Each Types of transformations needs separate validation. • Aggregations • Direct Maps • Defaults • Lookups • Calculations
– DFD,LDM &ERD are required to formulate filter conditions and joins in the SQL query.
ETL Testing • Data Validation – Row to Row comparison between two consecutive data holding stages. – A small set of test data is prepared and ETL is executed to load it. – SQL query are used to validate complete data at each stage.
ETL Testing • Environment Integration – Testing ETL execution on various OS , DB etc. – Testing Logging mechanism of ETLs. • Execution Logs • Error Logs • Data Rejection logs
– Testing ETL on schedulers.
ETL Testing • Performance testing – Testing ETL Execution time. • Known amount of data is processed by ETL. • Execution time is calculated from Start and End Time of ETL. • From system volumetric , Execution time in real time is interpolated. • If execution time is more than permissible value, ETL should be tuned.
– Scalability testing
Pre Requisites for ETL Testing • Mappings between each stage. • Data Flow Diagram, Logical data model and Physical data models. • Database access rights and ETL execution rights. • Knowledge of SQL, Operating system commands and ETL Tool used.
Other DW Tests • System Reconciliations. • Comparison of DW Reports with other reporting applications. • Automation of ETL Testing. • Testing of Landing server files.