Identifying duplicate records by using expression transformation
1 Objective
This solution uses an Expression transformation to separate distinct and duplicate records in a source. It relies on the order in which ports are evaluated within a transformation: input ports first, then variable ports, and output ports last. Because a variable port keeps its value from the previous row until it is reassigned, it can hold the previous record for comparison with the current one.
2 Source definition
The source is the EMPLOYEE table, shown in Figure 1.
Figure 1: Relational source EMPLOYEE
3 Target definition
The sample mapping has two targets. Duplicate records found in the source table are inserted into target table T_DUPLICATE_EMP, and distinct records are inserted into target table T_DISTINCT_EMP.
4 Mapping
Figure 2 shows the mapping designed to identify duplicate and distinct records.
Figure 2: Mapping design
4.1 Transformations used
SRT_SOURCE_RECORDS
A Sorter transformation is placed after the Source Qualifier to sort the source records. All ports are selected as sort keys, so identical records arrive next to each other.
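As a rough illustration of why every port is used as a sort key, the following Python sketch sorts a few rows on all of their fields. The sample rows and column layout are hypothetical (the actual EMPLOYEE columns are those in Figure 1), and Python's sorted() only stands in for the Sorter transformation.

```python
# Hypothetical EMPLOYEE rows: (emp_id, name, department).
employees = [
    (102, "Ravi", "SALES"),
    (101, "Asha", "HR"),
    (102, "Ravi", "SALES"),
    (103, "Meera", "HR"),
]

# Sorting on the whole tuple mimics selecting all ports as sort keys.
sorted_employees = sorted(employees)

for row in sorted_employees:
    print(row)
```

After sorting, the two identical rows for employee 102 sit next to each other, which is what allows a simple row-by-row comparison to detect them.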
Figure 3: Sorter transformation
EXP_FLAG_DUPLICATE_DISTINCT
An Expression transformation flags each record as either duplicate or distinct. The current record (v_CurrentRec) is compared with the previous record (v_PrevRec): if the two match, the variable v_IsDuplicate is set to ‘Y’; otherwise it is set to ‘N’.
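The comparison can be sketched in Python as follows. This is only an approximation of the idea, not the mapping's actual port expressions: the function name flag_records, the sample rows, and the use of plain lists in place of the two targets are all assumptions made for illustration. The Python variable names mirror v_CurrentRec, v_PrevRec, and v_IsDuplicate.

```python
from typing import List, Optional, Tuple

Record = Tuple[object, ...]


def flag_records(sorted_records: List[Record]) -> List[Tuple[Record, str]]:
    """Flag each row 'Y' (duplicate) or 'N' (distinct) against the previous row."""
    # Carries over between rows, much like a variable port does.
    v_prev_rec: Optional[Record] = None
    flagged = []
    for v_current_rec in sorted_records:
        # Compare first, while v_prev_rec still holds the previous row.
        v_is_duplicate = "Y" if v_current_rec == v_prev_rec else "N"
        # Only then remember the current row for the next iteration.
        v_prev_rec = v_current_rec
        flagged.append((v_current_rec, v_is_duplicate))
    return flagged


# Hypothetical rows, already sorted on all fields.
rows = [(101, "Asha"), (102, "Ravi"), (102, "Ravi"), (103, "Meera")]
flagged_rows = flag_records(rows)

# Plain lists stand in for the two targets.
t_distinct_emp = [rec for rec, flag in flagged_rows if flag == "N"]
t_duplicate_emp = [rec for rec, flag in flagged_rows if flag == "Y"]
```

In this sketch the first occurrence of a record is flagged 'N' and every later identical copy is flagged 'Y'. The order of the two assignments matters: if the previous-record variable were updated before the comparison, every row would match itself and be flagged as a duplicate.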
Figure 4: Expression transformation
Table 1 shows the expressions used for the variable and output ports.
Disclaimer and Liability notice
Informatica offers no guarantees and assumes no responsibility or liability of any type with respect to the content of this software asset, including any liability resulting from incompatibility between the content within this asset and the materials and services offered by Informatica. You agree that you will not hold, or seek to hold, Informatica responsible or liable with respect to the content of this software asset.