The Joiner
Transformation is active and connected transformation.
The Joiner
transformation to join source data from two related heterogeneous sources
residing in different locations or file systems. We can also join data from the
same source. The use Joiner transformation sources should have at least one
matching column. The Joiner transformation uses a condition that matches one or
more pairs of columns between the two sources.
The two input pipelines
include a master pipeline and a detail pipeline or a master and a detail
branch. The master pipeline ends at the Joiner transformation, while the detail
pipeline continues to the target.
No. Of Joiner = No Of
source (N) - 1.
The following
limitations on the pipelines you connect to the Joiner transformation:
- You cannot use a Joiner transformation when either
input pipeline contains an Update Strategy transformation.
- You cannot use a Joiner transformation if you connect a
Sequence Generator transformation directly before the Joiner
transformation.
Join Type:
The join is a relational operator that combines data from multiple
tables into a single result set. The Joiner transformation is similar to an SQL
join except that data can originate from different types of sources systems or
databases.
We can define the join type on the Properties tab in the
transformation. The Joiner transformation supports the following types of
joins:
- Normal
- Master Outer
- Detail Outer
- Full Outer
Note: A normal or master outer join performs
faster than a full outer or detail outer join.
If a result set includes fields that do not contain data in either
of the sources, the Joiner transformation populates the empty fields with null
values. If you know that a field will return a NULL and you do not want to
insert NULLs in the target, you can set a default value on the Ports tab for
the corresponding port.
Normal Join
With a normal join, the
Integration Service discards all rows of data from the master and detail source
that do not match, based on the condition.
Master Outer Join
A master outer join will
keeps all rows of data from the detail source and the matching rows from the
master source. It discards the unmatched rows from the master source.
Detail Outer Join
A detail outer join keeps all rows of data from
the master source and the matching rows from the detail source. It discards the
unmatched rows from the detail source.
Full Outer Join
Using Sorted Input
We can improve session
performance by configuring the Joiner transformation to use sorted input. When
you configure the Joiner transformation to use sorted data, the Integration
Service improves performance by minimizing disk input and output. If the sorted
input option is select then we are telling Informatica integration service that
we have passing sorted data to the joiner transformation.
The following tasks to
configure the mapping:
- Configure the sort order. Configure the sort order of the data you want to join. You can join sorted flat files, or you can sort relational data using a Source Qualifier transformation. You can also use a Sorter transformation.
- Add transformations. Use transformations that maintain the order of the sorted data.
- Configure the Joiner transformation. Configure the Joiner transformation to use sorted data and configure the join condition to use the sort origin ports. The sort origin represents the source of the sorted data.
- If you have selected sorted input option and passed un-sorted data to the transformation then the fail will failed.
Cache:
When the Integration
Service processes a Joiner transformation, it reads rows from both sources
concurrently and builds the index and data cache based on the master rows. The
Integration Service then performs the join based on the detail source data and
the cache data.
The number of rows the
Integration Service stores in the cache depends on the partitioning scheme, the
source data, and whether you configure the Joiner transformation for sorted
input.
To improve performance
for an unsorted Joiner transformation, use the source with fewer rows as the
master source. To improve performance for a sorted Joiner transformation, use
the source with fewer duplicate key values as the master.
Sample Mapping:
1. Connect and Open the
folder if not already opened.
2. Select Tools -->
Mapping Designer
3. Select Mappings
--> Create
It will pop-up
"Mapping Name". Enter the mapping name of your choice" and Click
on 'OK'. Example: m_emp_dept_joiner
4. Drag the Source and
Target definitions into workspace if they are already exist. If not click hereto know how to create or import Table definitions.
5. Select
'Transformation' from Menu --> Create
a) That will appear you
'Select the transformation type to create:'
b) Select 'Joiner' from
drop down and 'Enter a new name for this transformation:' as "
jnr_emp_dept"
c)
Click 'Create' and 'Done'
or
a) Click on Joiner
Transformation icon marked below in below snapshot.
b) Click in the
workspace in Mapping Designer.
c) Select JNRTRANS in
workspace and Right Click --> Edit.
d) In Transformation tab
--> Click on 'Rename' highlighted above which will pop-up 'Rename
Transformation'. Enter the Transformation Name: "jnr_emp_dept"
e) Click on 'OK'
f)
Click on 'Apply' and 'OK'.
6). Drag required ports
from 'SQ_EMP' and 'SQ_DEPT' Source Qualifiers to 'jnr_emp_dept' Joiner
Transformation.
7) Select 'jnr_emp_dept'
Joiner Transformation and Right Click --> Edit --> Properties Tab.
a) Select the Join Type:
Normal
b) Select 'Sorted
Input' if the data is already sorted in Sql Override for relation database
or using Sorter transformation before passing to Joiner transformation. If data
is not sorted then do not select it.
c) Go to the 'Condition' tab
for conditional columns. Select the Master columns for drop down for join
condition
Like this:
d) Click 'Apply' and
'OK'.
8). Drag required port
from 'jnr_emp_dept' joiner to 'EMP_DEPT_FILE' Target port as below:
9) Select 'Mapping' from Menu --> Validate.
10) Select 'Repository'
from Menu --> Save.
Note:
- Choose Sorted Input to join sorted data. Using sorted
input can improve performance.
- The Joiner transformation does not match null values.
- A normal or master outer join performs faster than a
full outer or detail outer join.
- We can improve session performance by configuring the
Joiner transformation to use sorted input.
- The Joiner transformation does not match null values.
For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value,
the Integration Service does not consider them a match and does not join
the two rows. To join rows with null values, replace null input with
default values, and then join on the default values.
- You can also join data from output groups of the same transformation, such as the Custom transformation or XML Source Qualifier transformation. Place a Sorter
Why you are taking emp as Master table and Dept as Child table.In emp there is more record than that of Dept.
ReplyDeleteYeah..even same i was thinking..why we have taken emp table as master table..
ReplyDeleteIn formatica... integration service automatically takes,first table dragged ports are taken as master source and second table dragged ports are taken as detail source.
ReplyDeleteIf you want to change you will change it as dept table ports as master source.
But you will not find any difference in session performance because you can have only 14 rows of data.
really i feel good treasure.. to learn
ReplyDeleteusually each Master record in the Joiner Cache will compares with detailed records. if master records are less then the number of hits to master will be less because of which performance increases. - Rajeev.
ReplyDeleteExample has 1 condition and 9 as output port, what would be index and data cache?
ReplyDeleteThis is very good information
ReplyDeleteinformatica online training, informatica training in bangalore, informaitca training
super post keep update you information like this Informatica Online Course
ReplyDeleteThank you.Well it was nice post and very helpful information on Informatica Online Training Hyderabad
ReplyDeletecan we use operators other than '=' in joiner
ReplyDeleteNO
ReplyDelete