Friday, April 5, 2013

SQL Transformation

SQL Transformation is an Active/Passive and Connected transformation.

The SQL transformation processes SQL queries midstream in a pipeline. You can insert, delete, update, and retrieve rows from a database. You can pass the database connection information to the SQL transformation as input data at run time. The transformation processes external SQL scripts or SQL queries that you create in an SQL editor. The SQL transformation processes the query and returns rows and database errors.

When you create an SQL transformation, you configure the following options:
  • Mode. The SQL transformation runs in one of the following modes:
1.    Script mode. The SQL transformation runs ANSI SQL scripts that are externally located. You pass a script name to the transformation with each input row. The SQL transformation outputs one row for each input row.

2.    Query mode. The SQL transformation executes a query that you define in a query editor. You can pass strings or parameters to the query to define dynamic queries or change the selection parameters. You can output multiple rows when the query has a SELECT statement.


  • Passive or active transformation. The SQL transformation is an active transformation by default. You can configure it as a passive transformation when you create the transformation.
  • Database type. The type of database the SQL transformation connects to.
  • Connection type. Pass database connection information to the SQL transformation or use a connection object.
Connection type. Pass database connection information to the SQL transformation or use a connection object.
Script Mode
An SQL transformation running in script mode runs SQL scripts from text files. You pass each script file name from the source to the SQL transformation ScriptName port. The script file name contains the complete path to the script file.

When you configure the transformation to run in script mode, you create a passive transformation. The transformation returns one row for each input row. The output row contains results of the query and any database error.

When the SQL transformation runs in script mode, the query statement and query data do not change. When you need to run different queries in script mode, you pass the scripts in the source data. Use script mode to run data definition queries such as creating or dropping tables.

When you configure an SQL transformation to run in script mode, the Designer adds the ScriptName input port to the transformation. When you create a mapping, you connect the ScriptName port to a port that contains the name of a script to execute for each row. You can execute a different SQL script for each input row. The Designer creates default ports that return information about query results.

An SQL transformation configured for script mode has the following default ports:

Port
Type
Description
ScriptName
Input
Receives the name of the script to execute for the current row.
ScriptResult
Output
Returns PASSED if the script execution succeeds for the row. Otherwise contains FAILED.
ScriptError
Output
Returns errors that occur when a script fails for a row.

Rules and Guidelines for Script Mode
Use the following rules and guidelines for an SQL transformation that runs in script mode:
  • You can use a static or dynamic database connection with script mode.
  • To include multiple query statements in a script, you can separate them with a semicolon.
  • You can use mapping variables or parameters in the script file name.
  • The script code page defaults to the locale of the operating system. You can change the locale of the script.
  • The script file must be accessible by the Integration Service. The Integration Service must have read permissions on the directory that contains the script. If the Integration Service uses operating system profiles, the operating system user of the operating system profile must have read permissions on the directory that contains the script.
  • The Integration Service ignores the output of any SELECT statement you include in the SQL script. The SQL transformation in script mode does not output more than one row of data for each input row.
  • You cannot use scripting languages such as Oracle PL/SQL or Microsoft/Sybase T-SQL in the script.
  • You cannot use nested scripts where the SQL script calls another SQL script.
  • A script cannot accept run-time arguments.
Query Mode
When an SQL transformation runs in query mode, it executes an SQL query that you define in the transformation. You pass strings or parameters to the query from the transformation input ports to change the query statement or the query data.

When you configure the SQL transformation to run in query mode, you create an active transformation. The transformation can return multiple rows for each input row.

Create queries in the SQL transformation SQL Editor. To create a query, type the query statement in the SQL Editor main window. The SQL Editor provides a list of the transformation ports that you can reference in the query. You can double-click a port name to add it as a query parameter.

When you create a query, the SQL Editor validates the port names in the query. It also verifies that the ports you use for string substitution are string datatypes. The SQL Editor does not validate the syntax of the SQL query.

You can create the following types of SQL queries in the SQL transformation:
  • Static SQL query. The query statement does not change, but you can use query parameters to change the data. The Integration Service prepares the query once and runs the query for all input rows.
  • Dynamic SQL query. You can change the query statements and the data. The Integration Service prepares a query for each input row.
When you create a static query, the Integration Service prepares the SQL procedure once and executes it for each row. When you create a dynamic query, the Integration Service prepares the SQL for each input row. You can optimize performance by creating static queries.

Rules and Guidelines for Query Mode

Use the following rules and guidelines when you configure the SQL transformation to run in query mode:
  • The number and the order of the output ports must match the number and order of the fields in the query SELECT clause.
  • The native datatype of an output port in the transformation must match the datatype of the corresponding column in the database. The Integration Service generates a row error when the datatypes do not match.
  • When the SQL query contains an INSERT, UPDATE, or DELETE clause, the transformation returns data to the SQLError port, the pass-through ports, and the NumRowsAffected port when it is enabled. If you add output ports the ports receive NULL data values.
  • When the SQL query contains a SELECT statement and the transformation has a pass-through port, the transformation returns data to the pass-through port whether or not the query returns database data. The SQL transformation returns a row with NULL data in the output ports.
  • You cannot add the "_output" suffix to output port names that you create.
  • You cannot use the pass-through port to return data from a SELECT query.
  • When the number of output ports is more than the number of columns in the SELECT clause, the extra ports receive a NULL value.
  • When the number of output ports is less than the number of columns in the SELECT clause, the Integration Service generates a row error.
  • You can use string substitution instead of parameter binding in a query. However, the input ports must be string datatypes.
Static SQL Queries
Create a static SQL query when you need to run the same query statements for each input row, but you want to change the data in the query for each input row. When you create a static SQL query, you use parameter binding in the SQL Editor to define parameters for query data.

To change the data in the query, configure query parameters and bind them to input ports in the transformation. When you bind a parameter to an input port, you identify the port by name in the query. The SQL Editor encloses the name in question marks (?). The query data changes based on the value of the data in the input port.

The SQL transformation input ports receive the data for the data values in the query, or the values in the WHERE clause of the query.

The following static queries use parameter binding:
DELETE FROM Employee WHERE Dept = ?Dept?
INSERT INTO Employee(Employee_ID, Dept) VALUES (?Employee_ID?, ?Dept?)
UPDATE Employee SET Dept = ?Dept? WHERE Employee_ID > 100

The following static SQL query has query parameters that bind to the Employee_ID and Dept input ports of an SQL transformation:
SELECT Name, Address FROM Employees WHERE Employee_Num=?Employee_ID? and Dept = ?Dept?

The source might have the following rows:
Employee_ID
Dept
100
Products
123
HR
130
Accounting

The Integration Service generates the following query statements from the rows:
SELECT Name, Address FROM Employees WHERE Employee_ID = ‘100’ and DEPT = ‘Products’
SELECT Name, Address FROM Employees WHERE Employee_ID = ‘123’ and DEPT = ‘HR’
SELECT Name, Address FROM Employees WHERE Employee_ID = ‘130’ and DEPT = ‘Accounting’

Dynamic SQL Queries
A dynamic SQL query can execute different query statements for each input row. When you create a dynamic SQL query, you use string substitution to define string parameters in the query and link them to input ports in the transformation.

To change a query statement, configure a string variable in the query for the portion of the query you want to change. To configure the string variable, identify an input port by name in the query and enclose the name with the tilde (~). The query changes based on the value of the data in the port. The transformation input port that contains the query parameter must be a string datatype. You can use string substitution to change the query statement and the query data.

When you create a dynamic SQL query, the Integration Service prepares a query for each input row. You can pass the full query or pass part of the query in an input port:
  • Full query. You can substitute the entire SQL query with query statements from source data.
  • Partial query. You can substitute a portion of the query statement, such as the table name.
Full Query

You can pass the full SQL query through an input port in the transformation. To pass the full query, create a query in the SQL Editor that consists of one string variable to represent the full query:

~Query_Port~

The transformation receives the query in the Query_Port input port.

The following figure shows ports in the SQL transformation:

The Integration Service replaces the ~Query_Port~ variable in the dynamic query with the SQL statements from the source. It prepares the query and sends it to the database to process. The database executes the query. The SQL transformation returns database errors to the SQLError port.

The following mapping shows how to pass the query to the SQL transformation:


When you pass the full query, you can pass more than one query statement for each input row. For example, the source might contain the following rows:
DELETE FROM Person WHERE LastName = ‘Jones’; INSERT INTO Person (LastName, Address) VALUES ('Smith', '38 Summit Drive')
DELETE FROM Person WHERE LastName = ‘Jones’; INSERT INTO Person (LastName, Address) VALUES ('Smith', '38 Summit Drive')
DELETE FROM Person WHERE LastName = ‘Russell’;

You can pass any type of query in the source data. When you configure SELECT statements in the query, you must configure output ports for the database columns you retrieve from the database. When you mix SELECT statements and other types of queries, the output ports that represent database columns contain null values when no database columns are retrieved.

Partial Query
You can substitute the table name in a query. To substitute the table name, configure an input port to receive the table name from each input row. Identify the input port by name in the query and enclose the name with the tilde (~).

The following dynamic query contains a string variable, ~Table_Port~:

SELECT Emp_ID, Address from ~Table_Port~ where Dept = ‘HR’

The source might pass the following values to the Table_Port column:
Table_Port
Employees_USA
Employees_England
Employees_Australia

The Integration Service replaces the ~Table_Port~ variable with the table name in the input port:
SELECT Emp_ID, Address from Employees_USA where Dept = ‘HR’
SELECT Emp_ID, Address from Employees_England where Dept = ‘HR’
SELECT Emp_ID, Address from Employees_Australia where Dept = ‘HR’

SQL Transformation Properties
After you create the SQL transformation, you can define ports and set attributes in the following transformation tabs:
  • Ports. Displays the transformation ports and attributes that you create on the SQL Ports tab.
  • Properties. SQL transformation general properties.
  • SQL Settings. Attributes unique to the SQL transformation.
  • SQL Ports. SQL transformation ports and attributes.
Note: You cannot update the columns on the Ports tab. When you define ports on the SQL Ports tab, they display on the Ports tab.

Connecting to Databases
You can use a static database connection or you can pass database connection information to the SQL transformation at run time.

You can use a connection environment SQL statement or transactional SQL statement with the SQL transformation. Configure the SQL statements in a relational connection object. The Integration Service runs the connection environment SQL when it connects to the database. It runs the transactional SQL statement before the initiation of each transaction.

Use one of the following types of connections to connect the SQL transformation to a database:
  • Static connection. Configure the connection object in the session. You must first create the connection object in Workflow Manager.
  • Logical connection. Pass a connection name to the SQL transformation as input data at run time. You must first create the connection object in Workflow Manager.
  • Full database connection. Pass the connect string, user name, password, and other connection information to the SQL transformation input ports at run time.
Note: If a session has multiple partitions, the SQL transformation creates a separate database connection for each partition.

Transaction Control
An SQL transformation that runs in script mode drops any incoming transaction boundary from an upstream source or transaction generator. The Integration Service issues a commit after executing the script for each input row in the SQL transformation. The transaction contains the set of rows affected by the script.

  • An SQL transformation that runs in query mode commits transactions at different points based on the database connection type: 
  • Dynamic database connection. The Integration Service issues a commit after executing the SQL for each input row. The transaction is the set of rows affected by the script. You cannot use a Transaction Control transformation with dynamic connections in query mode.
  • Static connection. The Integration Service issues a commit after processing all the input rows. The transaction includes all the database rows to update. You can override the default behavior by using a Transaction Control transformation to control the transaction, or by using commit and rollback statements in the SQL query.When you configure an SQL statement to commit or rollback rows, configure the SQL transformation to generate transactions with the Generate Transaction transformation property. Configure the session for user-defined commit. 
The following transaction control SQL statements are not valid with the SQL transformation:
  • SAVEPOINT. Identifies a rollback point in the transaction.
  • SET TRANSACTION. Changes transaction options.
 Input Row to Output Row Cardinality
When the Integration Service runs a SELECT query, the SQL transformation returns a row for each row it retrieves. When the query does not retrieve data, the SQL transformation returns zero or one row for each input row.

The number of output rows the SQL transformation returns depends on the following factors:
  • Query statement processing. When the query contains a SELECT statement, the Integration Service can retrieve multiple output rows. When a SELECT query is successful, the SQL transformation might retrieve multiple rows. When the query contains other statements, the Integration Service might generate a row that contains SQL errors or the number of rows affected.
  • Port configuration. The NumRowsAffected output port contains the total number of rows affected by updates, inserts, or deletes for one input row. When the SQL transformation contains pass-through ports, the transformation returns the column data at least once for each source row.
  • The maximum row count configuration. The Max Output Row Count limits the number of rows the SQL transformation returns from SELECT queries.
  • Error rows. The Integration Service returns row errors when it encounters connection or syntax errors. When the SQL transformation runs in query mode, it returns errors to the SQLError port. When the SQL transformation runs in script mode, it returns errors to the ScriptError port.
  • Continue on SQL Error. You can configure the SQL transformation to continue processing when there is an error in a SQL statement. The SQL transformation does not generate a row error.
Sample Mapping for Script Mode:

Using script mode we are trying to create a table or insert data using script file. We need to flat file one which contain create or insert script you want to execute and second which will have the script path and the file name.





1. Connect and Open the folder if not already opened.
2. Select Tools --> Mapping Designer
3. Select Mappings --> Create

It will pop-up "Mapping Name". Enter the mapping name of your choice" and Click on 'OK'. Example: m_sql_script_mode
4. Drag the Source and Target definitions into workspace if they are already exist. If not click hereto know how to create or import Table definitions.
5. Select 'Transformation' from Menu --> Create
a)  That will appear you 'Select the transformation type to create:
b) Select 'SQL' from drop down and 'Enter a new name for this transformation:' as 'sql_script_mode'
Note: Mode: Script
          DB Type: Oracle
          DB Connection: Static

c) Click 'OK' and Click 'Done'

or

a) Click on SQL Transformation icon marked below in below snapshot.
b) Click in the workspace in Mapping Designer.
Note: Mode: Script
          DB Type: Oracle
          DB Connection: Static

c) Click 'OK' and Click 'Done'
d) Select 'SQL' transformation in the workspace and right click --> Edit
e) In Transformation tab --> Click on 'Rename' highlighted above which will pop-up 'Rename Transformation'. Enter the Transformation Name: 'sql_script_mode'
f) Click on 'OK'
g) Click on 'Apply' and 'OK'.
6). Drag port from 'SQ_Dummy' Source Qualifier to 'sql_script_mode' sql  Transformation.
a) Select 'sql_script_mode' transformation and right click on it --> Edit
b) Go to SQL Settings tab.
Check Enable Auto Commit
Scripts Locale: MS Windows Latin 1(ANSI), superset of Latin1.

SQL Settings Tab
Configure SQL transformation attributes on the SQL Settings tab. The SQL attributes are unique to the SQL transformation.

The following table lists the attributes you can configure on the SQL Setting tab:
Option
Description
Continue on SQL Error within row
Continues processing the remaining SQL statements in a query after an SQL error occurs.
Add Statistic Output Port
Adds a NumRowsAffected output port. The port returns the total number of database
rows affected by INSERT, DELETE, and UPDATE query statements for an input row.
AutoCommit
Enables auto-commit for each database connection. Each SQL statement in a query defines a transaction. A commit occurs when the SQL statement completes or the next statement is executed, whichever comes first
Max Output Row Count
Defines the maximum number of rows the SQL transformation can output from a SELECT query. To configure unlimited rows, set Max Output Row Count to zero.
Scripts Locale
Identifies the code page for a SQL script. Choose the code page from the list. Default is operating system locale.
Use Connection Pooling
Maintain a connection pool for database connections. You can enable connection pooling for dynamic connections only.
Maximum Number of Connections in Pool
Maximum number of active connections available in the connection pool. Minimum is one. Maximum is 20. Default is 10.

C) Go to SQL Ports tab
d) Click on Open Browser and select: Script Name from Parameter Binding.
e)  Click 'OK'
f) Click 'Apply' and 'OK'
 
7) Select 'Mapping' from Menu --> Validate.
8) Select 'Repository' from Menu --> Save.

Note:
Please define DBconnection to the SQL transformation at session level to execute the query into database.

1. Select the session and right click on it to edit --> Go to the Mapping tab:
2. Select 'SQL' transformation in navigator.
3. Click 'Apply' and 'OK'

Sample Mapping for Query Mode:

Using query mode we are trying to load data from  flat file to Oracle target table.

1. Connect and Open the folder if not already opened.
2. Select Tools --> Mapping Designer
3. Select Mappings --> Create
It will pop-up "Mapping Name". Enter the mapping name of your choice" and Click on 'OK'. Example: m_sql_script_query
 
4. Drag the Source and Target definitions into workspace if they are already exist. If not click hereto know how to create or import Table definitions.
5. Select 'Transformation' from Menu --> Create
a) That will appear you 'Select the transformation type to create:'
b) Select 'SQL' from drop down and 'Enter a new name for this transformation:' as 'sql_query_mode'
Note: Mode: Query
          DB Type: Oracle
          DB Connection: Static

c) Click 'OK' and 'Done'
6). Select the 'sql_query_mode' sql Transformation
a) Add ports EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM and DEPTNO as below.
b) Click on Open Browser from SQL Query.
Write the Insert script:
Insert into training.EMP_TBL (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values (?NEWFIELD1?,?NEWFIELD2?,?NEWFIELD3?,?NEWFIELD4?,?NEWFIELD5?,?NEWFIELD6?,?NEWFIELD7?,?NEWFIELD8?)

Note: User the port name from parameter binding.
Click 'OK'
Click 'Apply' and 'OK'

7) Drag the required ports from SQ_EMP to sql_query_mode and from sql_query_mode to target instance as below.
8) Select 'Mapping' from Menu --> Validate.
9) Select 'Repository' from Menu --> Save.

Note:
Please define DBconnection to the SQL transformation at session level to execute the query into database.

1. Select the session and right click on it to edit --> Go to the Mapping tab:
2. Select 'SQL' transformation in navigator.


3. Click 'Apply' and 'OK'

7 comments:

  1. Gowtham Can you please upload the XML and Source files

    ReplyDelete
  2. Thanks for publish a very wonderful post.

    ReplyDelete
  3. It is the best blog to have a detailed idea of informatica. Thanks Gowtham. Great Job

    ReplyDelete
  4. After reading this blog i very strong in this topics and this blog really helpful to all..Informatica Online Course Hyderabad

    ReplyDelete
  5. Great software is supported by great people. Our user base is dedicated to inspire use of the language, and committed to being diverse and friendly.


    devops course in bangalore

    best devops training in bangalore

    Devops certification training in bangalore

    devops training in bangalore

    devops training institute in bangalore

    ReplyDelete
  6. Your blog is in a convincing manner, thanks for sharing such an information with lots of your effort and time
    sql server dba online training India
    sql server dba online training Hyderabad

    ReplyDelete
  7. It's such nice content! Thanks for sharing content and such nice information for me. I hope you will share some more content about Data Analytics Tools
    . Please keep sharing!

    ReplyDelete