Friday, 4 May 2012

Dimension Table Vs Fact Table


Dimension Table features
·         It provides the context/descriptive information for the measurements in a fact table.
·         It provides the entry points to the data.
·         Structure of a dimension table - a surrogate key, one or more fields that compose the natural key (NK), and a set of descriptive attributes.
·         The size of a dimension table is smaller than that of a fact table.
·         A schema usually contains more dimension tables than fact tables.
·         A surrogate key is used to prevent primary key (PK) violations when historical versions of a row are stored (e.g. SCD Type 2).
·         Field values are in numeric and text representation.
Fact Table features
·         It provides the measurements of an enterprise.
·         A measurement is an amount determined by observation.
·         Structure of a fact table - foreign keys (FK) to the dimensions, any degenerate dimensions, and the measurements (see the sketch after this list).
·         The size of a fact table is larger than that of a dimension table.
·         A schema usually contains fewer fact tables than dimension tables.
·         The foreign keys, together with any degenerate dimension fields, compose the primary key.
·         Values of the fields are almost always in numeric or integer form.
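As a minimal sketch of the two structures, assume a hypothetical retail star schema (all table and column names below are invented for illustration):

-- Dimension table: surrogate key + natural key + descriptive attributes
CREATE TABLE dim_product (
    product_sk    INTEGER PRIMARY KEY,      -- surrogate key
    product_nk    VARCHAR(20),              -- natural/business key
    product_name  VARCHAR(100),
    category      VARCHAR(50)
);

-- Fact table: foreign keys + degenerate dimension + numeric measures
CREATE TABLE fact_sales (
    product_sk    INTEGER REFERENCES dim_product (product_sk),
    date_sk       INTEGER,                  -- FK to a date dimension
    order_number  VARCHAR(20),              -- degenerate dimension
    quantity_sold INTEGER,                  -- measure
    sales_amount  DECIMAL(12,2)             -- measure
);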
The main difference between loading a dimension table and loading a fact table is that the dimension preserves historical data (as in a Type 2 slowly changing dimension), so its mapping needs an Update Strategy and other transformations to handle that, whereas the fact table is a direct load with one or more lookups on the dimensions. Also, since the fact and dimension tables have a foreign key relationship, the dimension has to be loaded before the fact.

There is no real difference in the mapping logic used to load a dimension table and a fact table, apart from the order: the dimension table can be loaded directly, but the fact table cannot be loaded first. To load the fact table we first load the dimension table, and while loading the fact table we perform a lookup on the dimension table, because the fact table contains the measures/facts plus the foreign keys, which are the primary keys of the dimension tables surrounding that fact table. We can load the dimension table and the fact table in one mapping by using the Target Load Order/Target Load Plan in Informatica, as shown below.

Target 1 (Dimension Table)
Target 2 (Fact Table)
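Conceptually, the dimension lookup performed during the fact load swaps the natural key coming from the source for the dimension's surrogate key. A rough SQL equivalent, reusing the hypothetical tables above (the staging table name is also invented):

-- Fact load: resolve the surrogate key through a lookup on the dimension
INSERT INTO fact_sales (product_sk, date_sk, order_number, quantity_sold, sales_amount)
SELECT d.product_sk,        -- surrogate key returned by the lookup
       s.date_sk,
       s.order_number,
       s.quantity_sold,
       s.sales_amount
FROM   stg_sales s
JOIN   dim_product d
       ON d.product_nk = s.product_nk;   -- lookup on the natural key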

Dimension Table - A pure dimension table is a collection of primary keys.
Fact Table - A pure fact table is a collection of foreign keys.
A fact table contains numeric facts, i.e. key performance indicators (KPIs). A dimension table has a primary key-foreign key relationship to the fact table.
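Once the star schema is loaded, the KPIs are obtained by aggregating the fact measures and slicing them by dimension attributes. A small illustrative query against the hypothetical tables above:

-- Total sales amount per product category (a typical KPI)
SELECT d.category,
       SUM(f.sales_amount) AS total_sales
FROM   fact_sales f
JOIN   dim_product d ON d.product_sk = f.product_sk
GROUP  BY d.category;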


Wednesday, 25 April 2012

Steps to start a pmcmd command to run a workflow from informatica server



Generally we get the error below when executing pmcmd for the first time. It means that the utility, or a shared library it depends on, cannot be located.
/usr/lib/hpux64/dld.so: Unable to find library '1:/home/build3p/PMReleases/90/HP-UX.IPF.64/xml/pmxerces2_7_0/lib/libpmxerces-c.sl.27'. Killed
Step 1: Check if you have PMCMD available. 
If you have Informatica installed on your system, that does not necessarily mean you can execute this utility as well. Use the command below, run from the Informatica (INFA) application root directory, to find out whether the utility is available in your installation. If it is not, talk to your admin to check whether your organization's license includes it.
find . -name "pmcmd" -print
Step 2: If PMCMD is available.
  Check that execute access is granted for all - user, group, and others should have execute permission on this utility.
Step 3: Configure SHLIB_PATH environment variable.
Bottom line: to run this utility, you have to configure the SHLIB_PATH environment variable to include the directory below (the path may differ for your setup).
/appl/inform/Informatica/9.0.1/server/bin
Configuring the SHLIB_PATH variable enables you to run this utility from your UNIX ID in a particular UNIX environment.
Execute the following two commands at the command prompt to configure the SHLIB_PATH variable.
SHLIB_PATH=$PATH:/appl/inform/Informatica/9.0.1/server/bin
export SHLIB_PATH
Now your setup is ready. You can use the syntax given in the next step to start a workflow with this utility.
Step 4: Syntax: PMCMD Start Workflow
pmcmd startworkflow
-uv USER_TEST
-pv PASSWORD_TEST  # See comments for more info on Uname/Password.
-s ITG_SERVICE
-f WF_FOLDER
-paramfile '\$PMROOTDIR/PARAM_FILE.TXT'
-wait WF_INFA_WF_TO_INVOKE
For easy understanding, each pmcmd startworkflow parameter attribute is kept on a new line above. To actually run it, remove the newline characters and keep only one space before each parameter, so the whole command sits on one line:
pmcmd startworkflow -uv USER_TEST -pv PASSWORD_TEST -s ITG_SERVICE -f WF_FOLDER -paramfile '\$PMROOTDIR/PARAM_FILE.TXT' -wait WF_INFA_WF_TO_INVOKE
You are done; you can now execute pmcmd.


If you are already at the pmcmd prompt, then follow the steps mentioned below.

The following commands were helpful for pmcmd:
pmcmd>connect -sv Service -d domain -u username -p password
pmcmd>startworkflow -f 'folder' workflow
pmcmd>getworkflowdetails -f 'folder' -rin workflow
pmcmd>gettaskdetails -f 'folder' workflow
pmcmd>stoptask -f 'folder' workflow
pmcmd>getsessionstatistics -f 'folder' workflow
pmcmd>scheduleworkflow -f 'folder' workflow
pmcmd>unscheduleworkflow -f 'folder' workflow
pmcmd>disconnect -sv Service -d domain -u username -p password

pmcmd : Unix command to control Informatica Workflow

The pmcmd command is used to control Informatica repository events through Unix when the Informatica server is a Unix server. Since Unix is a very powerful command-line environment for interacting with multiple data storage sources (flat file, Oracle, SQL Server, XML, etc.), we can, for example, check for an indicator file and then execute/stop/abort a session or workflow. In this way we reduce the number of separate tasks/commands, and the whole data flow can be automated to a large extent.
Session specific,
pmcmd stoptask -s $server_name -u $user_name -p $pass -f $folder -w $wkf $sess
Keys :
stoptask : To stop a task
gettaskdetails : To retrieve task details
starttask : To Execute a task
aborttask : To abort a task
waittask : To Run a task in wait mode
workflow specific,
pmcmd startworkflow -s $server_name -u $user_name -p $pass -f $folder $wkf
Keys :
stopworkflow : To stop a workflow
getworkflowdetails : To retrieve workflow details
startworkflow : To Execute a workflow
abortworkflow : To abort a workflow
waitworkflow : To Run a workflow in wait mode

pmcmd command with parameter file
pmcmd startworkflow -s $server_name -u $user_name -p $pass -f $folder -paramfile paramfile_name -wait $wkf

Apart from these pmcmd command options, various other utility options are available
(check informatica help file for details)

Friday, 13 April 2012

SQL Transformation

Ø  We can pass the database connection information to the SQL transformation as input data at run time.
Ø  The transformation processes external SQL scripts or SQL queries that you create in an SQL editor.
Ø  The SQL transformation processes the query and returns rows and database errors.

When you create an SQL transformation, you configure the following options:
Mode:-  The SQL transformation runs in one of the following modes:
*      Script mode.
o   The SQL transformation runs ANSI SQL scripts that are externally located. You pass a script name to the transformation with each input row. The SQL transformation outputs one row for each input row.
*      Query mode.
o   The SQL transformation executes a query that you define in a query editor. You can pass strings or parameters to the query to define dynamic queries or change the selection parameters. You can output multiple rows when the query has a SELECT statement.
*      Passive or active transformation.
o   The SQL transformation is an active transformation by default. You can configure it as a passive transformation when you create the transformation.
*      Database type.
o   The type of database the SQL transformation connects to.
*      Connection type
o   Pass database connection information to the SQL transformation or use a connection object.
Script Mode
An SQL transformation running in script mode runs SQL scripts from text files. You pass each script file name from the source to the SQL transformation Script Name port. The script file name contains the complete path to the script file.
When you configure the transformation to run in script mode, you create a passive transformation. The transformation returns one row for each input row. The output row contains results of the query and any database error.
Rules and Guidelines for Script Mode
Use the following rules and guidelines for an SQL transformation that runs in script mode:
  • You can use a static or dynamic database connection with script mode.
  • To include multiple query statements in a script, you can separate them with a semicolon.
  • You can use mapping variables or parameters in the script file name.
  • The script code page defaults to the locale of the operating system. You can change the locale of the script.
  • The script file must be accessible by the Integration Service. The Integration Service must have read permissions on the directory that contains the script.
  • The Integration Service ignores the output of any SELECT statement you include in the SQL script. The SQL transformation in script mode does not output more than one row of data for each input row.
  • You cannot use scripting languages such as Oracle PL/SQL or Microsoft/Sybase T-SQL in the script.
  • You cannot use nested scripts where the SQL script calls another SQL script.
  • A script cannot accept run-time arguments.
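For illustration, a hypothetical script file - say C:\scripts\refresh_bikes.sql, whose full path is the value passed on the ScriptName port - might contain several statements separated by semicolons (table and column names are invented):

-- refresh_bikes.sql: multiple statements, separated by semicolons
DELETE FROM bikes_stage;
INSERT INTO bikes_stage (bike_id, model, price)
SELECT bike_id, model, price FROM bikes_src;
UPDATE bikes_stage SET price = price * 1.10 WHERE model = 'ROADSTER';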
Query Mode
  • When you configure the SQL transformation to run in query mode, you create an active transformation.
  • When an SQL transformation runs in query mode, it executes an SQL query that you define in the transformation.
  • You pass strings or parameters to the query from the transformation input ports to change the query statement or the query data.
You can create the following types of SQL queries in the SQL transformation:
  • Static SQL query. The query statement does not change, but you can use query parameters to change the data. The Integration Service prepares the query once and runs the query for all input rows.
  •  Dynamic SQL query. You can change the query statements and the data. The Integration Service prepares a query for each input row.
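As a rough sketch (port and table names below are made up), a static query typically binds input port values with the ?port_name? notation, while a dynamic query changes part of the statement itself through string substitution with the ~port_name~ notation:

-- Static SQL query: the statement is fixed, only the bound data changes per row
SELECT bike_id, model, price
FROM   bikes
WHERE  model = ?Model_Port?;

-- Dynamic SQL query: part of the statement itself comes from an input port
SELECT bike_id, model, price
FROM   ~Table_Port~
WHERE  model = ?Model_Port?;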
Rules and Guidelines for Query Mode
Use the following rules and guidelines when you configure the SQL transformation to run in query mode:
  • The number and the order of the output ports must match the number and order of the fields in the query SELECT clause.
  • The native data type of an output port in the transformation must match the data type of the corresponding column in the database. The Integration Service generates a row error when the data types do not match.
  • When the SQL query contains an INSERT, UPDATE, or DELETE clause, the transformation returns data to the SQL Error port, the pass-through ports, and the Num Rows Affected port when it is enabled. If you add output ports, the ports receive NULL data values.
  • When the SQL query contains a SELECT statement and the transformation has a pass-through port, the transformation returns data to the pass-through port whether or not the query returns database data. The SQL transformation returns a row with NULL data in the output ports.
  • You cannot add the "_output" suffix to output port names that you create.
  • You cannot use the pass-through port to return data from a SELECT query.
  • When the number of output ports is more than the number of columns in the SELECT clause, the extra ports receive a NULL value.
  • When the number of output ports is less than the number of columns in the SELECT clause, the Integration Service generates a row error.
  • You can use string substitution instead of parameter binding in a query. However, the input ports must be string data types.
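For example (again with invented names), a query-mode transformation that runs an UPDATE returns the number of updated rows on the Num Rows Affected port (when enabled) and any error text on the SQL Error port, rather than result rows:

-- Rows affected by this statement flow to the Num Rows Affected port;
-- errors, if any, flow to the SQL Error port
UPDATE bikes
SET    price = ?New_Price?
WHERE  bike_id = ?Bike_Id?;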
SQL Transformation Properties
After you create the SQL transformation, you can define ports and set attributes in the following transformation tabs:
  •  Ports. Displays the transformation ports and attributes that you create on the SQL Ports tab.
  •  Properties. SQL transformation general properties.
  •  SQL Settings. Attributes unique to the SQL transformation.
  •  SQL Ports. SQL transformation ports and attributes.
Note: You cannot update the columns on the Ports tab. When you define ports on the SQL Ports tab, they display on the Ports tab.
Properties Tab
Configure the SQL transformation general properties on the Properties tab. Some transformation properties do not apply to the SQL transformation or are not configurable.
The SQL transformation properties are listed below, each property name followed by its description.

Run Time Location
Enter a path relative to the Integration Service node that runs the SQL transformation session.
If this property is blank, the Integration Service uses the environment variable defined on the Integration Service node to locate the DLL or shared library.
You must copy all DLLs or shared libraries to the run-time location or to the environment variable defined on the Integration Service node. The Integration Service fails to load the procedure when it cannot locate the DLL, shared library, or a referenced file.
Tracing Level
Sets the amount of detail included in the session log when you run a session containing this transformation. When you configure the SQL transformation tracing level to Verbose Data, the Integration Service writes each SQL query it prepares to the session log.
Is Partitionable
Multiple partitions in a pipeline can use this transformation. Use the following options:
- No. The transformation cannot be partitioned. The transformation and other transformations in the same pipeline are limited to one partition. You might choose No if the transformation processes all the input data together, such as data cleansing.
- Locally. The transformation can be partitioned, but the Integration Service must run all partitions in the pipeline on the same node. Choose Locally when different partitions of the transformation must share objects in memory.
- Across Grid. The transformation can be partitioned, and the Integration Service can distribute each partition to different nodes.
Default is No.
Update Strategy
The transformation defines the update strategy for output rows. You can enable this property for query mode SQL transformations.
Default is disabled.
Transformation Scope
The method in which the Integration Service applies the transformation logic to incoming data. Use the following options:
- Row
- Transaction
- All Input
Set the transformation scope to Transaction when you use transaction control in static query mode.
Default is Row for script mode transformations. Default is All Input for query mode transformations.
Output is Repeatable
Indicates if the order of the output data is consistent between session runs.
- Never. The order of the output data is inconsistent between session runs.
- Based On Input Order. The output order is consistent between session runs when the input data order is consistent between session runs.
- Always. The order of the output data is consistent between session runs even if the order of the input data is inconsistent between session runs.
Default is Never.
Generate Transaction
The transformation generates transaction rows. Enable this property for query mode SQL transformations that commit data in an SQL query.
Default is disabled.
Requires Single Thread Per Partition
Indicates if the Integration Service processes each partition of a procedure with one thread.
Output is Deterministic
The transformation generates consistent output data between session runs. Enable this property to perform recovery on sessions that use this transformation.
Default is enabled.




Create Mapping :
Step 1: Creating a flat file and importing the source from the flat file.
  • Create a notepad (text) file and in it define a table named bikes, with three columns and three records.
  • Create one more notepad file to hold the path for the bikes file. Inside this notepad just type the path (C:\bikes.txt) and save it.
  • Import the source (the second notepad) using Sources -> Import from File. A wizard with three subsequent windows appears; follow the on-screen instructions to complete the process of importing the source.
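One plausible reading of the two files (all names and values here are invented for illustration): bikes.txt holds the SQL script that the transformation will execute in script mode, and the second notepad holds only its path.

-- C:\bikes.txt (the script the SQL transformation runs): three columns, three records
CREATE TABLE bikes (bike_id INTEGER, model VARCHAR(30), price DECIMAL(8,2));
INSERT INTO bikes VALUES (1, 'ROADSTER', 55000);
INSERT INTO bikes VALUES (2, 'AVENGER', 75000);
INSERT INTO bikes VALUES (3, 'PULSAR', 65000);

-- The second notepad (the one imported as the source) contains just one line:
-- C:\bikes.txt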
Step 2: Importing the target and applying the transformation.
In the same way as above, go to Targets -> Import from File and select an empty notepad named targetforbikes (one more blank notepad that we create and save under that name in C:\).
  • Create two columns in the target table, named report and error.
  • We are all set here. Now apply the SQL transformation.
  • In the first window that appears when you apply the SQL transformation, select script mode.
  • Connect the Source Qualifier (SQ) to the ScriptName input port and connect the other two fields to the corresponding output ports.
A snapshot of the mapping discussed above is given below.

Step 3: Design the workflow and run it.
  • Create the session task and the workflow using the naming conventions.
  • Go to the Mapping tab and click on the source in the left-hand pane to specify the path for the output file.

Step 4: Preview the output data on the target table.