Runtime Column Propagation in DataStage
WHAT IS RCP IN DATASTAGE?
InfoSphere DataStage is flexible about meta data and can cope when the meta data is not fully defined.
When we send data from a source to a target, sometimes only some of the columns need to be defined explicitly. You can define part of your schema and specify that, if the job encounters additional columns that are not defined in the meta data when it actually runs, it will adopt these extra columns and propagate them through the rest of the job. This is called Runtime Column Propagation (RCP).
RCP can be enabled for a project via the Administrator client, and set for individual links via the Output page Columns tab for most stages, or in the Output page General tab for Transformer stages.
RCP can be enabled or disabled at three levels:
Project level: Administrator client, project properties
Job level: Job properties, General tab
Stage level: Columns tab of the link's Output page
If runtime column propagation is enabled in the DataStage Administrator, you can select the Runtime column propagation option to specify that columns encountered by a stage in a parallel job can be used even if they are not explicitly defined in the meta data. You should always ensure that runtime column propagation is turned on if you want to use schema files to define column meta data.
Runtime column propagation is used with partial schemas, that is, when we know only the columns to be processed and want all other columns to be propagated to the target as they are.
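To make the idea concrete, here is a minimal Python sketch of the behaviour described above. It is only a model of the concept, not DataStage code: the function `apply_stage`, the `rcp_enabled` flag, and the sample row are all hypothetical names invented for this illustration. The stage explicitly works on the columns named in its meta data and, when the flag is on, carries every other column through untouched.

```python
def apply_stage(row, defined_columns, rcp_enabled):
    """Model one stage: process the defined columns, optionally pass the rest."""
    # Columns named in the job's meta data are operated on explicitly
    # (here the "operation" is simply upper-casing the value).
    out = {name: str(row[name]).upper() for name in defined_columns}
    if rcp_enabled:
        # With RCP on, columns missing from the meta data are adopted
        # and propagated unchanged.
        for name, value in row.items():
            if name not in out:
                out[name] = value
    return out


# Hypothetical source row with four columns; the job defines only "name".
source_row = {"id": 7, "name": "alice", "city": "Pune", "balance": 120.5}

with_rcp = apply_stage(source_row, ["name"], rcp_enabled=True)
without_rcp = apply_stage(source_row, ["name"], rcp_enabled=False)

print(with_rcp)     # all four columns reach the target
print(without_rcp)  # only the defined column reaches the target
```

In this model, turning the flag off means the undefined columns never reach the target, which is why a partial schema only makes sense with RCP switched on.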
USING RCP WITH SEQUENTIAL STAGES
Runtime column propagation (RCP) allows DataStage to be flexible about the columns you define in a job.
If RCP is enabled for a project, you can define only the columns you are interested in using in a job and ask DataStage to propagate the other columns through the various stages.
Such columns can then be extracted from the data source and end up on your data target without being explicitly operated on in between.
Sequential files, unlike most other data sources, do not have inherent column definitions, and so DataStage cannot always tell where there are extra columns that need propagating.
You can only use RCP on sequential files if you have used the Schema File property to specify a schema which describes all the columns in the sequential file.
You need to specify the same schema file for any similar stages in the job where you want to propagate columns. Stages that will require a schema file are:
Sequential File
File Set
External Source
External Target
Column Import
Column Export
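The reason a full schema is needed for sequential files can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not DataStage schema-file syntax: `FULL_SCHEMA`, `read_with_schema`, and the sample data are hypothetical. A delimited record carries no column names of its own, so the extra columns can only be identified, and therefore propagated, once a schema names every field.

```python
import csv
import io

# Hypothetical schema listing every column in the file, in file order.
FULL_SCHEMA = ["id", "name", "city", "balance"]


def read_with_schema(text, schema):
    """Attach the schema's column names to each delimited record."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(schema, fields)) for fields in reader]


# A sequential file is just delimited text with no inherent column definitions.
data = "7,alice,Pune,120.50\n8,bob,Oslo,75.00\n"
rows = read_with_schema(data, FULL_SCHEMA)

# The job itself defines only "id"; the remaining columns are the ones
# that would have to be propagated.
job_columns = ["id"]
propagated = [c for c in FULL_SCHEMA if c not in job_columns]

print(rows[0])
print(propagated)
```

Without the full list of names, the reader could not tell how many extra fields exist or what to call them, which is the gap the Schema File property fills.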