DataStage Architecture

DataStage follows a client-server architecture. The details of that architecture differ between DataStage versions.



Client Components: DataStage has the following client components:

Administrator
This component of DataStage provides a user interface for administering projects. It manages global settings and interactions with other systems. The Administrator is used to set up users and their privileges, set project properties, and add, move and delete projects. It also specifies general server defaults, purging criteria, job scheduling options, parallel job defaults and job monitoring limits, and it provides a command interface to the DataStage repository.

Manager
The DataStage Manager is the main interface to the DataStage repository. It is used to view and edit the repository contents, to browse the repository, and to store and manage reusable metadata. It displays the table and file layouts, jobs, routines and transforms that are defined in the project, and it handles the management tasks related to the repository.

Designer
The Designer provides a user-friendly graphical design interface for creating DataStage jobs or applications. Each job explicitly specifies the source of the data, the required transforms and the destination of the data. Extraction, cleansing, transformation, integration and loading of data are laid out as a visual data flow. The jobs are then compiled to form executable programs, which DataStage Director schedules and the server runs. This module is used mainly by developers.
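To make the source, transforms and destination structure of a job concrete, here is a minimal Python sketch. It is only an analogy, not DataStage code or an IBM API: the `Job` class, the sample rows and the `trim_name` and `title_name` functions are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class Job:
    """Hypothetical stand-in for a designed job: source -> transforms -> target."""
    source: Callable[[], Iterable[Row]]
    target: Callable[[Row], None]
    transforms: List[Callable[[Row], Row]] = field(default_factory=list)

    def run(self) -> int:
        """Pull each row from the source, apply every transform in order,
        hand the result to the target, and return the number of rows written."""
        written = 0
        for row in self.source():
            for transform in self.transforms:
                row = transform(row)
            self.target(row)
            written += 1
        return written


def read_customers() -> Iterable[Row]:
    # Hypothetical source: in a real job this would be a file or database stage.
    yield {"id": 1, "name": "  alice "}
    yield {"id": 2, "name": "BOB"}


def trim_name(row: Row) -> Row:
    # Cleansing step: remove surrounding whitespace.
    return {**row, "name": str(row["name"]).strip()}


def title_name(row: Row) -> Row:
    # Transformation step: normalise capitalisation.
    return {**row, "name": str(row["name"]).title()}


loaded: List[Row] = []  # hypothetical destination, standing in for a target table
job = Job(source=read_customers, target=loaded.append,
          transforms=[trim_name, title_name])
rows_written = job.run()
```

After `job.run()`, `loaded` holds the cleansed rows in source order. In the Designer the same three parts are drawn as linked stages on a canvas instead of being written as functions.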

Director
As mentioned earlier, DataStage Director provides an interface for scheduling the executable programs formed by the compilation of jobs. It validates, runs, schedules and monitors both server jobs and parallel jobs. The main users of this interface are testers and operators.
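The run and validate actions can also be scripted. The sketch below assumes the `dsjob` command-line tool that is installed with the DataStage engine, and only builds the argument list without executing anything. The helper `build_dsjob_run`, the project name `dstage_demo`, the job name `LoadCustomers` and the `RUN_DATE` parameter are hypothetical; check the `dsjob` options against the documentation for your version before relying on them.

```python
from typing import Dict, List, Optional

# Run modes assumed here; other modes may exist depending on the version.
ALLOWED_MODES = {"NORMAL", "RESET", "VALIDATE"}


def build_dsjob_run(project: str, job: str, mode: str = "NORMAL",
                    params: Optional[Dict[str, str]] = None,
                    wait_for_status: bool = True) -> List[str]:
    """Build (but do not execute) a `dsjob -run` argument list."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    argv = ["dsjob", "-run", "-mode", mode]
    for name, value in (params or {}).items():
        # Each job parameter is passed as its own -param name=value pair.
        argv += ["-param", f"{name}={value}"]
    if wait_for_status:
        # -jobstatus waits for the job to finish and reports its status.
        argv.append("-jobstatus")
    argv += [project, job]
    return argv


# Hypothetical project and job names, for illustration only.
validate_cmd = build_dsjob_run("dstage_demo", "LoadCustomers",
                               mode="VALIDATE",
                               params={"RUN_DATE": "2024-01-31"})
print(" ".join(validate_cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where `dsjob` is on the path. Building the list separately keeps the command easy to inspect before anything is run.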

Note: In later versions of DataStage (8.x onward), the Manager component is merged into the DataStage Designer.

Server Components: DataStage has the following server components:

Engine tier
The engine tier includes the logical group of components (the InfoSphere Information Server engine components, service agents, and so on) and the computer where those components are installed. The engine runs jobs and other tasks for product modules.

Services tier
The services tier includes the application server, common services, and product services for the suite and product modules, and the computer where those components are installed. The services tier provides common services (such as metadata and logging) and services that are specific to certain product modules. The services tier also hosts InfoSphere Information Server applications that are web-based.

Metadata repository tier
The metadata repository tier includes the metadata repository. The metadata repository contains the shared metadata, data, and configuration information for InfoSphere Information Server product modules.


