Types of parallelism in detail.

There are three types of parallelism in Ab Initio:
1) Data parallelism:
The data is divided into segments, and the segments are processed at the same time, often on different servers.
2) Pipeline parallelism:
The records are processed in a pipeline, i.e. a component does not have to wait for the upstream component to finish all of its records. Each record that has been processed is passed straight to the next component in the pipeline.
3) Component parallelism:
Two or more components process records from separate data at the same time.

Component parallelism:-
        A graph with multiple processes running simultaneously on separate data uses component parallelism.
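
As a rough illustration outside Ab Initio, the idea can be sketched in Python. The two functions and the data sets below are hypothetical stand-ins for two unrelated branches of a graph, and threads stand in for the separate processes a real graph would use:

```python
from concurrent.futures import ThreadPoolExecutor


def sort_customers(customers):
    # Hypothetical branch 1: works only on the customer data.
    return sorted(customers)


def total_orders(orders):
    # Hypothetical branch 2: works only on the order data.
    return sum(orders)


def run_branches(customers, orders):
    # Both branches are submitted together and neither reads the
    # other's data, so neither has to wait for the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        sorted_future = pool.submit(sort_customers, customers)
        total_future = pool.submit(total_orders, orders)
        return sorted_future.result(), total_future.result()
```

Calling run_branches(["bob", "alice"], [10, 20, 5]) returns the sorted customer list together with the order total. The point of the sketch is only that the two branches share no data.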

Data parallelism:-
        A graph that divides its data into segments and operates on each segment simultaneously uses data parallelism. Nearly all commercial data processing tasks can use data parallelism. To support this form of parallelism, Ab Initio provides Partition components to segment the data and Departition components to merge the segmented data back together.
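
The partition, process, departition cycle can be sketched in Python. This is a minimal sketch under assumptions, not Ab Initio code: the helper names are hypothetical, round-robin is just one possible way to partition, and threads stand in for the separate partitions or servers:

```python
from concurrent.futures import ThreadPoolExecutor


def partition_round_robin(records, n):
    # Deal the records out to n segments in turn (hypothetical helper).
    return [records[i::n] for i in range(n)]


def transform(segment):
    # The same logic runs on every segment; here it doubles each value.
    return [record * 2 for record in segment]


def departition_concatenate(segments):
    # Merge the processed segments back into one flow, segment by segment.
    return [record for segment in segments for record in segment]


def run_data_parallel(records, n=3):
    segments = partition_round_robin(records, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # One worker per segment; map() keeps the segments in order.
        processed = list(pool.map(transform, segments))
    return departition_concatenate(processed)
```

Note that concatenating the segments does not restore the original record order. If order matters, the departition step has to merge on a key instead.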

Pipeline parallelism:-
        A graph with multiple components running simultaneously on the same data uses pipeline parallelism. Each component in the pipeline continuously reads from upstream components, processes data, and writes to downstream components. Since a downstream component can process records previously written by an upstream component, both components can operate in parallel.
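
A minimal Python sketch of the same idea, assuming two hypothetical stages connected by queues. Each stage runs in its own thread and hands a record downstream as soon as it is done with it, so the second stage starts working before the first has seen all the input:

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the record flow


def reformat(inbox, outbox):
    # Hypothetical upstream stage: upper-cases each record.
    while True:
        record = inbox.get()
        if record is DONE:
            outbox.put(DONE)
            return
        outbox.put(record.upper())


def filter_short(inbox, outbox):
    # Hypothetical downstream stage: drops records of 3 characters or fewer.
    while True:
        record = inbox.get()
        if record is DONE:
            outbox.put(DONE)
            return
        if len(record) > 3:
            outbox.put(record)


def run_pipeline(records):
    # Small bounded queues force the stages to overlap instead of
    # letting one stage buffer the whole input first.
    q_in, q_mid, q_out = queue.Queue(maxsize=2), queue.Queue(maxsize=2), queue.Queue()
    stages = [
        threading.Thread(target=reformat, args=(q_in, q_mid)),
        threading.Thread(target=filter_short, args=(q_mid, q_out)),
    ]
    for stage in stages:
        stage.start()
    for record in records:
        q_in.put(record)
    q_in.put(DONE)
    for stage in stages:
        stage.join()
    results = []
    while True:
        item = q_out.get()
        if item is DONE:
            return results
        results.append(item)
```

Because each stage is a single thread reading from a FIFO queue, the records leave the pipeline in the order they entered it.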

NOTE: To limit the number of components running simultaneously, divide the graph into phases. The components in one phase must finish before the next phase starts.
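
The effect of phases can be sketched in Python, assuming that a phase runs to completion before the next one begins. The run_in_phases helper and the callables passed to it are hypothetical stand-ins for graph components:

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_phases(phases):
    # phases is a list of non-empty lists of zero-argument callables.
    results = []
    for phase in phases:
        # Only the components of the current phase run at the same time.
        with ThreadPoolExecutor(max_workers=len(phase)) as pool:
            futures = [pool.submit(component) for component in phase]
            # Collecting every result here acts as the phase boundary:
            # the loop does not move on until the whole phase is done.
            results.append([future.result() for future in futures])
    return results
```

With three components split into a phase of two and a phase of one, at most two of them are ever active at once.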




