What is the difference between the Lookup, Join and Merge stages?

1 comment:

  1. Lookup - lets you reference data without sorting it first. It works best when the reference data is small enough to be held in memory; in that scenario it can give the job a performance boost. Like the other two stages, it is used to combine information from two different sources.

    Join - also combines information from two or more sources, but its primary requirement is that the inputs be sorted before they are joined. That sorting can sometimes be an overhead for the job, and it is the main thing that distinguishes this stage from Lookup.

    Merge - as the name suggests, this stage merges two or more sources, and it also needs sorted input. Its advantage over the Join stage is that it can have reject links, which are not possible in the Join stage; they are possible in the Lookup stage as well.
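
To make the contrast concrete, here is a rough Python sketch of the three behaviours described above. It is an analogy, not DataStage code: the function names, the `(key, value)` row shape, and the sample data are all made up for illustration, keys are assumed unique, and dropping unmatched master rows in the merge is a simplification of this sketch.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # hypothetical row shape: (key, value)


def lookup(stream: List[Row], reference: List[Row]):
    """Lookup-style: reference data is held in memory, no sorting needed."""
    ref = dict(reference)  # in-memory table, so the reference must be small
    matched, rejects = [], []
    for key, val in stream:
        if key in ref:
            matched.append((key, val, ref[key]))
        else:
            rejects.append((key, val))  # rows with no match go to a reject list
    return matched, rejects


def sorted_join(left: List[Row], right: List[Row]):
    """Join-style inner join: both inputs must already be sorted on the key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk == rk:
            out.append((lk, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif lk < rk:
            i += 1  # no reject output: unmatched rows are simply skipped
        else:
            j += 1
    return out


def sorted_merge(master: List[Row], update: List[Row]):
    """Merge-style: sorted inputs, unmatched update rows are rejected."""
    out, rejects = [], []
    i = j = 0
    while i < len(master) and j < len(update):
        mk, uk = master[i][0], update[j][0]
        if mk == uk:
            out.append((mk, master[i][1], update[j][1]))
            i += 1
            j += 1
        elif mk < uk:
            i += 1  # simplification: unmatched master rows are dropped here
        else:
            rejects.append(update[j])
            j += 1
    rejects.extend(update[j:])  # leftover update rows had no master
    return out, rejects


# Made-up sample data: orders arrive unsorted, customers are sorted by key.
orders = [(3, "c"), (1, "a"), (4, "d")]
customers = [(1, "Ann"), (3, "Cy"), (5, "Eve")]

lookup_rows, lookup_rejects = lookup(orders, customers)
join_rows = sorted_join(sorted(orders), customers)
merge_rows, merge_rejects = sorted_merge(sorted(orders), customers)
```

Note that only `lookup` accepts `orders` as it arrives; the other two need the explicit `sorted(...)` call first, which is the overhead mentioned above for Join and Merge.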
