How to fetch just first and last record from DataStage sequential file?

2 comments:

  1. 1. Read sequential file in parallel.

    2. Enable "ROW NUMBER COLUMN" property of sequential file stage. This will give us the unique number of type BigInt into a separate column. Now name this column as ROWNUM. Remember user has to define this column into the output schema as this can be omitted in RCP is enabled into the design.

    3. Now output of our sequential file stage would go into a transformer. Transformer stage will have two output links where a constraint will be applied on one input to only allow record where ROWNUM = 1. This will give us the first record of the file.

    4. Now the second output of the transformer will be the input for a copy stage where an in-line sort will be performed on ROWNUM column in descending order. You can change the copy stage to run into sequential mode otherwise it would also work into parallel. You can drop the ROWNUM column here.

    5. The sorted output of the copy stage would now be the input of another transformer stage where a stage variable StageVar will be defined as Integer (with 0 as initial value) and would be assigned by StageVar + 1 value. This will assign unique number to the records being passed from the transformer. A constraint will be defined in this transformer stage to filter only those records where StageVar = 1. This will give us the last record from the file.

    6. Both the output from step 3 and 6 will be the input for funnel stage which will ultimately populating the any stage to store the first and last record from a sequential file.

    ReplyDelete
  2. See detailed answer @ http://geekarticles.blogspot.com/2011/01/datastage-how-to-fetch-first-and-last.html

    ReplyDelete