Header & Footer
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
Re: Header & Footer
jwiles,
Thanks for your detailed reply. I will try it tomorrow morning. Time is 11 PM...going to sleep now...
Karthik
I have a single data source and must create two output files, both needing a header and trailer. I do it all in one job with no intermediate files.
For each output file job stream, I split the data into the three record types (I must use the original input to set some of the columns in the header and trailer). Each split ends with a transformer that concatenates all the columns into one fixed-length Char column. The final stage before creating the file is a Funnel, type Sequence, and I set the link order to header-details-trailer. It was quite simple once I understood the coding requirements at each point.
In case anyone thinks to ask: My source is in XML format. It has two sets of repeating sub-tags, which must be split into separate destinations. My output is fixed-length as required by the next process that uses it.
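As a rough model of the flow described above, here is a small Python sketch (plain Python, not DataStage code). Each record type is flattened into one fixed-length string, and the three streams are then appended in header-details-trailer order, the way a funnel with that link order would deliver them. The record length, the `H`/`D`/`T` prefixes, and the `id`/`name` columns are all made up for illustration and are not taken from the actual job.

```python
from itertools import chain

RECORD_LEN = 40  # hypothetical fixed record length, not from the original job


def fixed(text, width=RECORD_LEN):
    """Pad with spaces (or truncate) so the result is exactly `width` characters."""
    return text[:width].ljust(width)


def build_file(batch_id, rows):
    """Return header, detail and trailer records in file order."""
    # Each "split" flattens its columns into one fixed-length string.
    header = [fixed(f"H{batch_id:<10}")]
    details = [fixed(f"D{row['id']:<8}{row['name']:<20}") for row in rows]
    # The trailer is derived from the same input (here, just the row count).
    trailer = [fixed(f"T{len(rows):08d}")]
    # Ordered funnel: all of link 1, then all of link 2, then all of link 3.
    return list(chain(header, details, trailer))


records = build_file(
    "B001",
    [{"id": "1", "name": "alpha"}, {"id": "2", "name": "beta"}],
)
```

Every element of `records` has the same length, and the first character of each record shows the order `H`, `D`, `D`, `T`.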
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
Re: Header & Footer
karthi_gana wrote: jwiles, Thanks for your detailed reply. I will try it tomorrow morning. Time is 11 PM...going to sleep now...
I tried your method. It is working almost fine except the aggregator stage.
Because at least one column must be specified under 'Properties --> Grouping Keys --> Group', I am getting a row count for each group (which I don't want).
I just want to show the total number of rows in the footer.
I set the following properties:
Aggregation Type = Count Rows
Count Output Column = row_cnt (which I created in the output columns)
I would like to get a single row from the Aggregator, i.e. only the total row count.
Karthik
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
Re: Header & Footer
I sent a single non-null field (just a hardcoded value) to the Aggregator to get the total row count.
It gives the total row count. Is there any other way to do this? Am I using a kind of shortcut to get the total row count?
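To see why the hardcoded column works, here is a plain Python model of a count-rows-per-group aggregation (not DataStage code; the `region`, `amt` and `dummy` column names are invented for the example). Grouping on a real column gives one count per distinct value; grouping on a column that holds the same constant in every row leaves only one group, so the single output row is the total.

```python
from collections import Counter


def count_rows_by_key(rows, key_column):
    """Count rows per distinct value of key_column (one grouping key)."""
    return dict(Counter(row[key_column] for row in rows))


rows = [
    {"region": "N", "amt": 10},
    {"region": "S", "amt": 5},
    {"region": "N", "amt": 7},
]

# Grouping on a real column: one count per group.
per_group = count_rows_by_key(rows, "region")

# Add a hardcoded, non-null column and group on that instead:
# every row falls into the same group, so one row comes out.
with_constant = [{**row, "dummy": 1} for row in rows]
total = count_rows_by_key(with_constant, "dummy")
```

Here `per_group` is `{"N": 2, "S": 1}` and `total` is `{1: 3}`. In a parallel job the rows would also have to reach a single partition for exactly one output row to appear; that part is outside this sketch.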
Karthik
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
For Header
Column_Export Stage
Execution Mode --> Sequential
Preserve partitioning --> Propagate
For Footer
Column_Export Stage
Execution Mode --> Sequential
Preserve partitioning --> Propagate
For Detail
Column_Export Stage
Execution Mode --> Parallel
Preserve partitioning --> Set
Is it correct?
Karthik
Those settings seem appropriate for the logic I suggested, so long as the Funnel Type is Sequence (processing input links in order) and Execution Mode is sequential (not parallel). That will be necessary in order to maintain the correct Header-Detail-Trailer order in your output file.
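The ordering point can be illustrated with a simplified Python model (an assumption-laden sketch, not how the Funnel stage is implemented). `sequence_funnel` drains each input link completely in link order; `interleaving_funnel` is a hypothetical stand-in for any merge that takes rows from the links in turn and therefore makes no promise about link order.

```python
from itertools import chain, zip_longest


def sequence_funnel(*links):
    """Drain each input link completely, in link order."""
    return list(chain(*links))


def interleaving_funnel(*links):
    """Take one row from each link in turn; link order is not preserved."""
    missing = object()  # sentinel for links that have run out of rows
    return [
        row
        for group in zip_longest(*links, fillvalue=missing)
        for row in group
        if row is not missing
    ]


header = ["HDR"]
details = ["D1", "D2", "D3"]
trailer = ["TRL"]

ordered = sequence_funnel(header, details, trailer)
mixed = interleaving_funnel(header, details, trailer)
```

`ordered` comes out as `["HDR", "D1", "D2", "D3", "TRL"]`, while `mixed` puts the trailer in the middle of the details: `["HDR", "D1", "TRL", "D2", "D3"]`.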
Actual performance of a job design is hard to predict, especially in a parallel environment. There are many factors that affect performance, covering everything from job design, data sources and targets to system configuration and processing load. If properly implemented, the design I proposed should perform relatively well on a well-managed and configured server. Keep in mind that this is only one method of accomplishing what you need. There are other methods that would work.
If you haven't already, you should invest some time in training for parallel job development. Your employer should be willing to offer something, either directly or through a third party. Also, following Ray's question, co-workers or others you know who already have experience are a great resource.
Good luck!
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
In my team, I am the one and only person working on DataStage. There is one admin team and some developers in another business group. They are all supporting other projects. It is very hard to get input from them.
I have started reading the Parallel Job guide and the other posts here. I hope I can get some good input here.
Karthik
karthi_gana wrote: In my team, I am the one and only person working on DataStage. There is one admin team and some developers in another business group. They are all supporting other projects. It is very hard to get input from them.
Very understandable, if unfortunate. Everyone has their own job to do!
karthi_gana also wrote: I have started reading the Parallel Job guide and the other posts here. I hope I can get some good input here.
There's a lot of good info available here. It has helped me a lot over the years just searching for information.
There is a parallel job tutorial that is included with the documentation. Also, IBM has a redbook or two about parallel job design. There are links available in a recent post (just search for redbook in this forum), and/or you can Google for parallel job design redbook. The PDFs are free for download and are an excellent resource.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Premium Member
- Posts: 729
- Joined: Tue Apr 28, 2009 10:49 pm
karthi_gana wrote: I received an email from my ADMIN team as above yesterday. "Using aggregator in the job will hit performance and also involves more job design. If you expect to have this functionality in more number of jobs, it will be easier to use routines." is it true?
This is a more or less generic statement based on limited knowledge of how to use the aggregator stage and of proper parallel job design, and the statement accomplishes nothing but spreading FUD: Fear, Uncertainty and Doubt.
An aggregator, when improperly used, can hurt performance. So can the sort, transformer, join and many other stages--again when improperly used. As to the comment "if you expect to have....it will be easier to use routines": is the admin referring to Basic routines? While the Basic transformer can run in parallel, using it will also hurt performance in a parallel job.
Regarding the comment "involves more job design", I'm certain several of us here on the forum can present examples of how using an aggregator greatly simplifies parallel job design. The design that has been suggested to you is one where the aggregator simplifies the design and the particular aggregation types specified (Count Rows and Sum) would not hurt performance. What was suggested was designed to be an efficient method to accomplish what you requested.
Regards,
- james wiles
All generalizations are false, including this one - Mark Twain.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
karthi_gana wrote: I received an email from my ADMIN team as above yesterday. is it true?
We believe you. So it is true (for some value of true anyway) that you received an email from your ADMIN team yesterday.
As to its assertion, I would be asking for proof. Surely it would depend on who wrote the routine. A well-constructed Aggregator working with properly sorted and partitioned data is a very efficient beast indeed.
Never generalize.
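One reason sorted input helps, shown as a plain Python sketch rather than anything about DataStage internals: when rows arrive already sorted on the grouping key, an aggregation can emit each group as soon as the key changes and only needs to hold one group's running totals at a time. The `acct` and `amt` column names are hypothetical; the count and sum mirror the Count Rows and Sum aggregations mentioned earlier in the thread.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows, key, value):
    """Yield (key, row_count, value_sum) for rows already sorted on `key`.

    Only the current group's running totals are kept in memory.
    """
    for group_key, group in groupby(rows, key=itemgetter(key)):
        count = 0
        total = 0
        for row in group:
            count += 1
            total += row[value]
        yield group_key, count, total


unsorted_rows = [
    {"acct": "B", "amt": 5},
    {"acct": "A", "amt": 10},
    {"acct": "A", "amt": 7},
]

# groupby only merges adjacent rows, so the input must be sorted on the key.
rows = sorted(unsorted_rows, key=itemgetter("acct"))
result = list(aggregate_sorted(rows, "acct", "amt"))
```

With the sample rows, `result` is `[("A", 2, 17), ("B", 1, 5)]`. Feeding the unsorted rows instead would still run, but a key that reappears later would be reported as a separate group, which is why the sort matters.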
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.