Header & Footer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Header & Footer

Post by karthi_gana »

All,
How do I add a header and footer to a text file? Say, for example, I have the file below.

fnd_id fnd_symbol mth_end_dt return
1 QQQ 31/jan/2010 2.54334
1 QQQ 28/feb/2010 1.21314
1 QQQ 31/mar/2010 0.3425
1 QQQ 30/apr/2010 1.78658
1 QQQ 31/may/2010 0.11231
1 QQQ 30/jun/2010 0.11232
5 RRR 31/jan/2010 0.54334
5 RRR 28/feb/2010 0.21314
5 RRR 31/mar/2010 0.3425
5 RRR 30/apr/2010 0.78658
5 RRR 31/may/2010 0.11231
5 RRR 30/jun/2010 0.11232

Expected Output:

LHDR|20110122 -- current date
QQQ|31/jan/2010|2.54334
QQQ|28/feb/2010|1.21314
QQQ|31/mar/2010|0.3425
QQQ|30/apr/2010|1.78658
QQQ|31/may/2010|0.11231
...
...
...
RRR|30/jun/2010|0.11232
UHDR|10 -- RowCount from the file
Karthik
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Typically, create three files and then cat them together in the proper order after the job. That, or build the proper output in three streams and funnel them together at the end.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

I did a search here and learned that I should create 3 files separately:

1) HEADER
2) DETAIL
3) FOOTER

But I don't know: do I need to create 3 jobs to create these 3 files?
Karthik
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Header & Footer

Post by jwiles »

You could use three jobs, but you do not need to. As already mentioned, in one job you can either:

1) Create all three files
2) Create only the final file (with header and trailer)

The logic to create the header and trailer records will be identical in either case; the difference is whether the job outputs one file or three. If three, you will need to combine them afterwards to create the final file.

There are two major disadvantages to creating three separate files:

1) You will temporarily need twice the amount of space to store the output data until the files have been combined, and
2) Time will be required to combine the files as you are simply copying the data into another file.

If you're dealing with high volumes of data, these may become an issue.

Allowing the job to create the final file requires only a little more logic than creating three separate files and wouldn't have the disadvantages listed above.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

How should I proceed with the second method?
Karthik
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

karthi_gana wrote: How should I proceed with the second method?
I have tried the below option.

Code: Select all

         

                 sequential file (header)
                          ^
                          |
ODBC --> transformer -----+-----> sequential file (detail)
                          |
                          v
                 sequential file (footer)

The sequential file (header) has two columns:

header1 varchar(5) -- 'UHDR'
header2 varchar(12) -- CurrentDate()

In the Transformer stage, I used a constraint @INROWNUM = 1 just to store one row. But the issue is, I see three rows in the header file. How?
Karthik
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

How many nodes are you running the job on?
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

chulett wrote:How many nodes are you running the job on?
Can you tell me how I should find it?
Karthik
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

karthi_gana wrote:
chulett wrote: How many nodes are you running the job on?
Can you tell me how I should find it?
Check the job log in DataStage Director.
Cheers
Ravi K
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Ravi.K wrote:
karthi_gana wrote:
chulett wrote: How many nodes are you running the job on?
Can you tell me how I should find it?
Check the job log in DataStage Director.
I don't see it in the Director log.
Karthik
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Header & Footer

Post by jwiles »

karthi_gana wrote: How should I proceed with the second method?
One way is the following:

For the header: A Row generator to create a single header record. The date can be parameterized, or you can follow the rowgen by a transformer to populate with CurrentDate().

For the footer: Output a second link with only an integer (value 1) to an Aggregator to count rows (or sum the integer). Either run this Aggregator in sequential mode, or follow it with another Aggregator in sequential mode to sum the per-partition row counts. Follow with a Column Generator or Transformer to format the trailer record.

For the data: Create as you do now.

All three: If the header/footer must have a different number of columns (and hence column separators) than the data, place a Column Export stage in all three streams. Use these to export the columns into a single varchar column containing your final output record, formatted with separators, etc.
If the header/footer can have the same layout as the data, simply create the additional columns to match the data record layout.

Funnel the three record types together using a Sequence funnel, which allows you to choose the order in which the input links are processed. Process the links in Header, Data, Footer sequence.

Follow the sequence funnel with your Sequential File stage.

Code: Select all

(Header)  RowGen-->[Tform]-->[Col Export]----------
                                                  |
                                                  \/
(Data)    ODBC---->Tform---->[Col Export]------->Funnel---->SeqFile
                   |                              /\
                   \/                             |
(Footer)          Agg-->[Agg]-->ColGen/Tform-->[Col Export]

[stage] indicates the optional stages (see above text).
For header stream, all stages should run in sequential mode.
For footer stream, all stages from final Aggregator on should run in sequential mode.
Funnel should run in sequential mode.

The important parts are: 1) generating only one header and one footer row (the reason several stages run in sequential mode); 2) the record schemas must match at the Funnel, or you will drop columns; 3) the Funnel must process the links in the correct order.

I hope this description makes sense to you. If not, or you're uncomfortable with the logic, the three-file method will certainly work and you won't have to worry about record formats matching prior to the funnel.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
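The single-job design above can be sketched, outside DataStage, as one pass that writes the header, streams the detail rows while counting them, and then appends the trailer with the row count. The LHDR/UHDR record formats follow the example at the top of the thread; everything else is illustrative.

```python
# Sketch (plain Python, not DataStage) of the single-pass idea:
# header first, detail rows counted as they stream through, trailer last.
from datetime import date

def write_with_header_footer(rows, target):
    count = 0
    with open(target, "w") as out:
        out.write(f"LHDR|{date.today():%Y%m%d}\n")  # header record, current date
        for row in rows:                            # detail records
            out.write("|".join(row) + "\n")
            count += 1
        out.write(f"UHDR|{count}\n")                # trailer record with row count

write_with_header_footer(
    [("QQQ", "31/jan/2010", "2.54334"), ("RRR", "30/jun/2010", "0.11232")],
    "final.txt",
)
```

In the DataStage job, the sequence Funnel plays the role of the fixed header/data/footer ordering, and the sequential-mode Aggregator plays the role of the single row counter.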
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

karthi_gana wrote:
Ravi.K wrote:
karthi_gana wrote: Can you tell me how I should find it?
Check the job log in DataStage Director.
I don't see it in the Director log.

Through the Administrator and the "APT_CONFIG_FILE" environment variable you can find the configuration file name.

Go to DataStage Manager --> Tools --> Configurations --> search for the appropriate configuration file name.

NOTE: The configuration file can also be overridden at the job level, so check the job properties from that angle as well.
Cheers
Ravi K
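As a rough cross-check (a sketch, not a supported API): once you know which file APT_CONFIG_FILE points at, the node count is simply the number of node "..." entries in that file.

```python
# Count the node definitions in a PX configuration file's text.
# The { node "name" { ... } ... } layout is the standard config format;
# the parsing here is deliberately naive and illustrative only.
import re

def node_count(config_text):
    """Return how many node "..." entries appear in the config text."""
    return len(re.findall(r'\bnode\s+"', config_text))

two_node_config = '{ node "node1" { } node "node2" { } }'
print(node_count(two_node_config))  # 2
```

The authoritative answer is still the job log, which echoes the configuration file contents at the start of each run.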
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

karthi_gana wrote: I don't see it in the Director log.
The job's log will show, early in each run, the configuration file the job ran with.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That's pretty fundamental PX stuff and should be something you just know. So, you have no idea how many nodes are defined in your default config? Or how to override that if you need to? Is there someone there who can enlighten you? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Hi chulett,

Actually, I am creating a parallel job for the first time, so I have a lot of doubts and questions.
Karthik