Header & Footer

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Header & Footer

Post by karthi_gana »

All,
How do I add a header and footer to a text file? Say, for example, I have the file below.

fnd_id fnd_symbol mth_end_dt return
1 QQQ 31/jan/2010 2.54334
1 QQQ 28/feb/2010 1.21314
1 QQQ 31/mar/2010 0.3425
1 QQQ 30/apr/2010 1.78658
1 QQQ 31/may/2010 0.11231
1 QQQ 30/jun/2010 0.11232
5 RRR 31/jan/2010 0.54334
5 RRR 28/feb/2010 0.21314
5 RRR 31/mar/2010 0.3425
5 RRR 30/apr/2010 0.78658
5 RRR 31/may/2010 0.11231
5 RRR 30/jun/2010 0.11232

Expected Output:

LHDR|20110122 -- current date
QQQ|31/jan/2010|2.54334
QQQ|28/feb/2010|1.21314
QQQ|31/mar/2010|0.3425
QQQ|30/apr/2010|1.78658
QQQ|31/may/2010|0.11231
...
...
...
RRR|30/jun/2010|0.11232
UHDR|10 -- RowCount from the file
Karthik
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Typically, create three files and then cat them together in the proper order after the job. That, or build the proper output in three streams and funnel them together at the end.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

I did a search here and learned that I should create 3 files separately:

1) HEADER
2) DETAIL
3) FOOTER

But I don't know: do I need to create 3 jobs to create these 3 files?
Karthik
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Header & Footer

Post by jwiles »

You could use three jobs, but you do not need to. As already mentioned, in one job you can either:

1) Create all three files
2) Create only the final file (with header and trailer)

The logic to create the header and trailer records will be identical in either case; the difference is whether the job outputs one file or three. If three, you will need to combine them afterwards to create the final file.

There are two major disadvantages to creating three separate files:

1) You will temporarily need twice the amount of space to store the output data until the files have been combined, and
2) Time will be required to combine the files as you are simply copying the data into another file.

If you're dealing with high volumes of data, these may become an issue.

Allowing the job to create the final file requires only a little more logic than creating three separate files and wouldn't have the disadvantages listed above.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

How should I proceed with the second method?
Karthik
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Re: Header & Footer

Post by karthi_gana »

karthi_gana wrote: How should I proceed with the second method?
I have tried the below option.

Code: Select all

         

                 sequential file (header)
                          ^
                          |
ODBC --> transformer -----+-----> sequential file (detail)
                          |
                          v
                 sequential file (footer)

The sequential file (header) has two columns:

header1 varchar(5) -- 'UHDR'
header2 varchar(12) -- CurrentDate()

In the Transformer stage, I used a constraint @INROWNUM = 1 just to store one row. But the issue is, I see three rows in the header file. How?
Karthik
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

How many nodes are you running the job on?
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

chulett wrote:How many nodes are you running the job on?
Can you tell me how I should find it?
Karthik
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

karthi_gana wrote:
chulett wrote: How many nodes are you running the job on?
Can you tell me how I should find it?
Check the job log in DataStage Director.
Cheers
Ravi K
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Ravi.K wrote:
karthi_gana wrote:
chulett wrote: How many nodes are you running the job on?
Can you tell me how I should find it?
Check the job log in DataStage Director.
I don't see it in the Director log.
Karthik
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Re: Header & Footer

Post by jwiles »

karthi_gana wrote: How should I proceed with the second method?
One way is the following:

For the header: A Row generator to create a single header record. The date can be parameterized, or you can follow the rowgen by a transformer to populate with CurrentDate().

For the footer: Output a second link with only an integer (value 1) to an Aggregator to count rows (or sum the integer). Either run this Aggregator in sequential mode, or follow it with another Aggregator in sequential mode to sum the per-partition row counts. Follow with a Column Generator or Transformer to format the trailer record.

For the data: Create as you do now.

All three: If the header/footer must have a different number of columns (and hence column separators) than the data, place a Column Export stage in all three streams. Use these to export the columns into a single varchar column containing your final output record, formatted with separators, etc.
If the header/footer can have the same layout as the data, simply create the additional columns to match the data record layout.

Funnel the three record types together using a Sequence funnel, which allows you to choose the order in which the input links are processed. Process the links in Header, Data, Footer sequence.

Follow the sequence funnel with your Sequential File stage.

Code: Select all

(Header)  RowGen-->[Tform]-->[Col Export]----------
                                                  |
                                                  \/
(Data)    ODBC---->Tform---->[Col Export]------->Funnel---->SeqFile
                   |                              /\
                   \/                             |
(Footer)          Agg-->[Agg]-->ColGen/Tform-->[Col Export]

[stage] indicates the optional stages (see above text).
For header stream, all stages should run in sequential mode.
For footer stream, all stages from final Aggregator on should run in sequential mode.
Funnel should run in sequential mode.

The important parts are: 1) generating only one header and one footer row (the reason several stages run in sequential mode); 2) the record schemas must match at the Funnel, or you will drop columns; 3) the Funnel must process the links in the correct order.

I hope this description makes sense to you. If not, or you're uncomfortable with the logic, the three-file method will certainly work and you won't have to worry about record formats matching prior to the funnel.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
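The single-job design above can be sketched, outside DataStage, as one pass that writes the header, streams the detail rows while counting them, and then appends the trailer with the row count. The LHDR/UHDR record formats follow the example at the top of the thread; everything else is illustrative.

```python
# Sketch (plain Python, not DataStage) of the single-pass idea:
# header first, detail rows counted as they stream through, trailer last.
from datetime import date

def write_with_header_footer(rows, target):
    count = 0
    with open(target, "w") as out:
        out.write(f"LHDR|{date.today():%Y%m%d}\n")  # header record, current date
        for row in rows:                            # detail records
            out.write("|".join(row) + "\n")
            count += 1
        out.write(f"UHDR|{count}\n")                # trailer record with row count

write_with_header_footer(
    [("QQQ", "31/jan/2010", "2.54334"), ("RRR", "30/jun/2010", "0.11232")],
    "final.txt",
)
```

In the DataStage job, the sequence Funnel plays the role of the fixed header/data/footer ordering, and the sequential-mode Aggregator plays the role of the single row counter.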
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

karthi_gana wrote:
Ravi.K wrote:
karthi_gana wrote: Can you tell me how I should find it?
Check the job log in DataStage Director.
I don't see it in the Director log.

Through the Administrator and the "APT_CONFIG_FILE" environment variable you can find the configuration file name.

Go to DataStage Manager --> Tools --> Configurations --> search for the appropriate configuration file name.

NOTE: The configuration file can also be overridden at the job level, so check the job properties from that angle as well.
Cheers
Ravi K
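As a rough cross-check (a sketch, not a supported API): once you know which file APT_CONFIG_FILE points at, the node count is simply the number of node "..." entries in that file.

```python
# Count the node definitions in a PX configuration file's text.
# The { node "name" { ... } ... } layout is the standard config format;
# the parsing here is deliberately naive and illustrative only.
import re

def node_count(config_text):
    """Return how many node "..." entries appear in the config text."""
    return len(re.findall(r'\bnode\s+"', config_text))

two_node_config = '{ node "node1" { } node "node2" { } }'
print(node_count(two_node_config))  # 2
```

The authoritative answer is still the job log, which echoes the configuration file contents at the start of each run.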
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

karthi_gana wrote: I don't see it in the Director log.
The job's log will show, early in each run, the configuration file the job ran with.

Regards,
- james wiles


All generalizations are false, including this one - Mark Twain.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

That's pretty fundamental PX stuff and should be something you just know. So, you have no idea how many nodes are defined in your default config? Or how to override that if you need to? Is there someone there who can enlighten you? :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
karthi_gana
Premium Member
Posts: 729
Joined: Tue Apr 28, 2009 10:49 pm

Post by karthi_gana »

Hi chulett,

Actually, I am creating a parallel job for the first time, so I have a lot of doubts and questions.
Karthik