
Posted: Mon Mar 20, 2006 2:48 pm
by DSguru2B
Can you give some sample data?
As in, does the header repeat for each Data1?

Posted: Mon Mar 20, 2006 2:56 pm
by Gaurav.Dave
Thanks for your quick response.

There will be a distinct header for each subset.

For example, in a single file, I will be getting:

header1 ---------------> "1Q05", "2Q05", "3Q05", "4Q05"
underlying Data1------>
header2----------------->"1Q06", "2Q06", "3Q06", "4Q06"
underlying Data2------->


Data1 and Data2 will have different contents, and I am expecting about 100k rows.

Thanks,
Gaurav

Posted: Mon Mar 20, 2006 3:04 pm
by DSguru2B
Is the number of headers going to be static, or will it change? Also, if you can, please provide a snippet of the original data.

Posted: Mon Mar 20, 2006 3:28 pm
by Gaurav.Dave
Well, the number of header records will not be fixed; it will change.

Here is some sample data from the file:


1Q05 2Q05 3Q05 4Q05
06TGT Client Team TELE OO Fed/Exce GMR PUI 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442
06TGT Client Team TELE OO Fed/Exce GMR GS 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442

1Q06 2Q06 3Q06 4Q06
0625TGT Op Ident OO Unassigned Fed/Exce GMR 5S CR Leads 686.801830 925.533652 940.060153 1605.605841
0625TGT Op Ident OO Unassigned Fed/Exce GMR CC CR Leads 160.566881 228.354777 106.455053 184.601798

Posted: Mon Mar 20, 2006 4:24 pm
by martin
Hi,

Read the data as a single column.
Write to 2 output links.
Output link 1: constraint on the substring, Col[1,5] = '06TGT'
Output link 2: constraint on the substring, Col[1,7] = '0625TGT'
With this you can create 2 separate files.

Good luck
Martin
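
For anyone who wants to try the idea outside DataStage first, here is a minimal Python sketch of the same prefix-based split. The function name split_by_prefix is made up for illustration, and it assumes the prefixes ('06TGT', '0625TGT') are known in advance, which may not hold for the real file.

```python
def split_by_prefix(lines, prefixes):
    """Route each line to the bucket of the prefix it starts with.

    Hypothetical helper: mimics two output links whose constraints
    compare the leading characters of a single-column record.
    """
    buckets = {p: [] for p in prefixes}
    # Try longer prefixes first so overlapping prefixes stay unambiguous.
    ordered = sorted(prefixes, key=len, reverse=True)
    for line in lines:
        for p in ordered:
            if line.startswith(p):
                buckets[p].append(line)
                break
        # Lines matching no prefix (headers, blanks) are simply dropped.
    return buckets


sample = [
    "1Q05 2Q05 3Q05 4Q05",
    "06TGT Client Team TELE OO Fed/Exce GMR PUI 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442",
    "",
    "1Q06 2Q06 3Q06 4Q06",
    "0625TGT Op Ident OO Unassigned Fed/Exce GMR 5S CR Leads 686.801830 925.533652 940.060153 1605.605841",
]
result = split_by_prefix(sample, ["06TGT", "0625TGT"])
```

Note that the header lines fall through both constraints here, so they would need a third link if they have to be kept.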

Posted: Mon Mar 20, 2006 6:15 pm
by rasi
Hi

I think that '06TGT' and '0625TGT' are not constant. A good approach is to read the file sequentially and use a stage variable to identify the pattern of the header record. Send data to link 1 until you detect a new header pattern, after which the output goes to link 2.

This should help you.
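
A rough Python sketch of that logic, in case it helps to see it spelled out. It assumes a header line consists only of quarter labels such as 1Q05 (optionally quoted and comma-separated, as in the earlier example); the names is_header and split_by_header are hypothetical, and the variable current plays the role of the stage variable.

```python
import re

# Assumption: a header token looks like 1Q05 .. 4Q99.
QUARTER = re.compile(r"[1-4]Q\d{2}")


def header_tokens(line):
    """Split a line into tokens, stripping quotes and commas."""
    return [t.strip('",') for t in line.split()]


def is_header(line):
    tokens = header_tokens(line)
    return bool(tokens) and all(QUARTER.fullmatch(t) for t in tokens)


def split_by_header(lines):
    """Group detail rows under the most recent header line."""
    groups = []
    current = None  # acts like the stage variable
    for line in lines:
        if not line.strip():
            continue  # skip blank separator lines
        if is_header(line):
            current = {"header": header_tokens(line), "rows": []}
            groups.append(current)
        elif current is not None:
            current["rows"].append(line)
        # Rows that appear before any header are ignored in this sketch.
    return groups


sample = [
    "1Q05 2Q05 3Q05 4Q05",
    "06TGT Client Team TELE OO Fed/Exce GMR PUI 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442",
    "06TGT Client Team TELE OO Fed/Exce GMR GS 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442",
    "",
    "1Q06 2Q06 3Q06 4Q06",
    "0625TGT Op Ident OO Unassigned Fed/Exce GMR 5S CR Leads 686.801830 925.533652 940.060153 1605.605841",
]
groups = split_by_header(sample)
```

Because the number of headers is not fixed, this produces one group per header rather than exactly two outputs.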

Posted: Mon Mar 20, 2006 7:42 pm
by ray.wurlod
Depending on exactly what you want the results to be, you might consider using the Rejects link in the Sequential File stage to capture the header lines, and process them after converting from raw format (if, indeed, you need to process them at all). This way only detail rows will appear on the main output of the Sequential File stage.
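
As a loose analogy only (this is Python, not how the Sequential File stage works internally), the reject idea can be sketched like this: try to parse every line as a detail row, and anything that does not fit the expected layout lands in a rejects list. The layout assumed here, a text label followed by four numeric values, is inferred from the sample rows; parse_detail and read_with_rejects are hypothetical names.

```python
def parse_detail(line, n_values=4):
    """Parse a detail row: a text label followed by n_values numbers.

    Raises ValueError if the line does not fit that layout.
    """
    tokens = line.split()
    if len(tokens) <= n_values:
        raise ValueError("too few fields for a detail row")
    values = [float(t) for t in tokens[-n_values:]]  # ValueError if not numeric
    label = " ".join(tokens[:-n_values])
    return label, values


def read_with_rejects(lines):
    """Return (details, rejects); header lines end up in rejects."""
    details, rejects = [], []
    for line in lines:
        if not line.strip():
            continue  # ignore blank separator lines
        try:
            details.append(parse_detail(line))
        except ValueError:
            rejects.append(line)
    return details, rejects


sample = [
    "1Q05 2Q05 3Q05 4Q05",
    "06TGT Client Team TELE OO Fed/Exce GMR PUI 804Top Valid Revenue 8.926070 41.575685 12.089471 10.110442",
    "",
    "1Q06 2Q06 3Q06 4Q06",
    "0625TGT Op Ident OO Unassigned Fed/Exce GMR CC CR Leads 160.566881 228.354777 106.455053 184.601798",
]
details, rejects = read_with_rejects(sample)
```

Only the detail rows reach details, while the two header lines are collected separately and can be processed or discarded afterwards.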