flat file with different sections

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

flat file with different sections

Post by mattias.klint »

Hi,

I have a txt file that comes in different sections:

#section1
1 test1 test2 test3
2 aaaa bbbb cccc
3 uuuu eeee rrrrr

#section2
1 test1 test2 test3 tyty
2 aaaa bbbb cccc tyty
3 uuuu eeee rrrrr tyty
4 test ett s ttett

#economy
1 56 Q1
2 56 Q2

...

The sections will have different names and will vary in length: a section can be just one row or many thousands, depending on the number of transactions that day.

I need to separate each section into a different path so I can pass them to specific jobs. Does anyone have any suggestions? I'm using DataStage V8.

It must be possible,

thx, Mattias
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Look into the CFF stage. You can create your own CFD that matches your data, and then filter out the records based on a column that identifies the section.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
crouse
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm

Post by crouse »

I think you'll have trouble with the CFF stage. It needs something on every row to tell it which link to send the row out on. If the example is actual data, then only the marker row says what follows.

I'd use a Transformer stage and stage variables to determine which link to send the data down; the receiving stage then has the appropriate metadata for that type of row.

Use stage variable(s) to determine what type of rows you are currently processing, then use the stage variable(s) in the link constraints.

Hopefully there is a finite number of different types of rows in the file...
Craig Rouse
Griffin Resources, Inc
www.griffinresources.com
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Post by mattias.klint »

Ok, so it is possible. I haven't seen the file yet; I just wanted to check before I ask for it. I will try it as soon as I get it. Most likely there will be some more posts. If it works, I will post my answer and close the case.

Thank you!
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Parse the file using any of the numerous methods discussed on DSXchange (sed, awk, etc., or a DS job that reads each whole row as a single column and divides the output into different links going to separate files). There's no native stage that deals with sectioned files. Multi-record-format files are handled using the CFF stage, but that's not your situation.
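For the "#section" layout in the original post, the awk route might look like the sketch below. The output file names (section1.txt, economy.txt) are my own choice for illustration, not anything from the thread:

```shell
# Split a "#section"-headed flat file into one file per section.
# Each "#name" header row switches the current output file; data
# rows are written to whichever section is currently active.
awk '
  /^#/      { out = substr($0, 2) ".txt"; next }  # "#section1" -> section1.txt
  out != "" { print > out }                       # data row -> active section file
' <<'EOF'
#section1
1 test1 test2 test3
2 aaaa bbbb cccc
#economy
1 56 Q1
EOF
```

Each resulting file can then feed its own downstream job, which mirrors the "different links going to separate files" idea.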
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

mattias.klint wrote:Most likely there will be some more posts.
Ehh? crouse and 2 Premium Posters not enough? Who you waiting for, CHulett? :lol:
Kenneth Bland

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well then, off you go. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I'd also advocate a Transformer stage, with a stage variable that "remembers" which section you're currently in and which is used in the constraint expressions on the output link. The stage variable is updated only when a section header row is processed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Premium

Post by mattias.klint »

Seems like it's time for me to go Premium...

...have to talk to the boss.

Thanks for the help, I will try it soon, I hope!
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Re: Premium

Post by mattias.klint »

Ok, finally I got my premium membership :-)

This is how my data looks. It comes with one value on each line, so it's different from what I thought.

_______________________________________________________
FILE NAME - Start
SEQ-1 <-----Seq
555
666333
4444



-END

SEQ-2 <-----Seq
2223333
333
5553
333444

66


33

text
-END <------- End of record, not of section

2223333
333
5553
333444

66


33

text
-END

SEQ-3 <-----Seq
5555
444
5

-END
FILE NAME - Stop
_______________________________________________________

First row is FILE NAME - Start
Section 1 starts with SEQ-1 and ends with -END
Section 2 starts with SEQ-2 and ends with -END (important here: each record also ends with -END)
Section 3 starts with SEQ-3 and ends with -END
Last row is FILE NAME - Stop

(the empty rows are supposed to be there)

So, of all your suggestions, which one is best for this kind of structure?

Thanks a lot!!!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, I'd still stick with reading each record as one long varchar field. You can build whatever logic you need around what happens when you read a 'SEQ' record and what happens when you read an '-END' record.

Stage variables can be used to build an aggregate record between those two points, and your output can then be constrained to fire only on each '-END' record. Depending on what your target is, you may be better served by first landing the records to a flat file, records into which you have put proper delimiters. Then you can read it back and do what you will. Or perhaps skip that step and, in a following Transformer, parse the aggregate record apart into 'normal' link columns using something like the Field() function, and load them where you will.
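Outside DataStage, the same "accumulate until '-END'" logic can be sketched in awk. The pipe delimiter is an assumption, the sample rows are simplified from the post (annotations dropped), and this sketch skips the blank rows that the real file carries inside a record:

```shell
# Build one delimited record per SEQ-n ... -END block.
# "|" as the delimiter and skipping blank rows are assumptions.
awk '
  /^FILE NAME/ { next }                                # drop the Start/Stop wrapper rows
  /^$/         { next }                                # sketch only: ignore blank rows
  /^SEQ-/      { seq = $1; rec = ""; next }            # remember the current section
  /^-END/      { print seq "|" rec; rec = ""; next }   # emit the aggregate record
               { rec = (rec == "" ? $0 : rec "|" $0) } # append this line to the record
' <<'EOF'
FILE NAME - Start
SEQ-1
555
666333
-END
SEQ-2
2223333
333
-END
text
-END
FILE NAME - Stop
EOF
```

Note how the second and third records both come out tagged SEQ-2: the section marker is "remembered" across the per-record '-END' rows, which is exactly the stage-variable behaviour described above.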

Shouldn't be too hard to set up. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Read the source file using a single-column metadata definition, so that each entire line of data comes in as one column. Use stage variables to detect the start and stop of each section, and direct rows to a section-specific output link. If there are 5 sections in the file, then have 5 output links.

Separate your source file into single-record-format files and use follow-up jobs to process them accordingly. Each of those jobs can read its file using specific column definitions.
Kenneth Bland

mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Post by mattias.klint »

Thanks everyone!!!!!!!!!!

It's solved; after all, it wasn't that hard. I used stage variables in a Transformer. One reason it was easy is that I only have one column. This is my next problem: now that I have each section divided into a separate flat file, I need to transfer my rows into columns

A
S
A
S

needs to be

ASAS

I have seen this many times searching for VERTICAL PIVOT, but my problem is that I don't have a key column. But that is another post.
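For what it's worth, outside DataStage that keyless collapse is a one-liner; this is just a sketch of the idea, not the vertical-pivot answer:

```shell
# Concatenate all rows of a single-column file into one row.
# printf stands in for the real section file here.
printf 'A\nS\nA\nS\n' | awk '{ line = line $0 } END { print line }'
# prints: ASAS
```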

viewtopic.php?p=270486#270486

Thanks, everyone!