flat file with different sections

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

flat file with different sections

Post by mattias.klint »

Hi,

I have a txt file that comes in different sections:

#section1
1 test1 test2 test3
2 aaaa bbbb cccc
3 uuuu eeee rrrrr

#section2
1 test1 test2 test3 tyty
2 aaaa bbbb cccc tyty
3 uuuu eeee rrrrr tyty
4 test ett s ttett

#economy
1 56 Q1
2 56 Q2

...

The sections will have different names and will vary in length: a section can be just one row or many thousands, depending on the number of transactions that day.

I need to separate each section into a different path so I can pass them to specific jobs. Does anyone have any suggestions? I'm using DataStage V8.

It must be possible,

thx, Mattias
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Look into the CFF stage. You can create your own CFD that matches your data, and then filter out the records based on a column that identifies the section.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
crouse
Charter Member
Posts: 204
Joined: Sun Oct 05, 2003 12:59 pm

Post by crouse »

I think you'll have trouble with the CFF stage. It needs something on every row to tell it which link to send the row out on. If the example is actual data, then only the marker row says what follows.

I'd use a Transformer stage and stage variables to determine which link to send the data down; the receiving stage then has the appropriate metadata for that type of row.

Use stage variable(s) to determine what type of rows you are currently processing, then use the stage variable(s) in the link constraints.

Hopefully there is a finite number of different types of rows in the file...
Craig Rouse
Griffin Resources, Inc
www.griffinresources.com
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Post by mattias.klint »

Ok, so it is possible. I haven't seen the file yet; I just wanted to check before I ask for it. I will try it as soon as I get it. Most likely there will be some more posts. If it works, I will post my answer and close the case.

Thank you!
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Parse the file using any of the numerous methods discussed on DSXchange (sed, awk, etc., or a DS job that reads each whole row as a single column and divides the output into different links going to separate files). There's no native stage that deals with sectioned files. Multi-record-format files are handled using the CFF stage, but that's not your situation.
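For the "#section" layout in the original post, the awk route might look like the sketch below. The output file names (section1.txt, economy.txt) are my own choice for illustration, not anything from the thread:

```shell
# Split a "#section"-headed flat file into one file per section.
# Each "#name" header row switches the current output file; data
# rows are written to whichever section is currently active.
awk '
  /^#/      { out = substr($0, 2) ".txt"; next }  # "#section1" -> section1.txt
  out != "" { print > out }                       # data row -> active section file
' <<'EOF'
#section1
1 test1 test2 test3
2 aaaa bbbb cccc
#economy
1 56 Q1
EOF
```

Each resulting file can then feed its own downstream job, which mirrors the "different links going to separate files" idea.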
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

mattias.klint wrote:Most likely there will be some more posts.
Ehh? crouse and 2 Premium Posters not enough? Who you waiting for, CHulett? :lol:
Kenneth Bland

chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well then, off you go. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

I'd also advocate a Transformer stage, with a stage variable that "remembers" which section you're currently in and which is used in the constraint expressions on the output link. The stage variable is updated only when a section header row is processed.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Premium

Post by mattias.klint »

Seems like it's time for me to go Premium...

...have to talk to the boss.

Thanks for the help, I will try it soon, I hope!
mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Re: Premium

Post by mattias.klint »

Ok, finally I got my premium membership :-)

This is how my data looks. It comes with one value on each line, so it's different from what I thought.

_______________________________________________________
FILE NAME - Start
SEQ-1 <-----Seq
555
666333
4444



-END

SEQ-2 <-----Seq
2223333
333
5553
333444

66


33

text
-END <------- End of record, not of section

2223333
333
5553
333444

66


33

text
-END

SEQ-3 <-----Seq
5555
444
5

-END
FILE NAME - Stop
_______________________________________________________

First row is FILE NAME - Start
Section 1 starts with SEQ-1 and ends with -END
Section 2 starts with SEQ-2 and ends with -END (important here: each record also ends with -END)
Section 3 starts with SEQ-3 and ends with -END
Last row is FILE NAME - Stop

(the empty rows are supposed to be there)

So, of all your suggestions, which one is best for this kind of structure?

Thanks a lot!!!
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Well, I'd still stick with reading each record as one long varchar field. You can build whatever logic you need around what happens when you read a 'SEQ' record and what happens when you read an '-END' record.

Stage variables can be used to build an aggregate record between those two points, and your output can then be constrained to fire only on each '-END' record. Depending on what your target is, you may be better served by first landing the records to a flat file, records into which you have put proper delimiters. Then you can read it back and do what you will. Or perhaps skip that step and, in a following Transformer, parse the aggregate record apart into 'normal' link columns using something like the Field() function, and load them where you will.
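Outside DataStage, the same "accumulate until '-END'" logic can be sketched in awk. The pipe delimiter is an assumption, the sample rows are simplified from the post (annotations dropped), and this sketch skips the blank rows that the real file carries inside a record:

```shell
# Build one delimited record per SEQ-n ... -END block.
# "|" as the delimiter and skipping blank rows are assumptions.
awk '
  /^FILE NAME/ { next }                                # drop the Start/Stop wrapper rows
  /^$/         { next }                                # sketch only: ignore blank rows
  /^SEQ-/      { seq = $1; rec = ""; next }            # remember the current section
  /^-END/      { print seq "|" rec; rec = ""; next }   # emit the aggregate record
               { rec = (rec == "" ? $0 : rec "|" $0) } # append this line to the record
' <<'EOF'
FILE NAME - Start
SEQ-1
555
666333
-END
SEQ-2
2223333
333
-END
text
-END
FILE NAME - Stop
EOF
```

Note how the second and third records both come out tagged SEQ-2: the section marker is "remembered" across the per-record '-END' rows, which is exactly the stage-variable behaviour described above.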

Shouldn't be too hard to set up. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Read the source file using a single-column metadata definition, so that each entire line of data comes in as one column. Use stage variables to detect the start and stop of each section, and direct rows to a section-specific output link. If there are 5 sections in the file, then have 5 output links.

Separate your source file into single-record-format files and use follow-up jobs to process them accordingly. Each of those jobs can read its file using specific column definitions.
Kenneth Bland

mattias.klint
Premium Member
Posts: 43
Joined: Wed Oct 18, 2006 6:03 am

Post by mattias.klint »

Thanks everyone!!!!!!!!!!

It's solved; after all, it wasn't that hard. I used stage variables in a Transformer. One reason it was easy is that I only have one column. This is my next problem: now that I have each section divided into a separate flat file, I need to transfer my rows into columns

A
S
A
S

needs to be

ASAS

I have seen this many times searching for VERTICAL PIVOT, but my problem is that I don't have a key column. But that is another post.
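For what it's worth, outside DataStage that keyless collapse is a one-liner; this is just a sketch of the idea, not the vertical-pivot answer:

```shell
# Concatenate all rows of a single-column file into one row.
# printf stands in for the real section file here.
printf 'A\nS\nA\nS\n' | awk '{ line = line $0 } END { print line }'
# prints: ASAS
```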

viewtopic.php?p=270486#270486

Thanks, everyone!