Page 1 of 1

Separately Group Rows then Re-group them in Exact Order

Posted: Thu May 21, 2015 12:56 am
by KFajardo
Hi guys, I am new to Datastage and I encountered a problem that's maybe a simple one.

I have a flat file source that looks like the one below:

Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|567
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|8911
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX

* Type 1~6 Rows is refers to 1 instance/project.
BUT the only traceability of the project is only located at Type 1's
* So my problem is, if I separate all rows, I will not know from which project does each OTHER TYPEs belong to.

*I want to perform a transformation on Type 1 that's why I want to separate them. But there will be a bigger problem when doing so. That's why I'm thinking if there's a way that I can separate each project and group them so that the OTHER TYPES wo'nt get lost.

Hoping for the help.

Thank you,
-Kit

Posted: Thu May 21, 2015 4:15 am
by ShaneMuir
Welcome.

If the data is as simple as below - you could use a stage variable to geneate a surrogate key value each time a new Type1 is received and then apply that to each other type until such time as the next Type1 is received. That surrogate key could be a number, or the "project" name itself.

So read the flat file in sequentially (to ensure that all the project information ends up on the same node) and then in a transformer stage set a stage variable to identify when the incoming record is a type1.
When that field is a type 1 set your surrogate key, then apply that surrogate key as a new output column to each input row. Each time a new type1 arrives update your surrogate key value.

Hope this helps.

Posted: Thu May 21, 2015 1:30 pm
by KFajardo
Thank you, Shanemuir.

I think you answered the question. But I don't have an idea on how to implement it. Can you shed some light on giving an example or a formula so that I can visualize on how the stage variable will work. Because I only know simple stage variables for now and I don't know to implement a stage variable that will toke effect until another TYPE1 will occur and then get a substring of the next Project name.

Again, thank you. You are really helping me on this.

-keith

Posted: Thu May 21, 2015 3:57 pm
by KFajardo
Will i use a Loop here? I have searched the net on my problem and looping seems to be a possible solution. Can you help me on this?

Hoping for your generous help.

-keith

Posted: Thu May 21, 2015 4:51 pm
by ray.wurlod
I don't think there's any need for a loop. Create the stage variable as, say, type Integer with an initial value of 0. Then, each time the value "Type 1" occurs, increment it.

Code: Select all

svGroupNumber  <--  If InLink.Col1 = "Type 1" Then svGroupNumber + 1 Else svGroupNumber

Posted: Thu May 21, 2015 5:04 pm
by KFajardo
but i need the project# (that is in each type 1)
to be iterated for each row.
example:

input
Type 1|xx|project01
Type 2|xxxxx
Type 3|xxxxxx
Type 1|xx|project02
Type 2|xxxxx
Type 3|xxxxxx


it must be:
project01|Type 1|xx|project01
project01|Type 2|xxxxx
project01|Type 3|xxxxxx
project02|Type 1|xx|project02
project02|Type 2|xxxxx
project02|Type 3|xxxxxx

Posted: Thu May 21, 2015 5:07 pm
by ray.wurlod

Code: Select all

svProject <--  If InLink.Col1 = "Type 1" Then InLink.Project Else svProject

Posted: Sat May 23, 2015 2:07 pm
by KFajardo
This is only a single column singe it is from a flat file. I want to populate the row with the "project" that is located on every type1-line only. So, i want to check on what project does every Other Types (type2, type3, etc...) belongs to.

Posted: Sat May 23, 2015 4:28 pm
by ray.wurlod
Yes? Use the value of the svProject stage variable, which contains the project from the most recently-encountered Type 1 record.

Posted: Sun May 24, 2015 12:45 am
by KFajardo
Ah! Now I get it! svProect's value is set into a new value everytime the it encounters a "Type1"! Thank you so much ray.wurlod!
I can now distinguish every project separately.

With all my heart, thanks again!

Keith