Separately Group Rows then Re-group them in Exact Order
Moderators: chulett, rschirm, roy
Separately Group Rows then Re-group them in Exact Order
Hi guys, I am new to Datastage and I encountered a problem that's maybe a simple one.
I have a flat file source that looks like the one below:
Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|567
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|8911
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
* Type 1~6 Rows is refers to 1 instance/project.
BUT the only traceability of the project is only located at Type 1's
* So my problem is, if I separate all rows, I will not know from which project does each OTHER TYPEs belong to.
*I want to perform a transformation on Type 1 that's why I want to separate them. But there will be a bigger problem when doing so. That's why I'm thinking if there's a way that I can separate each project and group them so that the OTHER TYPES wo'nt get lost.
Hoping for the help.
Thank you,
-Kit
I have a flat file source that looks like the one below:
Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|567
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|1234
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
Type 1|XXX|XXX|8911
Type 2|X|XXXXXXXXX|2015-02-01
Type 3|XXXX|XXXXXXX|2015-02-01
Type 4|XXXXXXX|XX
Type 6|X|XXX|X|XX
* Type 1~6 Rows is refers to 1 instance/project.
BUT the only traceability of the project is only located at Type 1's
* So my problem is, if I separate all rows, I will not know from which project does each OTHER TYPEs belong to.
*I want to perform a transformation on Type 1 that's why I want to separate them. But there will be a bigger problem when doing so. That's why I'm thinking if there's a way that I can separate each project and group them so that the OTHER TYPES wo'nt get lost.
Hoping for the help.
Thank you,
-Kit
Welcome.
If the data is as simple as below - you could use a stage variable to geneate a surrogate key value each time a new Type1 is received and then apply that to each other type until such time as the next Type1 is received. That surrogate key could be a number, or the "project" name itself.
So read the flat file in sequentially (to ensure that all the project information ends up on the same node) and then in a transformer stage set a stage variable to identify when the incoming record is a type1.
When that field is a type 1 set your surrogate key, then apply that surrogate key as a new output column to each input row. Each time a new type1 arrives update your surrogate key value.
Hope this helps.
If the data is as simple as below - you could use a stage variable to geneate a surrogate key value each time a new Type1 is received and then apply that to each other type until such time as the next Type1 is received. That surrogate key could be a number, or the "project" name itself.
So read the flat file in sequentially (to ensure that all the project information ends up on the same node) and then in a transformer stage set a stage variable to identify when the incoming record is a type1.
When that field is a type 1 set your surrogate key, then apply that surrogate key as a new output column to each input row. Each time a new type1 arrives update your surrogate key value.
Hope this helps.
Thank you, Shanemuir.
I think you answered the question. But I don't have an idea on how to implement it. Can you shed some light on giving an example or a formula so that I can visualize on how the stage variable will work. Because I only know simple stage variables for now and I don't know to implement a stage variable that will toke effect until another TYPE1 will occur and then get a substring of the next Project name.
Again, thank you. You are really helping me on this.
-keith
I think you answered the question. But I don't have an idea on how to implement it. Can you shed some light on giving an example or a formula so that I can visualize on how the stage variable will work. Because I only know simple stage variables for now and I don't know to implement a stage variable that will toke effect until another TYPE1 will occur and then get a substring of the next Project name.
Again, thank you. You are really helping me on this.
-keith
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I don't think there's any need for a loop. Create the stage variable as, say, type Integer with an initial value of 0. Then, each time the value "Type 1" occurs, increment it.
Code: Select all
svGroupNumber <-- If InLink.Col1 = "Type 1" Then svGroupNumber + 1 Else svGroupNumber
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
but i need the project# (that is in each type 1)
to be iterated for each row.
example:
input
Type 1|xx|project01
Type 2|xxxxx
Type 3|xxxxxx
Type 1|xx|project02
Type 2|xxxxx
Type 3|xxxxxx
it must be:
project01|Type 1|xx|project01
project01|Type 2|xxxxx
project01|Type 3|xxxxxx
project02|Type 1|xx|project02
project02|Type 2|xxxxx
project02|Type 3|xxxxxx
to be iterated for each row.
example:
input
Type 1|xx|project01
Type 2|xxxxx
Type 3|xxxxxx
Type 1|xx|project02
Type 2|xxxxx
Type 3|xxxxxx
it must be:
project01|Type 1|xx|project01
project01|Type 2|xxxxx
project01|Type 3|xxxxxx
project02|Type 1|xx|project02
project02|Type 2|xxxxx
project02|Type 3|xxxxxx
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Code: Select all
svProject <-- If InLink.Col1 = "Type 1" Then InLink.Project Else svProject
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: