Interview Question
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 4
- Joined: Wed May 15, 2013 1:25 am
Interview Question
This is a question I was asked in an interview, and I have not been able to work out the answer. There is a file with this data:
1
1
2
3
4
4
The output should be 2 files. First with data as
1
1
4
4
and the other as
2
3.
My reply was to use an Aggregator stage followed by a Filter stage, but that outputs each duplicated record only once. I also tried the key change column from the Sort stage, but got the same output.
Please help!!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
There are several ways this might be accomplished. In a real life example it would depend on the wider picture - the rule that determines which rows (which key values) go into which files. In most cases you'd be looking at constraint expressions in a Transformer stage, or you might be looking at using a Filter or Switch stage. For the given example you might even try something funky with partitioning.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 353
- Joined: Mon Jan 17, 2011 5:03 am
- Location: Mumbai, India
-
- Participant
- Posts: 3593
- Joined: Thu Jan 23, 2003 5:25 pm
- Location: Australia, Melbourne
- Contact:
I assume the scenario is to sort the data into a duplicates file and a unique row file. I would use the Transformer LastRowInGroup function and sort and partition the data by the key field. Whenever this function returns a value of FALSE you have a duplicate key value and the row that follows belongs in the same group. You may need a Stage Variable counter and a constraint to output all rows in a group down one link and single rows down another link.
This was a good addition in DataStage 8.5: it effectively lets you peer into the future and compare the current row to the next incoming row, something you couldn't do in earlier versions.
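To make the look-ahead idea concrete outside DataStage, here is a minimal Python sketch. It is not Transformer code: the function name `split_by_group_size` and the variable `rows_in_group` are made up for illustration, and the input is assumed to be already sorted by the key, as the Sort stage would guarantee. The comparison with the next row stands in for LastRowInGroup, and the counter stands in for the Stage Variable counter.

```python
def split_by_group_size(sorted_keys):
    """Route rows of multi-row groups to one list and single rows to another.

    Assumes sorted_keys is already sorted (or at least grouped) by key.
    """
    duplicates, uniques = [], []
    rows_in_group = 0  # stands in for the stage variable counter

    for i, key in enumerate(sorted_keys):
        rows_in_group += 1
        # Stand-in for LastRowInGroup: true when there is no next row
        # or the next row carries a different key.
        last_in_group = i + 1 == len(sorted_keys) or sorted_keys[i + 1] != key

        if not last_in_group or rows_in_group > 1:
            # More rows follow in this group, or earlier rows already
            # belonged to it: the key is duplicated.
            duplicates.append(key)
        else:
            # First and last row of its group: the key is unique.
            uniques.append(key)

        if last_in_group:
            rows_in_group = 0  # reset for the next group

    return duplicates, uniques


duplicates, uniques = split_by_group_size([1, 1, 2, 3, 4, 4])
print(duplicates, uniques)
```

With the sample data from the question, the first list holds the rows for keys 1 and 4 and the second holds 2 and 3.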
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Too much code.
Constraint 1: InLink.KeyChange And LastRowInGroup(InLink.keycol)
Constraint 2: Otherwise/Log
No need to generate 1s and 0s - you're just wasting CPU cycles by doing so.
I now invite you to inspect your code and YOU answer the question whether it meets the original poster's requirements.
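For readers without a DataStage environment, the two constraints can be imitated in Python as a rough sketch. The names `route_rows`, `unique_link`, and `otherwise_link` are hypothetical, the input is assumed to be sorted by key, and both flags are computed by comparing neighbouring rows rather than by the real Sort stage or the real LastRowInGroup function.

```python
def route_rows(sorted_keys):
    """Imitate Constraint 1 (key change AND last row in group) and Otherwise."""
    unique_link, otherwise_link = [], []
    last_index = len(sorted_keys) - 1

    for i, key in enumerate(sorted_keys):
        # Stand-in for the key change column: first row of a new key value.
        key_change = i == 0 or sorted_keys[i - 1] != key
        # Stand-in for LastRowInGroup: final row of the current key value.
        last_row_in_group = i == last_index or sorted_keys[i + 1] != key

        if key_change and last_row_in_group:
            # The row both starts and ends its group, so the key occurs once.
            unique_link.append(key)
        else:
            # Every other row belongs to a group of two or more rows.
            otherwise_link.append(key)

    return unique_link, otherwise_link


unique_link, otherwise_link = route_rows([1, 1, 2, 3, 4, 4])
print(unique_link, otherwise_link)
```

A row that is both the first and the last of its group is the only row with that key, which is why no counter or 1/0 flag is needed.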
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Participant
- Posts: 35
- Joined: Mon May 06, 2013 5:59 am
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
KeyChange comes from the upstream Sort stage in what I was thinking.
You can generate key change detection within the Transformer stage using two stage variables, one to detect the change and another to "remember" the key from the previous row.
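The two-variable pattern can be sketched in Python as follows. This is an illustration only: `key_change_flags`, `is_change`, and `prev_key` are made-up names, and the sentinel initial value is an assumption standing in for whatever initial value the "remember" stage variable is given. The order of the two assignments matters, because the change must be detected before the remembered key is overwritten.

```python
def key_change_flags(keys):
    """Return one boolean per row: True when the key differs from the previous row."""
    prev_key = object()  # sentinel that never equals a real key (assumed initial value)
    flags = []

    for key in keys:
        # Variable 1: detect the change against the remembered key.
        is_change = key != prev_key
        # Variable 2: remember the current key for the next row.
        # This must run after the comparison above.
        prev_key = key
        flags.append(is_change)

    return flags


print(key_change_flags([1, 1, 2, 3, 4, 4]))
```

For the sample data, the flag is true on the first row of each key value and false on the repeated rows.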
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.