need a help on job design
Moderators: chulett, rschirm, roy
need a help on job design
hi all,
could you please share your thoughts for job design of the following reqirement
mine source data is
ID,LOC,PHONE,COUNTRY
1,Home,1234,IND
1,Work,0012,IND
2,Home,0345,AUS
3,Work,0234,JAP
4,Work,1357,USA
4,Home,5678,USA
i need an out put like
ID,HOME_PHONE,WORK_PHONE,COUNTRY
1,1234,0012,IND
2,0345,NULL,AUS
3,NULL,0234,JAP
4,5678,1357,USA
ANY SUGGESTIONS????
could you please share your thoughts for job design of the following reqirement
mine source data is
ID,LOC,PHONE,COUNTRY
1,Home,1234,IND
1,Work,0012,IND
2,Home,0345,AUS
3,Work,0234,JAP
4,Work,1357,USA
4,Home,5678,USA
i need an out put like
ID,HOME_PHONE,WORK_PHONE,COUNTRY
1,1234,0012,IND
2,0345,NULL,AUS
3,NULL,0234,JAP
4,5678,1357,USA
ANY SUGGESTIONS????
D.N .MURTHY
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
-
- Participant
- Posts: 3337
- Joined: Mon Jan 17, 2005 4:49 am
- Location: United Kingdom
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 301
- Joined: Thu Jul 14, 2005 10:27 am
- Location: Melbourne, Australia
- Contact:
Just to clarify: You're stating that a specific ID value may be associated with more than one COUNTRY? (In your example data, each ID has only a single COUNTRY associated to it)
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
yes, here id means it's an emp_id field,so in each COUNTRY n number of emp_id's are available.COUNTRY it's not an key fieldjhmckeever wrote:Just to clarify: You're stating that a specific ID value may be associated with more than one COUNTRY? (In your example data, each ID has only a single COUNTRY associated to it)
D.N .MURTHY
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
So Which Country (Home or work) you are planning to put in country field (Output) if they can be different for same Emp ID.dr.murthy wrote:yes, here id means it's an emp_id field,so in each COUNTRY n number of emp_id's are available.COUNTRY it's not an key fieldjhmckeever wrote:Just to clarify: You're stating that a specific ID value may be associated with more than one COUNTRY? (In your example data, each ID has only a single COUNTRY associated to it)
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Genius may have its limitations, but stupidity is not thus handicapped.
So Which Country (Home or work) you are planning to put in country field (Output) if they can be different for same Emp ID.[/quote]
hi ,
please find my source data format MY SOURCE DATA IS
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
2,Home,0345,IND
3,Work,0234,JAP
4,Work,1357,USA
1,Work,0012,IND
4,Home,5678,USA
I NEED AN OUT PUT SOME THING LIKE
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
1,1234,0012,IND
2,0345,NULL,IND
3,NULL,0234,JAP
in source data some empid's doesnot have neither home nor work phone numbers.so i design the job some thing like
SEQFILE---->SWITCH----->
IN SWITCH I SPLITTED THE DATA BASED ON LOC MEANS HOME REC GOES TO ONE LINK AND WORK RECORDS GOES TO SOME OTHER LINK
FINALLY I JOINED THEM BOTH BASED ON EMPID AND RENAME THE FIELD NAMES.SUPPOSE IF THE RECORDS COMMING FROM LOC LINK THEN I RENAMED THE PHONE FIELD AS LOC_PHONE SIMILARLY IF THE RECORDS COMMING FROM HOME LINK IT WAS RENAMED AS HOME_PHONE
O/P GIVES ONLY FOUR FIELDS
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
BUT ISSUE IS IF SOME RECORDS DOESNOT HAVE HOME PHONE OR WORK PHONE WE ARE NOT ABLE TO HANDLE THOSE RECORDS
hi ,
please find my source data format MY SOURCE DATA IS
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
2,Home,0345,IND
3,Work,0234,JAP
4,Work,1357,USA
1,Work,0012,IND
4,Home,5678,USA
I NEED AN OUT PUT SOME THING LIKE
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
1,1234,0012,IND
2,0345,NULL,IND
3,NULL,0234,JAP
in source data some empid's doesnot have neither home nor work phone numbers.so i design the job some thing like
SEQFILE---->SWITCH----->
IN SWITCH I SPLITTED THE DATA BASED ON LOC MEANS HOME REC GOES TO ONE LINK AND WORK RECORDS GOES TO SOME OTHER LINK
FINALLY I JOINED THEM BOTH BASED ON EMPID AND RENAME THE FIELD NAMES.SUPPOSE IF THE RECORDS COMMING FROM LOC LINK THEN I RENAMED THE PHONE FIELD AS LOC_PHONE SIMILARLY IF THE RECORDS COMMING FROM HOME LINK IT WAS RENAMED AS HOME_PHONE
O/P GIVES ONLY FOUR FIELDS
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
BUT ISSUE IS IF SOME RECORDS DOESNOT HAVE HOME PHONE OR WORK PHONE WE ARE NOT ABLE TO HANDLE THOSE RECORDS
D.N .MURTHY
-
- Participant
- Posts: 25
- Joined: Fri Jan 11, 2008 12:49 am
- Location: Pune, India
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
hi ,dr.murthy wrote:So Which Country (Home or work) you are planning to put in country field (Output) if they can be different for same Emp ID.
please find my source data format MY SOURCE DATA IS
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
2,Home,0345,IND
3,Work,0234,JAP
4,Work,1357,USA
1,Work,0012,IND
4,Home,5678,USA
I NEED AN OUT PUT SOME THING LIKE
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
1,1234,0012,IND
2,0345,NULL,IND
3,NULL,0234,JAP
in source data some empid's doesnot have neither home nor work phone numbers.so i design the job some thing like
SEQFILE---->SWITCH----->
IN SWITCH I SPLITTED THE DATA BASED ON LOC MEANS HOME REC GOES TO ONE LINK AND WORK RECORDS GOES TO SOME OTHER LINK
FINALLY I JOINED THEM BOTH BASED ON EMPID AND RENAME THE FIELD NAMES.SUPPOSE IF THE RECORDS COMMING FROM LOC LINK THEN I RENAMED THE PHONE FIELD AS LOC_PHONE SIMILARLY IF THE RECORDS COMMING FROM HOME LINK IT WAS RENAMED AS HOME_PHONE
O/P GIVES ONLY FOUR FIELDS
EMPID,HOME_PHONE,WORK_PHONE,COUNTRY
BUT ISSUE IS IF SOME RECORDS DOESNOT HAVE HOME PHONE OR WORK PHONE WE ARE NOT ABLE TO HANDLE THOSE RECORDS[/quote]
Do you really need the records without any details.
If yes then pass the records having null values in the location field to third link and perform a full outer join on in all three links. But this depends on the requirement as if a value in number is present and location is not present then where to put that number.
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
1,Work,0345,USA
then what should be the output or you are sure as this will not happen.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Genius may have its limitations, but stupidity is not thus handicapped.
-
- Premium Member
- Posts: 1735
- Joined: Thu Mar 01, 2007 5:44 am
- Location: Troy, MI
dr.murthy wrote:yes i need all records,for example in my source data empid =2 doesnot have a work phone so i need a output record with work_phone is null.
but in my job design iam not able to capture those records in full outer join.can you please tell me is any other way to design he job.
Code: Select all
If yes then pass the records having null values in the location field to third link and perform a full outer join on in all three links. But this depends on the requirement as if a value in number is present and location is not present then where to put that number.
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
1,,,IND
as you will get two records which might not be needed.
I cannot explain each and everything. Give the problem a thought based on all the suggestions you have got. Also think on the two probable resolutions you have in this thread.
One More thing is you haven't replied to my question in previous post mentioned in last few lines.
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
Genius may have its limitations, but stupidity is not thus handicapped.
hi,
i got a solution finally, i develop a job but it seems very complicated.i had to use four lookups in my job.
as you specified
if the empid =1 doesnot have a work phone number in source only one record is comming like
like
i got a solution finally, i develop a job but it seems very complicated.i had to use four lookups in my job.
as you specified
EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
1,,,IND
if the empid =1 doesnot have a work phone number in source only one record is comming like
but in output i need work phone number is null FORempid=1EMPID,LOC,PHONE,COUNTRY
1,Home,1234,IND
like
IS THERE ANY SIMPLE WAY TO DESIGN THIS JOBEMPID,WRK_PHN,HOME_PHN,LOC
1, NULL ,1234,,IND
D.N .MURTHY
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
No need to shout! (All capitals = shouting)
There is a very simple solution, to which others have alluded and which requires no lookups at all.
The simple design is to use data sorted by ID, assemble the record in a Transformer stage populating the two phone numbers into separate fields with appropriate null handling, and using stage variables to remember the previous record. The output from the Transformer stage is then run through a single Aggregator stage which outputs the Last of each group; the data are grouped on ID. No lookups are required.
This solution is right for server job, because you posted in the server job forum. If it really is a parallel job, substitute a Remove Duplicates stage for the Aggregator stage and make sure that the data are hash or modulus partitioned by ID.
There is a very simple solution, to which others have alluded and which requires no lookups at all.
The simple design is to use data sorted by ID, assemble the record in a Transformer stage populating the two phone numbers into separate fields with appropriate null handling, and using stage variables to remember the previous record. The output from the Transformer stage is then run through a single Aggregator stage which outputs the Last of each group; the data are grouped on ID. No lookups are required.
This solution is right for server job, because you posted in the server job forum. If it really is a parallel job, substitute a Remove Duplicates stage for the Aggregator stage and make sure that the data are hash or modulus partitioned by ID.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.