one record for each value which are seperated by delimitter
Moderators: chulett, rschirm, roy
one record for each value which are seperated by delimitter
Hi all,
I am having data in the following format.
EMP_ID| EMP_ROLE| EMP_SKILLS|
1| PM|c;c++;UNIX|
2|SPM|Java;PHP|
3|GPM|PERL|
I want the data in the following format
EMP_ID| EMP_ROLE|EMP_SKILLS|
1|PM |c|
1|PM| c++|
1|PM|UNIX|
2|SPM|Java|
2|SPM| PHP|
3|GPM|PERL|
That is, i want one record for each value in the last columns which are separated by semicolons.
Your suggestions would be of great help.
Thanks in advance,
Regards,
Raja R P
I am having data in the following format.
EMP_ID| EMP_ROLE| EMP_SKILLS|
1| PM|c;c++;UNIX|
2|SPM|Java;PHP|
3|GPM|PERL|
I want the data in the following format
EMP_ID| EMP_ROLE|EMP_SKILLS|
1|PM |c|
1|PM| c++|
1|PM|UNIX|
2|SPM|Java|
2|SPM| PHP|
3|GPM|PERL|
That is, i want one record for each value in the last columns which are separated by semicolons.
Your suggestions would be of great help.
Thanks in advance,
Regards,
Raja R P
-
- Premium Member
- Posts: 783
- Joined: Mon Jan 16, 2006 10:17 pm
- Location: Sydney, Australia
-
- Participant
- Posts: 9
- Joined: Tue Sep 20, 2005 2:10 am
If the number of 'columns' in the field EMP_SKILLS is not very high you can achieve your goal within a normal Transform-stage by using the field-function.
- Make as many outpulinks as the maximum number of 'columns' you need to extract from the field EMP_FIELDS.
- Output the necessary columns and the first Field of EMP_SKILLS (the part before the first semicolon) to the first output-link
- Output the necessary columns and the second Field of EMP_SKILLS (the part between the first and the second semicolon) to the second output-link with the constraint that this second field is not empty (or that the number of semicolons >= 1)
etc.
After the transformer you can use a Funnel-stage to merge all the results from the several output-links.
- Make as many outpulinks as the maximum number of 'columns' you need to extract from the field EMP_FIELDS.
- Output the necessary columns and the first Field of EMP_SKILLS (the part before the first semicolon) to the first output-link
- Output the necessary columns and the second Field of EMP_SKILLS (the part between the first and the second semicolon) to the second output-link with the constraint that this second field is not empty (or that the number of semicolons >= 1)
etc.
After the transformer you can use a Funnel-stage to merge all the results from the several output-links.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Hi all,
Thanks for all your suggestions.I have found one more way which is easy for my case.
Source file--->Temp table(oracle enterprise stage)===>job 1
Temp Table(oracle enterprise stage)--->target file==>job 2
In the job 2 oracle enterprise stage i have used 'CONNECT BY LEVEL' oracle concept to make the data in the desired format.
Best regards,
Raja R P
Thanks for all your suggestions.I have found one more way which is easy for my case.
Source file--->Temp table(oracle enterprise stage)===>job 1
Temp Table(oracle enterprise stage)--->target file==>job 2
In the job 2 oracle enterprise stage i have used 'CONNECT BY LEVEL' oracle concept to make the data in the desired format.
Best regards,
Raja R P
Hi,
The solution which i have got using 'CONNECT BY LEVEL' was working fine when i executed the job with sample records(30 records totally).
But yesterday i tried running the job with full load (170,000 records).In toad the query which i have written is taking only 3 minutes and 58 secs to run.But the same query was running for 4 hrs in an oracle Enterprise stage without any improvement( i had to abort the parallel job).Please find my query which am using
I want to know why a query which istaking 4 minutes in toad is taking more than 4 hrs in datastage's oracle enterprise stage.
Please somebody throw some light on this.
Regards,
Raja R P
The solution which i have got using 'CONNECT BY LEVEL' was working fine when i executed the job with sample records(30 records totally).
But yesterday i tried running the job with full load (170,000 records).In toad the query which i have written is taking only 3 minutes and 58 secs to run.But the same query was running for 4 hrs in an oracle Enterprise stage without any improvement( i had to abort the parallel job).Please find my query which am using
Code: Select all
SELECT
CU_ID,
BSNS_FUNC_CD,
CLASS_CODE,
CHARACTERISTIC_NAME,
TRIM(
SUBSTR(CHAR_VAL,INSTR(CHAR_VAL,';',1,LEVEL) +1,INSTR(CHAR_VAL,';', 1,LEVEL+1)-INSTR(CHAR_VAL,';',1,LEVEL) -1)
) AS CHARACTERISTIC_VALUE,
MASTER_CLASS_FLAG,
CHAR_VAL_ORIG,
LEVEL
FROM
(
SELECT
TRIM(CU_ID) AS CU_ID,
TRIM(BSNS_FUNC_CD) AS BSNS_FUNC_CD,
TRIM(CLASS_CODE) AS CLASS_CODE,
TRIM(CHARACTERISTIC_NAME) AS CHARACTERISTIC_NAME,
';'||TRIM(CHARACTERISTIC_VALUE)||';' AS CHAR_VAL,
TRIM(MASTER_CLASS_FLAG) AS MASTER_CLASS_FLAG,
TRIM(CHARACTERISTIC_VALUE) AS CHAR_VAL_ORIG
FROM DEV.TABLE_CUID
)
CONNECT BY LEVEL <= LENGTH(CHAR_VAL)-LENGTH(REPLACE(CHAR_VAL,';',''))-1
Please somebody throw some light on this.
Regards,
Raja R P
Toad <> DataStage. While a query is a query they execute in quite different environments between the two tools. Toad is all about returning you the first few rows while OE will need to process all. What you should be worried about is the explain plan which is not something anyone here can check for you. Make sure the query is as optimal as it can be based on your table structures, volumes, stats, etc. If you need some help with that, involve your DBA.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers