Page 1 of 1
Converting one row into multiple rows
Posted: Tue Mar 11, 2008 8:59 am
by casedwgroup
It seems to me like this is probably a common problem, so if this has already been discussed here before, please point me in that direction. I spent some time looking around and couldn't find what I was looking for.
I'm working on populating some row level security tables that have 2 key columns: OPRID and EMPLID. Setting the table up for people who exist in the source system is relatively easy. However, I need to be able to add rows for the people on my team who do not exist in the source system. I have a list of IDs that I need to add a row for every EMPLID.
I couldn't figure out a way to do this simply with a pivot stage. My next thought was to just write a query with no join and add a distinct clause to it:
Code: Select all
SELECT DISTINCT B.OPRID,A.EMPLID,TO_CHAR(A.EFFDT, 'YYYY-MM-DD HH24:MI:SS')
FROM TBL_A A, TBL_B B
This works fine for small amounts of data, but TBL_A has over 1 million rows. (TBL_B probably won't grow much larger than 10 rows.) This is obviously a terrible query for large amounts of data, and I need a better solution. Any suggestions would be greatly appreciated.
Re: Converting one row into multiple rows
Posted: Tue Mar 11, 2008 9:48 am
by gateleys
casedwgroup wrote:I spent some time looking around and couldn't find what I was looking for.
Really!! How about this? Its actually in the FAQ.
viewtopic.php?t=88639
Posted: Tue Mar 11, 2008 9:49 am
by WoMaWil
What about
Code: Select all
SELECT B.OPRID as MY_ID FROM TABLE_B as B
UNION
SELECT A.EMPLID as MY_ID FROM TABLE_A as A
?
Re: Converting one row into multiple rows
Posted: Tue Mar 11, 2008 10:37 am
by casedwgroup
If I understand correctly (and it's entirely possible that I don't), that post was about
splitting one row into multiple rows. I'm trying to do something more along the lines of repeating rows and just changing a key column. Let me give a data example of what I am trying to do, where TBL_C is the target table:
Code: Select all
*TBL_A
OPRID EMPLID EFFDT
----- ------ -----
ABC 1 01/01
ABC 2 01/01
XYZ 1 01/01
XYZ 2 01/01
XYZ 3 03/01
*TBL_B
OPRID
-----
JKL
MNO
*TBL_C
OPRID EMPLID EFFDT
----- ------ -----
ABC 1 01/01
ABC 2 01/01
XYZ 1 01/01
XYZ 2 01/01
XYZ 3 03/01
JKL 1 01/01
JKL 2 01/01
JKL 3 03/01
MNO 1 01/01
MNO 2 01/01
MNO 3 03/01
Hopefully that makes more sense. If there is any additional detail I need to provide, let me know.
Re: Converting one row into multiple rows
Posted: Tue Mar 11, 2008 11:57 am
by gateleys
It seems that you want to do a Cartesian join between TBL_B and TBL_A. Assuming that the EFFDT is dependent on EMPLID alone (that is, OPRID has no effect on it, whatsoever), you need to do the following -
1. Create a hashed file with the columns, DummyID, EMPLID and EFFDT, with keys on the first 2 ID columns.
2. Load TBL_A into this hashed file, with the value 1 assigned to DummyID. This eliminates all duplicates of EMPLID.
3. Load this hashed file into a sequential file, say File_1, with DummyID (only) as the key.
4. Load TBL_B into a sequential file, say File_2, with columns DummyID2 and OPRID2. Just as in step 2, assign a value 1 to DummyID2.
5. Use a Merge Stage with these 2 sequential files as input, and with Join Type set to INNER JOIN. In the Mapping tab, make sure you specify the key as DummyID for File_1 and DummyID2 for File_2 so that a Cartesian Join will be performed since all key values are 1.
5. The end result is you have a file -
Code: Select all
DummyID EMPID EFFDT DummyID2 OPRID2
------ ---- ----- ------- -------
1 1 01/01 1 JKL
1 2 01/01 1 JKL
1 3 03/01 1 JKL
1 1 01/01 1 MNO
1 2 01/01 1 MNO
1 3 03/01 1 MNO
You can now, extract the columns OPRID2, EMPID and EFFDT into another file and append it to your TBL_A rows. Voila!!
Posted: Tue Mar 11, 2008 12:13 pm
by casedwgroup
I'll try that out and get back to you. Thanks for the suggestion!
Posted: Tue Mar 11, 2008 1:37 pm
by casedwgroup
That appears to have worked. Thank you very much for the assistance!
Posted: Tue Mar 11, 2008 1:41 pm
by gateleys
casedwgroup wrote:That appears to have worked. Thank you very much for the assistance!
Great!
Now, can you share your performance stats with us, with over a million rows in your TAB_A performing a cross join with TAB_B?
Posted: Tue Mar 11, 2008 1:55 pm
by casedwgroup
Not all of the rows from table A are joined to table B -- just the unique ones, which is less than 12,000. So the performance is pretty good. It took less than 3 minutes to read the 1.2 million rows from the database and write to a hashed file, then a couple seconds to read 12,000 rows from the hashed file and write to a sequential file. The merging of that sequential file with another one with 4 rows in it, and then appending the results to the database, took 13 seconds. Assuming A and B are already loaded, the whole process took less than 4 minutes. The end result only added 48,000 rows though.
Posted: Tue Mar 11, 2008 2:01 pm
by gateleys
Oh! I forgot about the duplicates. Of course.
![Wink :wink:](./images/smilies/icon_wink.gif)