Merge 2 files into 1
-
- Premium Member
- Posts: 1044
- Joined: Wed Sep 29, 2004 3:30 am
- Location: Nottingham, UK
- Contact:
chulett wrote: Have you checked out the Merge stage?
Can the Merge stage take hashed files as inputs? I think you will need to stream them out into sequential files first.
If you can guarantee that one of the hashed files will contain all of the key values that are in the other, then you could stream that hashed file out into a Transformer that uses the other hashed file as a reference link to pick up the other values from it.
If you can't guarantee that one is a strict superset or equivalent set to the other, then you will need to either stream them both out to sequential files and use the Merge stage, or stream them each out using the other as a reference lookup and then through a Link Collector, then a Sort stage, and then do duplicate removal with key change logic in a transformer.
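The second approach (stream each file with the other as a reference, collect, sort, then drop duplicates on key change) can be sketched outside DataStage as follows. This is only an illustration in Python, assuming each hashed file is modelled as a dict from key to a dict of columns, and that the two files have no non-key column names in common; the function names are made up for the example.

```python
def stream_with_lookup(stream, reference):
    """Emit one row per key in `stream`, picking up extra columns from `reference`."""
    for key, cols in stream.items():
        merged = dict(cols)
        # Reference lookup: keys missing from `reference` contribute nothing.
        merged.update(reference.get(key, {}))
        yield key, merged


def merge_hashed_files(file1, file2):
    """Model of: two lookup streams -> Link Collector -> Sort -> key-change dedupe."""
    # Link Collector: gather the rows from both streams.
    collected = list(stream_with_lookup(file1, file2))
    collected += list(stream_with_lookup(file2, file1))
    # Sort stage: order by key so duplicates become adjacent.
    collected.sort(key=lambda pair: pair[0])
    # Transformer: keep a row only when its key differs from the previous row's key.
    result = {}
    previous_key = object()  # sentinel that never equals a real key
    for key, row in collected:
        if key != previous_key:
            result[key] = row
        previous_key = key
    return result


file1 = {("A", 1): {"C1": "a1"}, ("B", 2): {"C1": "b1"}}
file2 = {("B", 2): {"C4": "b4"}, ("C", 3): {"C4": "c4"}}
merged = merge_hashed_files(file1, file2)
```

Here `merged` contains all three keys, and the shared key `("B", 2)` appears once with columns from both files.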
Phil Hibbs | Capgemini
Technical Consultant
-
- Premium Member
- Posts: 1044
- Joined: Wed Sep 29, 2004 3:30 am
- Location: Nottingham, UK
- Contact:
Jessie wrote: I'm new to DataStage. Is there any detailed documentation? The Help doesn't help much.
I seem to remember that there are two different "Merge" stages; make sure the documentation you read is for the Server version, not the PX or EE version. The Server version specifies two input files (and is very annoying to use), while the PX version takes two stream input links. I assume you are creating Server jobs, as you are posting in the Server forum.
Also there's another gotcha with the Merge stage, which I've already documented in this forum. It doesn't treat its input files in the same way that the Sequential File stage does - Sequential File does Excel-style quote-doubling so the string A Rusty 6" Nail gets written out as "A Rusty 6"" Nail", whereas if this file is used as one of the inputs to the Merge stage it will remove the quotes after the 6 entirely. I think this can be fixed by removing the \ escape character in the Merge Stage dialog, but I'm not sure.
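To make the quote-doubling convention concrete, here is a small Python illustration using the standard `csv` module. It only shows what Excel-style doubling looks like on write and how a reader that understands it recovers the original string; it does not model the Merge stage's own parsing, and the helper names are made up for the example.

```python
import csv
import io


def write_quoted(value):
    """Write one field with every field quoted and embedded quotes doubled."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, doublequote=True,
                        lineterminator="\n")
    writer.writerow([value])
    return buf.getvalue().rstrip("\n")


def read_quoted(line):
    """Parse one line, treating a doubled quote as a literal quote character."""
    return next(csv.reader([line], doublequote=True))[0]


original = 'A Rusty 6" Nail'
encoded = write_quoted(original)   # "A Rusty 6"" Nail"
decoded = read_quoted(encoded)     # A Rusty 6" Nail
```

A reader that does not apply the doubling rule will not get the original string back, which is the kind of mismatch described above.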
Phil Hibbs | Capgemini
Technical Consultant
test2..Sequential_File_2.IDENT1: DSD.StageRun Active stage starting, tracemode = 0.
test2..Merge_1: Stage Properties
> First File Path = [/home/tttt/S_al.txt]
> Second File Path = [/home/tttt/S_Med.txt]
> Working Directory = [/home/tttt/S_temp]
> Stage Trace Level = [1]
test2..Merge_1: Error opening first input file
test2..Merge_1: Link property retrieval error
Attempting to Cleanup after ABORT raised in stage test2..Merge_1
Job test2 aborted.
Is there a better way to copy the log? I copied the entries one at a time.
Thanks
Hi,
You can merge two hashed files into one by converting one of them into a sequential file and using a Transformer.
Step 1:
Convert the primary hashed file into a sequential file.
Step 2:
Use a Transformer to merge the files, with the sequential file as the input and the hashed file as the reference.
The output will be a hashed file laid out like your final file, with columns K1, K2, C1, C2, C3, C4, C5, C6.
K1, K2, C1, C2, C3 should be taken from the sequential file.
C4, C5, C6 should be taken from the hashed file reference link.
Limitation:
If the hashed file contains keys that are not in the sequential file (file 1), those rows will not be output.
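The steps and the limitation can be illustrated with a rough Python model. It assumes the sequential file is a list of rows and the hashed file is a dict keyed on (K1, K2); the function name and the sample values are invented for the example.

```python
def transformer_merge(seq_rows, hashed_ref):
    """Drive from the sequential rows and look each key up in the hashed file."""
    output = {}
    for row in seq_rows:
        key = (row["K1"], row["K2"])
        ref = hashed_ref.get(key, {})  # reference link; empty when the key is missing
        output[key] = {
            # C1..C3 come from the sequential (stream) input.
            "C1": row["C1"], "C2": row["C2"], "C3": row["C3"],
            # C4..C6 come from the hashed file reference link (None if not found).
            "C4": ref.get("C4"), "C5": ref.get("C5"), "C6": ref.get("C6"),
        }
    return output


seq_rows = [{"K1": 1, "K2": "a", "C1": "x1", "C2": "x2", "C3": "x3"}]
hashed_ref = {
    (1, "a"): {"C4": "y4", "C5": "y5", "C6": "y6"},
    (2, "b"): {"C4": "z4", "C5": "z5", "C6": "z6"},  # key only in the hashed file
}
result = transformer_merge(seq_rows, hashed_ref)
```

The key `(2, "b")` exists only in the reference file, so it never reaches `result`; that is the limitation noted above.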
Regards
Raj
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
                 HashedFile2
                      |
                      |
                      V
HashedFile1 ---> Transformer ---> Target
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 457
- Joined: Tue Sep 25, 2007 4:05 pm
ray.wurlod wrote: ...
                 HashedFile2
                      |
                      |
                      V
HashedFile1 ---> Transformer ---> Target
Jessie wrote: What is it? I'm too new to DataStage.
Hello,
What Ray Wurlod is suggesting is, if you have two hashed files and you want to merge them together, design your job in such a way:
I/P: Hash File 1
O/P: Target (whatever your target is)
Lookup: Hash file 2
Inside the Transformer, map the columns you want to the output in the layout you need: Key1, Key2, 22, 33, 44, 55...
If the keys are the same and you want to join on them, you can do so. Otherwise, you could create a dummy field (with a constant default value) in the lookup hashed file and hard-code that value as the lookup key inside the Transformer, then map all the other fields to the output. There are many ways to do it, and I am sure there are other, more exotic solutions.
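As a rough picture of the dummy-field idea, here is a Python sketch. It assumes the lookup hashed file holds a single row stored under a constant key, and that every input row is looked up with that same hard-coded value; `DUMMY_KEY`, the function name, and the column names are hypothetical.

```python
DUMMY_KEY = "X"  # hypothetical constant stored as the key in the lookup hashed file


def attach_lookup_columns(input_rows, lookup):
    """Look up every input row with the same constant key and append the found columns."""
    ref = lookup.get(DUMMY_KEY, {})
    return [{**row, **ref} for row in input_rows]


input_rows = [{"Key1": 1, "Key2": "a"}, {"Key1": 2, "Key2": "b"}]
lookup = {DUMMY_KEY: {"Extra1": 22, "Extra2": 33}}
rows = attach_lookup_columns(input_rows, lookup)
```

Every input row picks up the same lookup columns, because the lookup key never varies.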
Hope this helps...
Vivek Gadwal
Experience is what you get when you didn't get what you wanted