DataSource1
EmployeeID Name
1 xx
2 yy
DataSource2
EmplyeeID Name
1 xx
2 yy
3 zz
The join would return only the record for EmployeeId #3. Also to add fuel to the fire DataSource1 is a sequential file and DataSource2 is an ODBC stage. So far I've created a sequential file going into a Transformer and the ODBC going to a hashed file and into the same Transformer. I'm trying to make the Transformer handle the join logic but it's not working.
If anyone has any ideas I would really appreciate it.
Since this is a server job, you really should have posted it on the server job forum.
The most obvious approach would be to load the contents of the text file into a hashed file with two key columns and no non-key columns.
The job to isolate the differences then extracts rows from your database (ODBC) table, performs a lookup (in a Transformer stage) against that hashed file, and sends the original row out of the Transformer only if the lookup succeeds (that is, if the reference input link's NOTFOUND variable is true). You handle this using a constraint expression on the output link from the Transformer stage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Thanks chuck and ray. I'm really learning a lot here. I had the lookup going to the ODBC when it was suppose to go to the sequential file. I also added a hashed file to be created from the sequential file for faster lookups. I then added the correct NOTFOUND logic to the Transformer's contraints and now by looking at the rows moving from stage to stage it looks to be working better. But it's not perfect. I'm getting an error going from my FTP plugin (thats the sequential file) to my hashed file:
IncrementalOperations..HAOldOperations.IDENT2: Unbalanced or unescaped quote character
At row 12457, link "FTPOut", while processing column "DESCRIPTION"
Unbalanced or unescaped quote character
I've looked at the sequential file at row 12457 and found nothing strange. Then when I look at the data in the "View Data..." of the Hashed File stage Inputs tab I see that the data is not in the same order as the sequential file before the FTP plug in. I have the file name in the has stage set to the same for the input and output. Is this correct or am I suppose to point the input to the incoming FTP filename?
So my question is how do I know what record is causing the problem. Is it record 12457 of the sequential file or is it record 12457 of the re-ordered hash file?
Sorry about posting in the wrong place. I'll get it right next time. Mods feel free to move this post to the correct forum.
[/code]
Tell the ftp stage that you have one unknown or varchar column, and set your quote character and column delimiter character to 000.
Let's see what actually appears in the sequential file.
You could also telnet to the location of the source file and use vi (or head, tail, and od) to look at the questionable row. I still think it has an embedded quote in the description text.
using the debug with breakpoints and watches set up I was able to find the bad record. There was a quote (") in the middle of a column in the sequential file. Actually I found a few of them which are there because the file is comma delimited but has quotes around each column. This file is created using the Microsoft ODBC text driver and it places them by default around each column.
Has anyone encountered the handling of quotes incoming from an ODBC source stage. I have an ODBC connection that will create a sequential file in the end. But some CHAR fields have quotes inside of them and it's creating a file with quotes which then cannot be used as an input to a hashed file like below:
As you'll notice Matthews is in quotes but it's part of the entire second column, Name. Any way around Ascential thinking that the quotes within the column Name don't mean the field is ending. I thought Ascential would know that in a comma delimited file the column doesn't end until you hit the comma.
Make sure you use the "Get SQL Info" button on the ODBC stage so that it properly registers the information about the ODBC driver in use. You'll see the Quote character text box right next to it probably change values now that it knows more about the ODBC driver.
Kenneth Bland
Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
Thanks for your input. Unfortunately I had already done that. I did it again anyways and it left the quote character as a double quote("). I still have the sequential file as I posted previously. It still has quotes within the fields.
to strip out the quotes. You still could run into problems with commas within strings. By the way, that second argument in the Trim() is single quote, double quote, single quote.
Good to hear from you again. Actually I did try the QUOTE = 000 but just like you said now my problem is comma's inside the fields. So it's a constant problem. Either with commas or double quotes. I didn't try the Trim function though, because I would like to keep the fields as they come from the ODBC. So if they have a double quote or a comma I want to keep those characters in the resulting sequential file. How can I accomplish this?
If you eliminate SEQ 2, the problem is gone. Otherwise, try changing the quote character in SEQ 2 to something other than double quote, perhaps a pipe.