compare records in file with rows in DB
Moderators: chulett, rschirm, roy
compare records in file with rows in DB
I have a complex flat file that starts with a header that includes the number of records in the file. I want to use this information to check both if the number of records in the file matches the header and when readeing these records into a database, see if I get all records i.e if the number of rows mathes the number of records in the header.
If any of these conditions fail I want to abort and alert. Is this possible? Is there any "smart" way of doing it?
I'm something of a newbie using datastage so please try to explain without using to many strange two letters
Best Regards
/Elin
If any of these conditions fail I want to abort and alert. Is this possible? Is there any "smart" way of doing it?
I'm something of a newbie using datastage so please try to explain without using to many strange two letters
Best Regards
/Elin
There are several ways of doing this. You can use ExecuteCommand activity to read the Header information, and pass it as an parameter to the job which read the Database. The job can read the database and count using Aggregator or simple count using using Database function and pass it to transformer.
In transformer, you can pass the row if Input parameter <> count of database records. And you can use "Abort after n rows" option to abort the job.
For file check, you can do the same or you can directly do (wc -l <filename>)-1 and compare with the header.
In transformer, you can pass the row if Input parameter <> count of database records. And you can use "Abort after n rows" option to abort the job.
For file check, you can do the same or you can directly do (wc -l <filename>)-1 and compare with the header.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ExecuteCommand that available in JobSequence can be used to run OS commands. You can either call any shell script, or directly run simple comands like this.
Under the ExecComand Tab, you'll find two options.
1. Command - The actual command that you going to use.
2. Parameters - The list of parameters that need to be passed to the command.
For geting use to the stage, just type and check the log. You ll find the command output.
Then try with parameteres passes for file name.
Command will be and the file name parameter will be placed in Parameters options as #Filename#
Try this, go thorugh the given manual, do a search. And get back if you have any issues. And get back with the logic that you have chosed to follow.[/code]
Under the ExecComand Tab, you'll find two options.
1. Command - The actual command that you going to use.
2. Parameters - The list of parameters that need to be passed to the command.
For geting use to the stage, just type
Code: Select all
wc -l {filename with fully qualified path}
Then try with parameteres passes for file name.
Command will be
Code: Select all
wc -l
Try this, go thorugh the given manual, do a search. And get back if you have any issues. And get back with the logic that you have chosed to follow.[/code]
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
My tought was to first read the first line from the file using dataStage, and load it into a new file that I then sould use to compare with the whole set. Is this a bad option, the reason is that I have no idea how to read the file using a command. How do I write the command file? What programming language and so on?
Thanks again!
Thanks again!
Sorry, just now I noticed that you are on Windows. I was giving you unix commands. If you have MKS toolkit installed, you could even execute Unix commands.
For you case, the options that you have choosed will be simple to construct using Datastage.
If you wish to use ExecuteCommand activity for passing parameters by reading the value of file, you can use TYPE command in that. Its basically DOS commands.
For you case, the options that you have choosed will be simple to construct using Datastage.
If you wish to use ExecuteCommand activity for passing parameters by reading the value of file, you can use TYPE command in that. Its basically DOS commands.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
sounds good that my way is possible, but is it an ugly way of solving it? I want it to be as generic as possible. For the time being we are only to use the number of rows but the header contains a lot more information that could be useful.
The line I want to read looks something like this:
1010LA 200701012007013120070201000000233
and the number I want to use are the last ones 233 in this case. But also the other numbers contain information. My thougth is to put ever distinct information in each column and then continue with the rest of the file into another database.
The line I want to read looks something like this:
1010LA 200701012007013120070201000000233
and the number I want to use are the last ones 233 in this case. But also the other numbers contain information. My thougth is to put ever distinct information in each column and then continue with the rest of the file into another database.
You ll have a lay out for the header. Read the header using that layout and restrict just the first row through Transformer (@INROWNUM =1) and store it in a database or in a File with the field name given to it.
The other two job will read this file, along with that regular file or Table. It will compare the values of the total number of rows present in the file as well as value that present in the header file. IF both are same, you can proceed with you flow, else you can raise the abort signal.
The other two job will read this file, along with that regular file or Table. It will compare the values of the total number of rows present in the file as well as value that present in the header file. IF both are same, you can proceed with you flow, else you can raise the abort signal.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
I'm terribly sorry to bother you again but I just realaised that althoug I am using a windows client for running datastage. The server is actually a unix server and the batch program should therefore be for UNIX.
A quick question therefore, can't find it on google: How do I write the syntax for grepping a whole line, if the first letter on the line is a 1 ? If it is a 2 I want to skip the line.
Br
/Elin
A quick question therefore, can't find it on google: How do I write the syntax for grepping a whole line, if the first letter on the line is a 1 ? If it is a 2 I want to skip the line.
Br
/Elin
To fetch the first line alone in unix, use.
For you question, use
Code: Select all
head -1 <file name>
For you question, use
Code: Select all
grep "1" <filename>.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Note that, the given regular expression can handle only character basis and not a whole string. If you need to grep two or three character to gether, you can try,trobinson wrote:grep ^1 <filename>
grep ^[1][2][3] <filename>.
where it looks for the line which starts with 123
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'