File check

Posted: Tue Apr 12, 2011 6:45 am
by prasad v
Hi

We have a couple of files that are used as sources in DataStage. Before starting the DataStage jobs, we need to check whether each file contains specific records, for example two records where one starts with 1 and the other starts with 2. Can we do this in DataStage, or should we go for a Unix script? I think a Unix script is the better option, but I am not sure how to write it.

Can anyone help with this?

Posted: Tue Apr 12, 2011 11:48 am
by samsuf2002
Could you please elaborate on your requirement?

If you want to go with UNIX, you can call the UNIX command or script from the before-job subroutine of a PX job using ExecSH, or from a Command Activity stage in a sequence job.

Posted: Tue Apr 12, 2011 12:16 pm
by prasad v
Thanks for your reply.

Yes, we can do that if we have a script. My problem is only with the script itself.

I don't know much about Unix. I am hoping somebody already has a script of this type.

My file has 3 types of records. The record type is identified by the first character of the record, like this:
0prasad
1prasad1
2prasad2

Here I need to process, in DataStage, the files that have the 0 and 2 records.

For this I need to check the files before starting the DataStage jobs.

Posted: Tue Apr 12, 2011 12:49 pm
by samsuf2002
You can use this syntax to get the first value in a line of a file.

Code:

sed '2!d' testinput.dat | cut -c1
I am using 2 on the assumption that the first line holds the column names.

You can use this in a script.

Posted: Tue Apr 12, 2011 1:17 pm
by jwiles
You need to further clarify your requirement:

Do you need to process the file ONLY if it has both type '0' and type '2' records? Or do you need to process only the type '0' and/or type '2' records that may exist in the file?

Posted: Tue Apr 12, 2011 1:39 pm
by prasad v
Actually I have 4 files, which are used by different processes.

The files must contain Header, data and Trailer records. So we need to check whether each file has a Header and a Trailer before triggering the other processes. If any file does not meet the requirement, we will not start the other processes.

My view in DataStage:

Command Activity-->Job Activity-->Job Activity

If the files do not meet the requirement, this sequence should abort, and the Unix script used in the Command Activity should create a log file containing the names of the files that do not meet the requirement.
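A minimal sketch of such a script, assuming '0' marks the header as the first record and '2' marks the trailer as the last record. The function names and the log file location are hypothetical, not something from this thread:

```shell
#!/bin/sh
# Sketch only: function names and log handling are illustrative assumptions.

# Succeeds when the first record starts with '0' and the last starts with '2'.
has_header_and_trailer() {
    [ -r "$1" ] || return 1
    first=$(head -n 1 "$1" | cut -c1)
    last=$(tail -n 1 "$1" | cut -c1)
    [ "$first" = "0" ] && [ "$last" = "2" ]
}

# Usage: check_all LOGFILE FILE...
# Writes the name of each failing file to LOGFILE; returns 1 if any file failed.
check_all() {
    log=$1
    shift
    : > "$log"          # start with an empty log
    status=0
    for f in "$@"; do
        if ! has_header_and_trailer "$f"; then
            echo "$f" >> "$log"
            status=1
        fi
    done
    return $status
}
```

If the script ends with a call such as `check_all /path/to/check.log file1 file2 file3 file4` (placeholder paths), its exit status is non-zero whenever a file fails, so the sequence can branch or abort on that status. An empty or unreadable file is treated as failing.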

Posted: Tue Apr 12, 2011 3:11 pm
by DSguru2B
I would suggest you create your first job to filter out the three types of record in three jobs. Then run a Unix-level test to see if the header and trailer exist. This can be done with a simple 'wc -l'.
If the output is > 0, branch to the execution of the rest of your jobs; otherwise branch to the end.
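As a rough illustration of the counting test, the sketch below counts records by their first character directly on the source file. It uses `grep -c` in place of a separate filter followed by `wc -l`; the function name `count_records` and the temporary sample file are assumptions for the example only:

```shell
# count_records PREFIX FILE: prints how many lines of FILE start with PREFIX.
# grep -c exits non-zero when the count is 0, hence the "|| true".
count_records() {
    grep -c "^$1" "$2" || true
}

# Illustrative use with the sample records from this thread.
sample=$(mktemp)
printf '0prasad\n1prasad1\n2prasad2\n' > "$sample"

headers=$(count_records 0 "$sample")
trailers=$(count_records 2 "$sample")

if [ "$headers" -gt 0 ] && [ "$trailers" -gt 0 ]; then
    echo "header and trailer present"
else
    echo "header or trailer missing"
fi
rm -f "$sample"
```

Note that a count only shows that such records exist somewhere in the file; it does not check that the header is first and the trailer is last.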

Posted: Tue Apr 12, 2011 3:45 pm
by jwiles
So '0' is the header, '2' is the trailer? '0' is always the first record and '2' is always the last (when they are present)?

If this is the case, you can use the head and tail unix commands in a command activity to check the file.

Code:

head -c 1 #filename#;tail -n 1 #filename#|head -c 1
This will return a single line containing the first byte of the first record and the first byte of the last record. If both header and trailer are there, the command will return "02" as the CommandOutput (no quotes). You can check the value with a Nested Condition stage in a sequencer.

filename is a job parameter for the sequence.
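To try the command outside DataStage, here is a small sketch that wraps it in a function and runs it on the sample records from this thread. The function name and the temporary file are illustrative assumptions; in the sequence, #filename# is replaced by the parameter value:

```shell
# Prints the first byte of the first record followed by the first byte
# of the last record, with no newline in between.
first_and_last_byte() {
    head -c 1 "$1"
    tail -n 1 "$1" | head -c 1
}

sample=$(mktemp)
printf '0prasad\n1prasad1\n2prasad2\n' > "$sample"

result=$(first_and_last_byte "$sample")
if [ "$result" = "02" ]; then
    echo "header and trailer found"
else
    echo "unexpected result: $result"
fi
rm -f "$sample"
```

A file whose last record is a data record would give "01" here, so comparing against "02" catches a missing trailer as well as a missing header.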