File check

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

File check

Post by prasad v »

Hi

We have a couple of files that are used as sources in DataStage. Before starting the DataStage jobs, we need to check whether each file contains specific records, for example two records where one starts with 1 and the other starts with 2. Can we do this in DataStage, or should we use a UNIX script? I think a UNIX script is the better option, but I am not sure how to write it.

Can anyone help with this?
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

Could you please elaborate on your requirement?

If you want to go with UNIX, you can call the UNIX command or script from the before-job subroutine of a PX job using ExecSH, or from an Execute Command activity in a sequence job.
hi sam here
prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

Thanks for your reply.

Yes, we can do that if we have a script. My problem is the script itself.

I do not know UNIX well, so I am hoping somebody already has a script of this type.

My file has 3 types of records, each identified by the first character of the record, like this:
0prasad
1prasad1
2prasad2

I need to process in DataStage only the files that have the 0 and 2 records.

So I need to check the files before starting the DataStage jobs.
samsuf2002
Premium Member
Posts: 397
Joined: Wed Apr 12, 2006 2:28 pm
Location: Tennesse

Post by samsuf2002 »

You can use this syntax to get the first character of a line in a file.

Code: Select all

sed '2!d' testinput.dat | cut -c1
I am using 2 on the assumption that the first line holds the column names.

You can use this in a script.
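
As a rough sketch of how that could be wrapped for reuse in a script: the helper name `first_char_of_line` is made up for illustration, and it uses `sed -n 'Np'`, which prints the same line as `sed 'N!d'`.

```shell
# Hypothetical helper: print the first character of line N of a file.
# $1 = line number, $2 = file name
first_char_of_line() {
    sed -n "${1}p" "$2" | cut -c1
}
```

For example, `first_char_of_line 2 testinput.dat` should behave like the one-liner above.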
hi sam here
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

You need to further clarify your requirement:

Do you need to process the file ONLY if it has both types '0' and '2' records? Or do you need to process only the type '0' and/or '2' records that may exist in the file?
- james wiles


All generalizations are false, including this one - Mark Twain.
prasad v
Participant
Posts: 174
Joined: Mon Mar 30, 2009 2:18 am

Post by prasad v »

Actually, I have 4 files that are used by different processes.

Each file must contain header, data and trailer records. So we need to check that every file has a header and a trailer before triggering the other processes. If any file does not meet the requirement, we will not start the other processes.

My design in DataStage:

Command Activity-->Job Activity-->Job Activity

If the files do not meet the requirement, the sequence should abort, and the UNIX script used in the Command Activity should create a log file containing the names of the files that failed the check.
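
A minimal sketch of what such a script might look like, not a tested production version. It assumes the header is the first line and starts with "0", and the trailer is the last line and starts with "2"; the function name `check_files` and the log file argument are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions: header = first line, begins with "0";
# trailer = last line, begins with "2".
# check_files LOGFILE FILE... writes the name of every failing file to
# LOGFILE and returns non-zero if any file failed.
check_files() {
    log=$1
    shift
    status=0
    : > "$log"                       # start with an empty log
    for f in "$@"; do
        first=$(head -n 1 "$f" 2>/dev/null | cut -c1)
        last=$(tail -n 1 "$f" 2>/dev/null | cut -c1)
        if [ "$first" != "0" ] || [ "$last" != "2" ]; then
            echo "$f" >> "$log"      # missing or invalid file
            status=1
        fi
    done
    return $status
}
```

It could be called as `check_files bad_files.log file1 file2 file3 file4 || exit 1`, so that a non-zero exit status tells the sequence not to continue. The file names here are placeholders.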
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

I would suggest you create your first job to filter out the three types of records in three jobs. Then run a UNIX-level test to see whether the header and trailer exist. This can be achieved with a simple 'wc -l'.
If the output is > 0, branch to the execution of the rest of your jobs; otherwise branch to the end.
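
A sketch of the same idea done directly on the source file. Note that it swaps in `grep -c` to count matching lines instead of running `wc -l` on pre-split files, and that it assumes "0" marks the header and "2" the trailer; `has_header_and_trailer` is a hypothetical name.

```shell
# Sketch: succeed only if the file has at least one record starting
# with "0" (header) and at least one starting with "2" (trailer).
has_header_and_trailer() {
    headers=$(grep -c '^0' "$1" 2>/dev/null)
    trailers=$(grep -c '^2' "$1" 2>/dev/null)
    # Treat an unreadable file as having zero matching records.
    [ "${headers:-0}" -gt 0 ] && [ "${trailers:-0}" -gt 0 ]
}
```

This only checks that such records exist somewhere in the file, not that they are the first and last lines.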
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

So '0' is the header, '2' is the trailer? '0' is always the first record and '2' is always the last (when they are present)?

If this is the case, you can use the head and tail unix commands in a command activity to check the file.

Code: Select all

head -c 1 #filename#;tail -n 1 #filename#|head -c 1
This will return a single line containing the first byte of the first record and the first byte of the last record. If both header and trailer are there, the command will return "02" as the CommandOutput (no quotes). You can check the value with a Nested Condition stage in the sequence.

filename is a job parameter for the sequence.
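
To try the command outside DataStage, here is a small sketch with the job parameter replaced by a shell argument. The function name `first_and_last_byte` is made up for illustration, and `head -c` is assumed to be available (it is on GNU and BSD systems, but it is not part of POSIX).

```shell
# Sketch: print the first byte of the first line followed by the first
# byte of the last line, then a newline.
first_and_last_byte() {
    head -c 1 "$1"                   # first byte of the file
    tail -n 1 "$1" | head -c 1       # first byte of the last line
    echo
}
```

With the three sample records posted earlier in the thread, this prints 02. One caveat: if the file ends with a blank line, the last line is empty and the check will not see the trailer.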
- james wiles


All generalizations are false, including this one - Mark Twain.