need unix script

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

need unix script

Post by bobbysridhar »

Hi,
I am loading rejected records into a sequential file. Once the data is loaded, I want to separate the records based on timestamp. The reject file has a timestamp column.
Could anyone help me write a UNIX script to partition the file based on timestamp after the DataStage job has finished?

Thanks,
Sridhar
k.v.sreedhar
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Need more info:

Please give an example of a reject record including the timestamp column.

How are you wanting to decide on the partitions? Based on timestamp range? If so, how are you deciding ranges? Based on row counts, etc.?
Choose a job you love, and you will never have to work a day in your life. - Confucius
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

Hi,
Thanks for your reply.
I want to partition it based on timestamp.
After the Lookup, records that fail the lookup go to a Transformer through the reject link. In the Transformer I create a timestamp and send it to the reject file.
DataStage has no file rollover based on size, so as far as I know we have to load the file first and can only partition it with a UNIX script after the job has finished.
So, I want to partition the file based on that timestamp.

Thanks,
Sridhar
k.v.sreedhar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Related to this post I assume:

viewtopic.php?p=419960

Not sure the timestamp is really going to help you chunk this up. Why not look into something like the UNIX split command?
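
For readers who have not used split before, here is a minimal sketch of chunking a file by line count. The file name, the temporary directory, and the 10-line chunk size are all assumptions made up for illustration; they are not from the original job.

```shell
# Hypothetical working directory and reject file, for illustration only.
workdir=$(mktemp -d)
reject_file="$workdir/reject.txt"

# Build a small sample reject file of 25 lines.
i=1
while [ "$i" -le 25 ]; do
    echo "record $i" >> "$reject_file"
    i=$((i + 1))
done

# Split into pieces of at most 10 lines each. split appends suffixes
# aa, ab, ac, ... to the prefix given as the last argument.
split -l 10 "$reject_file" "$workdir/reject_part_"

ls "$workdir"
```

With 25 input lines this should leave three pieces next to the original file. Note that split cuts purely by line count and knows nothing about the timestamp column.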
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

You still have to decide and explain what you mean by partitioning on the timestamp.

Example:

- All timestamps ending in an odd number go to file1, ending in an even number go to file2... (like modulus partitioning in DataStage).

- All timestamps from hour 1:00am to 2:00am go to file1, 2:00am to 3:00am go to file2, etc. (like range partitioning in DataStage).
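
The range-style case above could be sketched with awk roughly as follows. This assumes a comma-delimited reject file whose last column is a timestamp in the form YYYY-MM-DD HH:MM:SS; the sample rows, paths, and output file names are hypothetical, since no real reject record has been posted yet.

```shell
# Hypothetical sample data: comma-delimited, timestamp in the last column.
workdir=$(mktemp -d)
reject_file="$workdir/reject.txt"
mkdir "$workdir/out"
printf '%s\n' \
    '101,bad key,2012-04-05 01:15:00' \
    '102,bad key,2012-04-05 01:45:10' \
    '103,bad key,2012-04-05 02:05:30' > "$reject_file"

# Route each row to a file named after the date and hour of its timestamp.
awk -F',' -v outdir="$workdir/out" '{
    key = substr($NF, 1, 13)        # date and hour, e.g. "2012-04-05 01"
    gsub(/[^0-9]/, "", key)         # keep digits only, e.g. "2012040501"
    out = outdir "/reject_" key ".txt"
    print >> out                    # append the whole row to that file
    close(out)                      # avoid running out of open file handles
}' "$reject_file"

ls "$workdir/out"
```

Using the full timestamp as the key instead of the first 13 characters would give one file per distinct timestamp value rather than one per hour.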

Read my first reply again and give an example.
Choose a job you love, and you will never have to work a day in your life. - Confucius
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

Hi,
I am passing the job start timestamp (DSJobStartTimestamp) to the reject file.
I want to split the file based on that timestamp.
If the job runs now and then runs again a few minutes later, I want the records split into two files, because the job ran twice.
In other words, I want a separate file for each job run.
I have DSJobStartTimestamp to keep track of each job run.
k.v.sreedhar
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

In that case, you don't really need to take the timestamp into consideration. You could call an after-job script to rename the reject file immediately after each job run completes. It could increment the file name's extension by a number or it could name the file so the file name contains a timestamp. In fact, you can do that within the job itself with no need for an after-job script or unix script.

Try including the DSJobStartTimestamp macro within the reject file name:

/path/reject_#DSJobStartTimestamp#.txt
Choose a job you love, and you will never have to work a day in your life. - Confucius
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

Hi,
Thanks for your reply.
Do we need to pass DSJobStartTimestamp as a job parameter in order to use it in the reject file name?
Please explain how the file name gets the timestamp at runtime and what needs to be done to achieve this.
k.v.sreedhar
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

Could somebody please provide an after-job script to rename the file immediately after each job run?
k.v.sreedhar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You don't need a script for that, just the 'move' command does a rename in UNIX. At its most basic:

mv <old_name> <new_name>

You'd have to be more specific as to exactly how you want it renamed to get more specific help. You could also pass in the 'timestamp' to use as a Job Parameter and then include it in the output filename of the Sequential File stage rather than renaming it after job.

What happened to splitting up the file?
-craig

"You can never have too many knives" -- Logan Nine Fingers
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

Thank you, guys.
I was able to resolve it by putting the timestamp in the file name.
Each job run now creates a new file with an updated timestamp.
I set the file name to \pathaname\#DSJobStartTimeStamp#

thanks again
k.v.sreedhar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Are you sure you're ok with that? From what I recall, the output from that macro has spaces in it, which can confuse things when used raw like that in a filename. The colons can also cause issues... but if it works for you like that, then we're good.
-craig

"You can never have too many knives" -- Logan Nine Fingers
bobbysridhar
Premium Member
Posts: 41
Joined: Sun Mar 09, 2008 8:12 pm

Post by bobbysridhar »

It would be great if we could remove the spaces. Right now I am not able to view the data in the UNIX environment; I can only view it from the Sequential File stage in the DataStage job after the job runs.
k.v.sreedhar
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You can view it in UNIX, you'd just need to enclose the filename (spaces and all) in single quotes to do so. In order to remove the spaces, you'd need something to retrieve the current timestamp before the job starts and then use a routine to remove everything but the numbers. A Sequence job could be leveraged to capture and format the timestamp and then pass it to the job as a parameter, then you'd use that parameter in the filename rather than the macro.
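
As a rough illustration of both points, the sketch below first reads a file whose name contains a space and colons by quoting it, then renames it so that only the digits of the timestamp remain. The file name is a made-up stand-in for whatever the macro actually produced, and the temporary directory is an assumption for the demo.

```shell
# Hypothetical file name resembling a raw timestamp with a space and colons.
workdir=$(mktemp -d)
raw_name='reject_2012-04-05 10:30:00.txt'
echo "sample reject row" > "$workdir/$raw_name"

# Quoting keeps the shell from splitting the name at the space.
cat "$workdir/reject_2012-04-05 10:30:00.txt"

# Keep only the digits of the original name to build a cleaner one.
stamp=$(printf '%s' "$raw_name" | tr -cd '0-9')
clean_name="reject_${stamp}.txt"
mv "$workdir/$raw_name" "$workdir/$clean_name"

ls "$workdir"
```

Stripping to digits this way only works if the rest of the name has no digits of its own, which is true for the made-up name here but may not be for a real one.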

You could also write to a static filename and then rename it after job to include a timestamp and it's simple to build that one without any 'punctuation' in it. Sorry, but I don't have the syntax for that off the top of my head but I'm sure someone does. The only 'issue' with that approach is you must do all your file viewing / validation from UNIX as View Data from inside the job would never find the file it wrote to, seeing as how that name no longer exists.
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

From a UNIX script, an after-job command, or a Sequence job Execute Command stage, you can rename (move) the file using UNIX date command syntax. For example, to rename file.txt to a new file name formatted as file_YYYYMMDD.txt, put the date command inside backticks (tick marks); the date command is executed and its output is substituted in place of the backticked text:

mv file.txt file_`date +%Y%m%d`.txt

file_20120405.txt
You can include time format options as well if you want to. Read the UNIX manual page for the date command. UNIX command line:

man date
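
A sketch with the time included, so that two runs on the same day get different names. The directory and file name are placeholders for the demo, and the %H%M%S format is just one possible choice. It uses $( ) for command substitution, which behaves like the tick marks above.

```shell
# Hypothetical working directory and static output file name.
workdir=$(mktemp -d)
echo "sample reject row" > "$workdir/file.txt"

# Capture the timestamp once so the same value can be reused later,
# for example in a log message. The format has no spaces or colons.
stamp=$(date +%Y%m%d_%H%M%S)
mv "$workdir/file.txt" "$workdir/file_${stamp}.txt"

ls "$workdir"
```

If the job can run more than once within the same second, this naming would collide, so something extra (a counter, for instance) would be needed in that case.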
Choose a job you love, and you will never have to work a day in your life. - Confucius