need unix script
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
Hi,
I am loading rejected records into a sequential file. Once the data is loaded, I want to separate the records based on timestamp. The reject file has a column for the timestamp.
Could anyone help me write a UNIX script to partition the file based on timestamp after the DataStage job has finished?
Thanks,
Sridhar
k.v.sreedhar
Need more info:
Please give an example of a reject record including the timestamp column.
How are you wanting to decide on the partitions? Based on timestamp range? If so, how are you deciding ranges? Based on row counts, etc.?
Choose a job you love, and you will never have to work a day in your life. - Confucius
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
Hi,
Thanks for your reply.
I want to partition it based on timestamp.
After the Lookup, whatever records fail the lookup go to a Transformer through the reject link. In the Transformer I create a timestamp and send it to the reject file.
So, I want to partition the file based on that timestamp.
Because DataStage doesn't have file rollover based on size, we have to load the file first, and only after the job has finished can we partition it with a UNIX script (as far as I know).
Thanks,
Sridhar
k.v.sreedhar
Related to this post I assume:
viewtopic.php?p=419960
Not sure the timestamp is really going to help you chunk this up, why not look into something like the UNIX split command?
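As a rough illustration of the split suggestion, here is a minimal sketch that cuts a file into fixed-size pieces by line count. The file name rejects.txt, the prefix rejects_part_ and the chunk size of 10 lines are all made up for the example; adjust them to your own reject file.

```shell
# Sketch only: file names and chunk size are hypothetical.
workdir=$(mktemp -d)

# Build a small sample reject file with 25 lines.
i=1
while [ "$i" -le 25 ]; do
    echo "reject record $i"
    i=$((i + 1))
done > "$workdir/rejects.txt"

# Cut it into pieces of at most 10 lines each.
# Output files are named rejects_part_aa, rejects_part_ab, ...
split -l 10 "$workdir/rejects.txt" "$workdir/rejects_part_"

# Show how many lines ended up in each piece.
wc -l "$workdir"/rejects_part_*
```

Note that split knows nothing about the timestamp column; it only counts lines (or bytes, with other options), which is why it suits size-based rollover rather than per-run separation.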
-craig
"You can never have too many knives" -- Logan Nine Fingers
You still have to decide and explain what you mean by partitioning on the timestamp.
Example:
- All timestamps ending in an odd number go to file1, ending in an even number go to file2... (like modulus partitioning in DataStage).
- All timestamps from hour 1:00am to 2:00am go to file1, 2:00am to 3:00am go to file2, etc. (like range partitioning in DataStage).
Read my first reply again and give an example.
Choose a job you love, and you will never have to work a day in your life. - Confucius
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
Hi,
I am passing the job start timestamp to the reject file.
I want to split the file based on that job start timestamp.
If the job runs now and then runs again a few minutes later, I want to split the rejects into two files, because the job ran twice.
That is, I want a separate file for each job run.
I have DSJobStartTimestamp to keep track of each job run.
k.v.sreedhar
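For reference, the per-run split described above could be sketched with awk roughly as follows. Everything about the file layout is an assumption for illustration: a pipe-delimited file called rejects.txt, with the job start timestamp in the last column in the form YYYY-MM-DD HH:MM:SS. The output names reject_<digits>.txt are likewise hypothetical.

```shell
workdir=$(mktemp -d)

# Hypothetical sample: pipe-delimited, job start timestamp in the last column.
cat > "$workdir/rejects.txt" <<'EOF'
101|bad key|2012-04-05 14:30:00
102|bad key|2012-04-05 14:30:00
103|no match|2012-04-05 14:42:10
EOF

# For each record, strip everything but digits from the timestamp and
# append the record to a file named after the result.
awk -F'|' -v dir="$workdir" '
{
    ts = $NF
    gsub(/[^0-9]/, "", ts)          # "2012-04-05 14:30:00" -> "20120405143000"
    out = dir "/reject_" ts ".txt"
    if (out != prev) {              # close the previous file when the run changes
        if (prev != "") close(prev)
        prev = out
    }
    print >> out
}' "$workdir/rejects.txt"

ls "$workdir"
```

Records are appended rather than overwritten, so rows from the same run still land in the same file even if runs are interleaved in the input.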
In that case, you don't really need to take the timestamp into consideration. You could call an after-job script to rename the reject file immediately after each job run completes. It could increment the file name's extension by a number or it could name the file so the file name contains a timestamp. In fact, you can do that within the job itself with no need for an after-job script or unix script.
Try including the DSJobStartTimestamp macro within the reject file name:
Code: Select all
/path/reject_#DSJobStartTimestamp#.txt
Choose a job you love, and you will never have to work a day in your life. - Confucius
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
You don't need a script for that, just the 'move' command does a rename in UNIX. At its most basic:
mv <old_name> <new_name>
You'd have to be more specific as to exactly how you want it renamed to get more specific help. You could also pass in the 'timestamp' to use as a Job Parameter and then include it in the output filename of the Sequential File stage rather than renaming it after job.
What happened to splitting up the file?
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
Are you sure you're ok with that? From what I recall, the output from that macro has spaces in it, which can confuse things when used raw like that in a filename. The colons can also cause issues... but if it works for you like that, then we good.
-craig
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 41
- Joined: Sun Mar 09, 2008 8:12 pm
You can view it in UNIX, you'd just need to enclose the filename (spaces and all) in single quotes to do so. In order to remove the spaces, you'd need something to retrieve the current timestamp before the job starts and then use a routine to remove everything but the numbers. A Sequence job could be leveraged to capture and format the timestamp and then pass it to the job as a parameter, then you'd use that parameter in the filename rather than the macro.
You could also write to a static filename and then rename it after job to include a timestamp and it's simple to build that one without any 'punctuation' in it. Sorry, but I don't have the syntax for that off the top of my head but I'm sure someone does. The only 'issue' with that approach is you must do all your file viewing / validation from UNIX as View Data from inside the job would never find the file it wrote to, seeing as how that name no longer exists.
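A small sketch of both points, quoting a name that contains spaces and colons, and stripping a timestamp down to digits. The timestamp value and the reject_ file names are hypothetical; in particular, the exact format the macro produces is assumed here, not confirmed.

```shell
workdir=$(mktemp -d)

# Hypothetical value in the style "YYYY-MM-DD HH:MM:SS"; the real macro
# output format is an assumption.
ts='2012-04-05 14:30:00'

# A name with spaces and colons works only if it is quoted every time.
touch "$workdir/reject_$ts.txt"
ls -l "$workdir/reject_2012-04-05 14:30:00.txt"

# Keep only the digits to get a name that needs no quoting.
clean_ts=$(printf '%s' "$ts" | tr -cd '0-9')
mv "$workdir/reject_$ts.txt" "$workdir/reject_$clean_ts.txt"

echo "$clean_ts"    # prints 20120405143000
```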
-craig
"You can never have too many knives" -- Logan Nine Fingers
From a UNIX script, an after-job command, or a Sequence job Execute Command stage, you can rename (move) the file using UNIX date command syntax. For example, to rename file.txt to a new file name formatted as file_YYYYMMDD.txt, use the date command within backticks (the date command is executed, and its output is substituted in place of the backticked text):
Code: Select all
mv file.txt file_`date +%Y%m%d`.txt
The result is a file name such as:
file_20120405.txt
You can include time format options as well if you want to. Read the UNIX manual page for the date command. UNIX command line:
Code: Select all
man date
Choose a job you love, and you will never have to work a day in your life. - Confucius
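Building on the rename idea, here is a hedged sketch of an after-job style rename that also includes the time, so that two runs on the same day get different names. The static name reject.txt and the temporary directory are placeholders; the stamp layout YYYYMMDD_HHMMSS is just one possible choice.

```shell
workdir=$(mktemp -d)
src="$workdir/reject.txt"       # hypothetical static file name written by the job
echo "sample reject record" > "$src"

# Build a punctuation-free stamp: YYYYMMDD_HHMMSS
stamp=$(date +%Y%m%d_%H%M%S)
dest="$workdir/reject_$stamp.txt"

# Rename only if the source exists and the target name is still free.
if [ -f "$src" ] && [ ! -e "$dest" ]; then
    mv "$src" "$dest"
fi

ls "$workdir"
```

The stamp here is the time of the rename, not the job start time, so it will differ slightly from DSJobStartTimestamp.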