Shell script

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Shell script

Post by Bilwakunj »

Hi,
I'm learning UNIX shell scripting, so I'm wondering what kinds of shell scripts are typically required in PX. I guess it differs with the requirements, but I'm trying to find typical examples that I can practise on. Can anybody please shed some light on this? It would be really helpful for me. Also, are there any practice examples I can try, or a site where I can find practice exercises? Please let me know.


Thanks,
bilwakunj
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Think along the lines of relatively short scripts that use data manipulation languages and commands such as awk and sed.
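As a rough illustration of the kind of short awk and sed script meant here, the sketch below cleans a delimited file before it is loaded. The file layout (pipe-delimited id|name|amount with a header line) and the function name clean_feed are assumptions made up for the example, not something from this thread.

```shell
#!/bin/sh
# Hypothetical pre-load clean-up for a pipe-delimited feed file.
# Assumed layout (not from the thread): id|name|amount, with one header line.

clean_feed() {
    # tr:  drop DOS carriage returns.
    # sed: strip trailing whitespace from every line.
    # awk: skip the header, drop records whose id is empty, and print
    #      the id plus the amount formatted to two decimal places.
    tr -d '\r' < "$1" |
        sed 's/[[:space:]]*$//' |
        awk -F'|' 'NR > 1 && $1 != "" { printf "%s|%.2f\n", $1, $3 }'
}
```

Calling clean_feed with a file name writes the cleaned records to standard output, so the result can be redirected to a new file or piped into another command.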
Eric
Participant
Posts: 254
Joined: Mon Sep 29, 2003 4:35 am

Post by Eric »

You don't "need" any shell scripts for PX as shell scripts won't support the parallelism and thus performance that PX can offer.

Having said that, you might want to use shell scripts to start jobs with "dsjob", which you can then schedule with "cron".

Also, start reading about the PX stages called External Source and External Target.
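A minimal sketch of such a "dsjob" wrapper is shown below. It assumes that dsjob is on the PATH (the DataStage environment has already been sourced) and that "dsjob -run -jobstatus" exits with the job's finishing status, with 1 meaning finished OK and 2 meaning finished with warnings; check this against the documentation for your release. The function name run_job, the DSJOB variable, and the project, job and path names in the comments are hypothetical.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper around dsjob.
# Assumption: with -jobstatus, dsjob waits for the job and exits with its
# finishing status (1 = finished OK, 2 = finished with warnings).

DSJOB=${DSJOB:-dsjob}   # overridable, so the wrapper can be tried without DataStage

run_job() {
    project=$1
    job=$2
    status=0
    # Capture the exit status without tripping "set -e" in a calling script.
    "$DSJOB" -run -jobstatus "$project" "$job" || status=$?
    case $status in
        1|2) echo "$job finished (status $status)"; return 0 ;;
        *)   echo "$job FAILED (status $status)" >&2; return 1 ;;
    esac
}

# Hypothetical usage at the end of the script:
#   run_job MyProject LoadCustomers
# Hypothetical crontab entry to run it at 02:00 every day:
#   0 2 * * * /path/to/run_job.sh >> /path/to/run_job.log 2>&1
```

Turning the job status into a plain 0 or 1 return code makes the wrapper easier to chain with other commands in the same script.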
dsxuserrio
Participant
Posts: 82
Joined: Thu Dec 02, 2004 10:27 pm
Location: INDIA

Post by dsxuserrio »

If you are learning UNIX and DataStage at the same time, don't mix them during your learning phase.
dsxuserrio

Kannan.N
Bangalore,INDIA
Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Thanks

Post by Bilwakunj »

Thanks for your help, guys.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

DataStage can run without you invoking any shell scripts yourself. It does create its own internal shell scripts (particularly in EE; take a look at the code produced in the project folder). However, if you try to edit those files, Bad Things [tm] will happen.

There are situations where a shell script is the best tool. But as a DataStage beginner, you will not usually encounter them. In fact, you should actively avoid designs that require one.

If your requirements demand a script, re-examine them and determine whether it is REALLY required in the form being asked for. Drill down to the actual results that are needed. Examine whether those results are really vital, and then build from there.
Bilwakunj
Participant
Posts: 59
Joined: Fri Sep 10, 2004 7:00 am

Post by Bilwakunj »

Thanks a lot for this valuable information. But now I'm very curious to know why people say that if you are working on DataStage you should have a very sound knowledge of UNIX. If, say, I'm going to use a job sequencer to schedule my jobs, then I guess UNIX will not come into the picture at all. Please correct me if I'm wrong.
Also, is it the case that if you are working on server jobs you need a lot of UNIX, and on PX you hardly need it?
And does that also mean that, apart from 'cron', UNIX scripting hardly comes into the picture?

kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

You can't work on a Unix system and not understand (and love) Unix. You must learn that a Unix server is NOT just a big PC. You must know how to monitor system load and see how things are running. The first thing I say to a young developer who comes to tell me their job/process/script is running slow today is "Have you checked the server load to see if something is hogging the resources?"

When you work on a large multi-CPU system (SMP, MPP, cluster, etc.), you must know a lot about what's going on inside the hardware. You will write shell scripts all of the time, simply to move, manipulate and manage files. You will write scripts just to archive, compress or clear out directories on a maintenance basis. It's not all point-and-click.
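To give a flavour of that kind of maintenance script, here is a small housekeeping sketch that compresses old files and then purges old archives. The function name tidy_dir, the *.log pattern, the example path and both age thresholds are hypothetical; adapt them to your own directories and retention rules.

```shell
#!/bin/sh
# Housekeeping sketch: compress old log files, then purge old archives.
# The *.log pattern and the age thresholds are assumptions for illustration.

tidy_dir() {
    dir=$1
    compress_after_days=$2
    delete_after_days=$3

    # Compress plain log files last modified more than N days ago.
    find "$dir" -type f -name '*.log' -mtime +"$compress_after_days" \
        -exec gzip -f {} \;

    # Remove compressed logs last modified more than M days ago.
    # gzip keeps the original timestamp, so age is counted from the
    # file's last modification, not from when it was compressed.
    find "$dir" -type f -name '*.log.gz' -mtime +"$delete_after_days" \
        -exec rm -f {} \;
}

# Example call (hypothetical path): compress after 7 days, delete after 30.
#   tidy_dir /data/project/logs 7 30
```

A script like this is a natural candidate for a nightly cron entry. Before pointing it at real data, it is worth checking the server load with a command such as uptime and doing a dry run with -print in place of the -exec actions.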
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

... or you can have your UNIX administrator explain it to you. Forcefully!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vbeeram
Participant
Posts: 63
Joined: Fri Apr 09, 2004 9:40 pm

Script

Post by vbeeram »

You don't need UNIX while learning DataStage; you can use scripts just to run and schedule jobs.

Also, some useful commands for string manipulation are grep, sed and awk.
T42
Participant
Posts: 499
Joined: Thu Nov 11, 2004 6:45 pm

Post by T42 »

With DataStage Enterprise Edition, you are now responsible for one of the most powerful tools created by people who originally invented the Thinking Machines. That idea failed due to lack of software, so they went into software.

Ascential is now trying to reduce the amount of work you need to be able to do in order to achieve the maximum performance, but...

It's impossible, really.

Computers are not THAT smart. Building a program to make it smart will make it slower than it should be. DataStage EE is designed to be FAST.

So in order to make sure DataStage goes very, very fast, it is prudent to understand the system underneath the engine. Would it be prudent to spend $100k of development time to squeeze out that last 5% of performance, or $50k on a new disk drive system that boosts performance by 50% through faster disk access alone? Would an extra $25k spent tuning configuration files be prudent for another 25% performance boost?

You can only answer those questions when you actually understand the system you're dealing with.

That is precisely why highly experienced consultants are much sought after, even by Ascential themselves. The more you know, the more valuable you are in real dollars.

Do you want to be a well respected developer? Learn the system. DataStage EE does not REQUIRE you to really know the system for the most part, but... to expose its power... you gotta know what kind of powers you have available.