How to compile a DS JOB from Linux command line

A forum for discussing DataStage<sup>®</sup> basics. If you're not sure where your question goes, start here.

Moderators: chulett, rschirm, roy

kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

How to compile a DS JOB from Linux command line

Post by kurics40 »

Hello All,

How can I compile a job from Linux command line?

Thanks,
Janos
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Job compilation is normally initiated on the client tier (Windows) where Designer is installed. There is a command line for the client tier: dscc.exe. It's documented in the Designer Client Guide.
Choose a job you love, and you will never have to work a day in your life. - Confucius
kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

Post by kurics40 »

You are right. It works where the client is installed, with the command:
dscc.exe /H hostname /U username /P password project_name /J jobname
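If you end up scripting this, the loop around it is simple. Here is a dry-run sketch that only prints the commands (shown in POSIX sh for illustration, even though dscc.exe itself runs on the Windows client tier; the host, user, and project names are placeholders and the password is masked):

```shell
#!/bin/sh
# Dry-run sketch: print one dscc.exe compile command per job name.
# dscc.exe runs on the Windows client tier; HOST, USER, and PROJECT
# are placeholder values, and the password is deliberately masked.

HOST="myhost"
USER="dsuser"
PROJECT="myproject"

emit_compile_cmd() {
    # Build the client-side compile command for job "$1".
    printf 'dscc.exe /H %s /U %s /P ******** %s /J %s\n' \
        "$HOST" "$USER" "$PROJECT" "$1"
}

# One line per job that needs recompiling.
for job in before_log after_log; do
    emit_compile_cmd "$job"
done
```

On the client you would drop the masking and run the emitted commands instead of printing them.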

My problem is that multiple instances increase the workload on the server. If I recompile the job, it helps for around 2-3 days.

That is why I am thinking about automating the "cure".
I hope there is a way to do it on Linux.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Compilation must be initiated from Designer on the client tier (Windows only). I wouldn't mind being proven wrong. :P

In any case, manual or automated recompiling is not the correct approach.

Perhaps your job logs are growing too large, too fast, and you need to enable good auto-purge settings via Director.

You may also need to prevent warnings in the job log, especially if your jobs handle large volumes and you're logging one or more warnings per row.
Choose a job you love, and you will never have to work a day in your life. - Confucius
kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

Post by kurics40 »

I think I've found a way. The BASIC language can do it, so I need to write a routine for that.

INCLUDE or INSERT in DataStage BASIC. Let's read the manual.
However, I've read that it should be avoided because the precompile is done on the client tier... I will keep that in mind when I test it.

It sounds a bit unusual, a kind of vulnerability/dependency on a perfectly installed client tier.

What if I want to recompile a job because of some urgency and I can only access the server?

I am not sure that it uses anything from the client tier (at least not all the time).
A C++ compiler is installed on the server side, which is not needed for server jobs or for routines written in BASIC.

Let's have a try and I will update you! :wink:
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Yes, have a try... Here are some pointers.
kurics40 wrote:I think I've found a way. The BASIC language can do it, so I need to write a routine for that.
Your original post had Job Type: parallel, so while you can compile BASIC code on the server itself, that's not going to buy you much of anything for parallel jobs.
kurics40 wrote:It sounds a bit unusual, a kind of vulnerability/dependency on a perfectly installed client tier.
Pre-compilation on the client is not a vulnerability. You can consider the client tier a dependency for development and compilation in development environments. After all, it is known as "client/server" software.
kurics40 wrote:What if I want to recompile a job because of some urgency and I can only access the server?
If you can only access the server, then you've got more urgent issues than recompiling a job! If this really is a concern for you, such as if your clients and servers are on opposite sides of the world across a slow WAN, then you need to physically co-locate some clients in the data center with the servers.

Again, it sounds like there may be some unusual problem in your environment, and recompiling is not the correct approach to solve that problem. The purpose of recompiling is to update your executable with any changes made to the job design.

Have you heard of Director? I have to ask because there are people who simply don't know about it. Instead of resetting an aborted job, they think it must be recompiled. Unfortunately, I have seen this situation with many people.
Choose a job you love, and you will never have to work a day in your life. - Confucius
kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

Post by kurics40 »

Yes, I know Director. It is okay.

I absolutely agree with you that needing a solution like the one I'm trying to achieve is what's unusual.

In my DS environment, sequence jobs are logged before and after the main job. Each sequence job calls a parallel job to insert the execution time and other metadata into a database, and again after execution. (I inherited the environment like this from the beginning.)

These steps are clonable; I mean that with a multiple-instance job you need only one job, which you can call with different invocation IDs and different parameters. These logging jobs are invoked as many times as we have sequence jobs (before_log, after_log, etc.), more than 10,000 times. In my opinion their task is no more than printing "Hello world!" with different parameters.

I monitor the server (Linux OS) and realized that these "soft" multiple instances take the most resources, especially CPU (95% or higher), not the data loads or transforms.

The habit, or temporary solution, of the local team is to recompile those logger jobs to get rid of this phenomenon. It solves the problem for 1-2 days.

I asked my ex-colleagues for a hint. They said they had seen this problem many times and had no idea.

If somebody has a good workaround for this, rather than writing an automatic recompile method, I'd say hallelujah.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

kurics40 wrote:In my DS environment, sequence jobs are logged before and after the main job. Each sequence job calls a parallel job to insert the execution time and other metadata into a database, and again after execution. (I inherited the environment like this from the beginning.)
I understand you inherited this but a parallel job that just inserts metrics in a database before and after everything? What a waste when a Server job would do it far more efficiently.
kurics40 lastly wrote:The habit, or temporary solution, of the local team is to recompile those logger jobs to get rid of this phenomenon. It solves the problem for 1-2 days.
So the whole point of this exercise is to remove all of the MI invocations from the Director listing?
-craig

"You can never have too many knives" -- Logan Nine Fingers
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Like Craig said, you can stop the bleeding by switching said metrics jobs from parallel to server, reducing overhead, job startup time per call, and thereby reducing a lot of waste.
Choose a job you love, and you will never have to work a day in your life. - Confucius
kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

Post by kurics40 »

qt_ky wrote:Like Craig said, you can stop the bleeding by switching said metrics jobs from parallel to server, reducing overhead, job startup time per call, and thereby reducing a lot of waste.
Ahh! Thanks a lot! I couldn't read his answer because I haven't updated my premium status yet. Anyway, I will, since the workaround is getting quite interesting.
kurics40
Premium Member
Posts: 61
Joined: Wed Nov 18, 2009 10:01 am

Post by kurics40 »

I have an idea for a workaround to decrease the overhead of the log insert: a server routine which invokes an external SQL command line.

What if I call this routine from a sequence job?
Would it need the same initialization time as a parallel job does for a simple one-row insert?

Code: Select all

$INCLUDE DSINCLUDE JOBCONTROL.H

**************************************************
* Name: db2sql
* Type: Transform Routine
*
* Author: J. Kurics
* Created: 2012-03-22
*
* Arguments:
*       DSN      : (db2 database)
*       USER     : (db2 User)
*       PASSWORD : (db2 Password)
*       SQL      : (SQL statement to run)
* 
* Description:
*
* Version:
*          0.1 - J. Kurics
*
*
**************************************************

      RoutineName = "db2sql"

      nAns = -1

      GET(ARG.)DSN 
      GET(ARG.)USER
      GET(ARG.)PASSWORD 
      GET(ARG.)SQL

      IF TRIM(DSN) = "" THEN STOP "Missing DSN" 
      IF TRIM(USER) = "" THEN STOP "Missing User ID" 
      IF TRIM(PASSWORD ) = "" THEN STOP "Missing User Password" 
      IF TRIM(SQL) = "" THEN STOP "Missing SQL" 


      * replace each " with \" because of echo shell command behaviour
      SQLToRun = Change(SQL, '"', '\"')

      If System(91) = 0 Then
         Shell = 'UNIX'
         Sep = '/'
         OtherSep = '\'
         LineSeparator = CHAR(10)
      End Else
         Shell = 'DOS'
         Sep = '\'
         OtherSep = '/'
         LineSeparator = CHAR(13):CHAR(10)
      End

      *SQLToRun = Change(SQLToRun, @FM, LineSeparator)

      	  
      cCommand = "db2 connect to " : DSN : " user " : USER : " using " : PASSWORD : " " : LineSeparator 	  
      cCommand := " echo " : '"' : LineSeparator
      cCommand := SQLToRun : LineSeparator
      cCommand := "" : LineSeparator
      cCommand := '"' : "| db2"

 
      Call DSLogInfo("Start SQL: ": SQLToRun, RoutineName)
      Call DSExecute(Shell, cCommand, cOutput, nSystemReturnCode)
      Call DSLogInfo(cOutput, RoutineName)

      If nSystemReturnCode = 0 Then
         Call DSLogInfo("Command finished successfully.", RoutineName)
         nAns = 0
      End Else
         Call DSLogWarn("Command finished with error code: " : nSystemReturnCode, RoutineName)
         nAns = Abs(nSystemReturnCode) * -1
      End

Ans = nAns
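For reference, the command string the routine assembles is equivalent to this shell pattern (a sketch assuming the db2 command-line processor is on PATH; the function name is my own):

```shell
#!/bin/sh
# Sketch of the shell pattern the routine's cCommand builds: connect
# once in the current CLP session, then pipe the statement into db2.
# Assumes the DB2 command-line processor (db2) is on PATH.

run_db2_sql() {
    dsn="$1"; user="$2"; password="$3"; sql="$4"
    db2 connect to "$dsn" user "$user" using "$password" || return 1
    # printf avoids the echo quoting games the routine has to play.
    printf '%s\n' "$sql" | db2 || return 2
    db2 connect reset
}
```

Because each db2 invocation in the same shell session talks to the same CLP back-end process, the connection opened by the first call is reused by the piped statement.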
aartlett
Charter Member
Posts: 152
Joined: Fri Apr 23, 2004 6:44 pm
Location: Australia

Post by aartlett »

Metric calculations and logging need to be done, but do they need to be done NOW? We write logging request data to a flat file and calculate it later. We prime the records with a server job to do the insert and update (start and end) in the sequence, but not the row calcs etc.; they are done later from the file info.

I wouldn't even use a server job, except we are using SQL Server and we have DS on a Sun box, so I can't use the command line like I used to for Oracle and DB2 on other gigs.

A simple, callable UNIX shell script that you pass parameters to is very quick. So is the server job; it's just the overheads that kill you. Not as bad as using a PX job for it, though, especially if you haven't downgraded the APT config for the job.
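A minimal sketch of that kind of callable logger script (the file path and field layout are my own assumptions):

```shell
#!/bin/sh
# Minimal event logger in the spirit described above: append one
# pipe-delimited record per call to a flat file, then load the file
# into the database later in a batch. LOG_FILE and the field layout
# are illustrative assumptions, not a known site convention.

LOG_FILE="${LOG_FILE:-/tmp/ds_run_log.txt}"

log_event() {
    # $1 = job name, $2 = invocation id, $3 = event (START or END)
    printf '%s|%s|%s|%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" >> "$LOG_FILE"
}
```

Each call is just an append to a local file, so the per-event cost is negligible compared with a parallel job startup.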
Andrew

Think outside the Datastage you work in.

There is no True Way, but there are true ways.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

qt_ky wrote:Compilation must be initiated from Designer on the client tier (Windows only). I wouldn't mind being proven wrong. :P
FYI:

Server side command line compile has been requested (future release):

JR42097: REQUEST THE ABILITY TO COMPILE DATASTAGE JOBS ON THE SERVER SIDE(NON-WINDOWS ENVIRONMENT)

http://www-01.ibm.com/support/docview.w ... wg1JR42097
Choose a job you love, and you will never have to work a day in your life. - Confucius