How to measure job complexity?

kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I got an email today asking this question, and I have been thinking about it a lot lately. One of the reports that comes out of DwNav is Design Stats, which counts jobs, links and columns by category. The assumption is that one developer is probably working on each category. It would be nice to compare these counts to last month's and see how many things got added in a month. That gives some idea of how many new objects were created. You need to weight these objects, say jobs 1000, links 300 and columns 50, or something like that. You might be able to weight them by the time it takes to create a job, a link and a column. User-defined SQL may add another level of complexity. If a job has 10 links then it may double the complexity; 20 may quadruple it. The more complex something is, the more time it takes, but not at the same rate. If 10 objects take a certain time to build, then 100 may take 20 times longer instead of 10 times. It is not one for one.
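A minimal sketch of the weighting idea, assuming the rough weights suggested above (jobs 1000, links 300, columns 50) and made-up category counts. The names `design_score` and `monthly_growth` and all sample numbers are hypothetical; none of this is DwNav output. It scores each category and reports how much score was added since the previous month.

```python
# Hypothetical weights, taken from the "jobs 1000, links 300, columns 50" suggestion.
WEIGHTS = {"jobs": 1000, "links": 300, "columns": 50}


def design_score(counts):
    """Weighted sum of the object counts for one category."""
    return sum(weight * counts.get(kind, 0) for kind, weight in WEIGHTS.items())


def monthly_growth(last_month, this_month):
    """Score added per category between two monthly snapshots.

    A category that did not exist last month counts as starting from zero.
    """
    return {
        category: design_score(counts) - design_score(last_month.get(category, {}))
        for category, counts in this_month.items()
    }


# Made-up snapshots: category -> object counts.
last_month = {"Sales": {"jobs": 10, "links": 40, "columns": 300}}
this_month = {
    "Sales": {"jobs": 12, "links": 50, "columns": 380},
    "Finance": {"jobs": 3, "links": 9, "columns": 60},
}

growth = monthly_growth(last_month, this_month)
print(growth)
```

A flat weighted sum like this ignores the non-linear effect of many links in one job, so it is only a starting point for comparing months.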

I would think that a developer could consistently develop the same number of jobs or objects or whatever you want to call them. The idea is that as these tools do more and more of the work, we need to know the number of steps it took to build this thing, whether you call it a job or whatever. They used to count the number of lines of code produced. Now it is more and more obscure, but there is still a task associated with each step.

Tools like DataStage shield you from the underlying complexity of connecting to a database and doing an insert, update or bulk load, but you still need to understand these concepts and what is actually happening. The idea used to be that these tools would be the great equalizer, that a bad developer would produce work at the same rate as a more intelligent one. But the opposite is happening. These tools let the developer sit at a conceptual level and not worry about the details. Therefore the more complex you can think, the faster you can get your data from the source to the target, mentally and physically.

I thought about writing a book about this. I think it would be cool to teach people how to measure work and therefore separate the good developers from the bad. This may ignore quality. Some guys' work always breaks while others' never breaks. I have noticed that the level of quality is measured differently in different companies. Say you are a lead with 3 or 4 developers. Say you are dealing with financial data; then accuracy is more important than speed. Say you are Google and you measure clicks on ads for your customers. Then you need to process millions of rows. Can you tell who in the household is clicking by what they click, and change the ads while they are still logged in? A child may click on Disney where an adult might click on Ford. This may double or triple their revenue. How do they know? They build a data warehouse with DataStage PX. The first company's jobs may run for days but could still save them millions because month-old data is fine. Their sales cycle is annual. Google's is instant by instant.

How do you measure your work? How do you know you are faster or slower than the developer next to you? I had a guy ask me why I got paid more than he did. I asked him why he wanted to know. He said we both did the same thing, "DataStage". I just said "Oh really".

I just automated a process which built 80 jobs because they were straight table copies. It took me 6 hours to write and it ran in 40 seconds. How many of you have written 80 jobs in 6 hours? I did a similar thing at my last job. We snowflaked a dimension to speed up the ETL and MicroStrategy. One dimension became 12. I had the transformation rules in a table used to evaluate metadata. We had 4 jobs per table load, including a sequence. I wrote a process to generate all 4 jobs in about 4 days. 3 out of the 4 jobs compile straight away. The fourth takes a few minutes to fix the constraints. It generated all 48 jobs in a few minutes.

I was talking to a guy about the HTML documentation that we generate. It took him 3 days to document one job. The documentation in DwNav and in JobReport was better and could document the whole project in minutes or seconds. They were told not to use these tools because they wanted to bill the hours. I said that if the customer finds out then they should be upset. It cost the customer a lot of money for a poor-quality document, but they never knew how to measure the work being done. Most work is measured from a gut feeling. It feels like Kim does good work but I am not sure. It seems like Kim is organized but I am not sure. This is how most businesses run today. They have no idea if I am better or worse than the guy next to me. He seems to get more work done.

Hopefully the whole team is more productive when I am around. That is my goal. If I never explain how or why I do things the way I do, then that is okay. I know the two Craigs from Denver do it the same way. Others get it. Besides, my friends respect me. That is all I need to motivate me.

If you think outside the box and most people are in the box then you are by definition weird.
Mamu Kim
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Complexity serks. :lol:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
kduke

Post by kduke »

Tell them Ray man.
Mamu Kim
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Come along little brother, we need to get you back into your room where it's safe and warm and the bad mens won't get you...
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke

Post by kduke »

Craig, you talking to me or Ray?
Mamu Kim
chulett

Post by chulett »

Ray Man, the Little Brother from Down Under. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke

Post by kduke »

Rainman was the older brother.
Mamu Kim
chulett

Post by chulett »

I know... but it trips off the tongue a little nicer the other way 'round.
-craig

"You can never have too many knives" -- Logan Nine Fingers
kduke

Post by kduke »

:lol:
Mamu Kim
kduke

Post by kduke »

Just to illustrate my point: as the number of links in a job goes up, the complexity goes up disproportionately.

Code:

C  +                         *
o  +                         *
m  +                         *
p  +                        *
l  +                        *
e  +                       *
x  +                      *
i  +                     *
t  +                   *
y  +                *
   +         *
   +  *
   ++++++++++++++++++++++++++++++++
   0              50             100
   
   Number of Links in a job

The same is true for the man-hours needed to build these jobs. At some point a job cannot be built. It is too complex.

Code:

M  +                         *
a  +                         *
n  +                         *
   +                        *
H  +                        *
o  +                       *
u  +                      *
r  +                     *
s  +                   *
   +                *
   +         *
   +  *
   ++++++++++++++++++++++++++++++++
   0              50             100
   
   Complexity

The problem is that some people equate complexity with intelligence. Let us say that 100 links is the limit, meaning we cannot finish building and testing a job with 100 links. A wise developer may break this up into 10 jobs with 10 links each, or 5 jobs with 20 links each. According to my theory, 10 jobs with 10 links each may take 10 days to build, while 5 jobs with 20 links each may take 20 days.
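The two figures in this paragraph (10 jobs of 10 links in 10 days, 5 jobs of 20 links in 20 days) happen to fit a model where build time per job grows with the square of its link count. The sketch below assumes exactly that; the exponent of 2 and the function names are fitted to those two data points for illustration and are not a measured law.

```python
def build_days(links, base_links=10, base_days=1.0, exponent=2.0):
    """Estimated days to build one job with the given number of links.

    Assumption: a job with `base_links` links takes `base_days`, and effort
    scales as (links / base_links) ** exponent.
    """
    return base_days * (links / base_links) ** exponent


def total_days(job_count, links_per_job):
    """Estimated days to build `job_count` jobs of equal size."""
    return job_count * build_days(links_per_job)


# The same 100 links, split three different ways.
for job_count, links_per_job in [(10, 10), (5, 20), (1, 100)]:
    days = total_days(job_count, links_per_job)
    print(f"{job_count} job(s) x {links_per_job} links: {days:.0f} days")
```

Under this assumed curve a single 100-link job comes out at 100 days, which is one way to read "at some point a job cannot be built".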

Which of these developers has the higher IQ? Which of them got the most work done in the least amount of time?
Mamu Kim
kduke

Post by kduke »

I believe true brilliance is to simplify a complex problem so anyone can understand it. Fake intelligence is to make something artificially complex. Some consultants make processes so complex that customers are afraid to get rid of them. Fear and manipulation are powerful tools used by consultants and management.

Companies which use fear to control employees are usually burned by these consultants, because you reap what you sow, or call it karma. Most religions have a phrase to describe this type of justice.

I have a question. Why do so few share solutions on this forum? Very few post code or job designs. What are you afraid of? I spent lots of hours on EtlStats, DwNav, DsWebMon, GenHtml and more. Most of these are free or close to it. Some are not as polished as others. I wish the code was cleaner or more polished but this is all I had time for. Nobody has complained though. Close enough to be useful. I hope you all share more code or dsx files. We will all benefit from your generous attitude.

Many thanks to Dennis, Ray, Craig, Ken, Chuck and others for sharing their thoughts, ideas and code.
Mamu Kim
WoMaWil
Participant
Posts: 482
Joined: Thu Mar 13, 2003 7:17 am
Location: Amsterdam

Post by WoMaWil »

Hi Kim,

What you are about to start will not be easy. You are right: the more stages/links/rows ... a job has, the more complicated it is. But should it be?

I have often had "complicated" jobs under my fingers (30 stages, 50 links ...) that took 5 minutes to run. I changed one to a 3-stage, 2-link job and got the same result in 30 seconds. So which of us deserves the prize for the best code?

The problem with providing code is that in most cases you solve a specific problem, and to demonstrate it you need to make it more general. I think it is then a problem of time, not of willingness.

Kind regards
Wolfgang
ray.wurlod

Post by ray.wurlod »

How about eight levels of nested containers, which I once encountered? :shock:

It's more than complexity, though. We need to address maintainability (of course, since DWs are around for a long, long time) and we also need to address the question of what is optimal design. And that's a bit of a moving target; the optimal solution in version 7 is radically different from an optimal solution in version 4.

One of my key indicators of success in DS design is that a non-DataStage person (maybe a DBA, maybe an IT manager, perhaps even a project manager) can understand the "big picture" by visual inspection of the design alone. You can guess that I use annotations on the canvas more than most.

Interestingly, local or shared containers can be a great help in this (but with a meaningful name, not "MagicHappens"!).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
lshort
Premium Member
Posts: 139
Joined: Tue Oct 29, 2002 11:40 am
Location: Toronto

Post by lshort »

I have to agree with the simplicity argument. I have seen many jobs with unnecessary stages, links and processes. The same result could be achieved with much less.

I've always thought that the most difficult thing to do, and perhaps the most valuable, is to take something inherently complex and make it simple.

I try to create all my jobs in such a way that, as Ray suggested, even an outsider can tell you upon visual inspection, at least conceptually, what they do.

I am reminded of something I heard a long time ago that has always stuck with me. In the movie "Philadelphia", Denzel Washington's character is cross-examining an expert witness and says....

"Explain it to me as if I was in fourth grade"

Words to live by. :D
Lance Short
"infinite diversity in infinite combinations"
***
"The absence of evidence is not evidence of absence."
kduke

Post by kduke »

Good. I think everybody got my point that more complex is not necessarily better or more intelligent, but there is something to be said for measuring work done by counting objects or tasks done. If you use a weighted average, where jobs take more time to create than links, and links more time than columns, because there is a hierarchy there, then you may have a way to measure actual ETL work done. If you can measure it then you can predict the time involved to do something. Measuring work is also a way to rate developers. All of this has to be weighed against quality and simplicity or ease of maintenance. Eight levels of containers is a measure of insanity.
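One way to picture "if you can measure it then you can predict" is to calibrate hours per weighted point from finished work and apply that rate to new work. This is only a sketch under stated assumptions: the point totals, the hours and the function names are all invented for illustration, and it says nothing about quality or maintainability.

```python
def hours_per_point(history):
    """Average hours spent per weighted point across finished pieces of work.

    `history` is a list of (weighted_points, actual_hours) pairs.
    """
    total_points = sum(points for points, _ in history)
    total_hours = sum(hours for _, hours in history)
    if total_points == 0:
        raise ValueError("history contains no weighted points")
    return total_hours / total_points


def predict_hours(points, rate):
    """Predicted hours for new work of the given weighted size."""
    return points * rate


# Invented history: (weighted points delivered, hours actually spent).
history = [(40000, 80), (10000, 20)]
rate = hours_per_point(history)

estimate = predict_hours(25000, rate)
print(f"rate = {rate} hours/point, estimate = {estimate:.1f} hours")
```

The same rate computed per developer would give the kind of comparison discussed here, with the caveat that it rewards volume unless quality is weighed in separately.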

If you can truly measure quantity and quality then you are on the way to measuring IQ. The true IQ should measure the amount of work done. Maybe WQ would be a better term. They have EQ now.

Is it possible to measure work? I am asking. If you can measure ETL work then why not Java or COBOL? Maybe even construction work. If you can isolate the tasks into some kind of weighted factor or object then it is measurable.

Can you outwork your neighbor? Can you prove it? Are your designs simpler and therefore more elegant, or are they overly complex? How do you know?

All of us have an opinion about the quantity and the quality of our work. How do you rank? Ray is the leader of this forum. Can he really outwork us all too? I don't know. I am not sure I want to find out.
Mamu Kim