Generate variable output records per input record

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
gjacob
Charter Member
Charter Member
Posts: 17
Joined: Fri Apr 23, 2004 12:51 pm

Generate variable output records per input record

Post by gjacob »

Is there any good examples of using basic to create records? I am totally lost.

My problem is to create multiple records based on the input record. Assume there are two columns, start date and end date. I need to create a row with the month and days used.

Code: Select all

INPUT
ref   start      end
---   --------   ---------
123   1/1/2003   3/10/2003
555   2/7/2003   2/18/2003

OUTPUT
ref   month    used
---   ------   ----
123   200301   31
123   200302   28
123   200303   10
555   200302   12
I can handle the logic, and this would only take a few seconds to do in SAS. I just can't figure out how to do this in DataStage.

Thanks!
Greg
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

Why do you want to use basic to do this? Design a server job and use a SEQuential stage to read your INPUT and pass the data through a transformer to the OUTPUT SEQ stage. Within the transformer you can do whatever you want with the data. You would probably end up using Iconv/Oconv and Trimb.

but, if you insist on Basic, you would do something like (very basic):

OpenSeq blah_blah.txt to blah_blah
ReadSeq blah blah
detail
WriteSeq blah blah
CloseSeq blah_blah

Inbetween you would define the layout and still do Iconv/Oconv. This is very much an outline of what would have to be done. the devil is in the detail - like the layout, a few if then elses (like checking if file exists). The Help explains the Open/Read/Write. It should put you on the right track. good luck
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
kduke
Charter Member
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX
Contact:

Post by kduke »

Greg

I think you are correct, this is a good solution for BASIC. How would you do this in SAS?

Another solution is to create a table of days or use a day dimension table. Join this input table to the new day table and count how many records you get by month.
Mamu Kim
denzilsyb
Participant
Posts: 186
Joined: Mon Sep 22, 2003 7:38 am
Location: South Africa
Contact:

Post by denzilsyb »

kduke wrote: Another solution is to create a table of days or use a day dimension table. Join this input table to the new day table and count how many records you get by month.
thats a good idea. i just looked at the output again - its not that simple for a server job design! phew!
dnzl
"what the thinker thinks, the prover proves" - Robert Anton Wilson
gjacob
Charter Member
Charter Member
Posts: 17
Joined: Fri Apr 23, 2004 12:51 pm

Post by gjacob »

Thanks for the suggestions. I was able to get it to work.

I wrote a routine that would write to a hash file created/cleared beforehand. I had some trouble getting the routine calls to work, but the forum helped to explain the catalog.

Code: Select all

Deffun Year(date) Calling "DSU.Year"
Deffun Month(date) Calling "DSU.Month"
Deffun Day(date) Calling "DSU.Day"
Deffun DaysInMonth(date) Calling "DSU.DaysInMonth"

OpenPath "/dsdata/hash/junk/GregHashFile" to GregHash Else
   Stop
End

NextDate = StartDate
RowID = 0
Row = ""

Loop While (EndDate >= NextDate) Do
   If (Month(EndDate) <> Month(NextDate)) Then
      DayCount = DaysInMonth(NextDate) - Day(NextDate) + 1
   End Else
      DayCount = Day(EndDate) - Day(NextDate) + 1
   End

   RowID += 1
   Row<1> = RefId
   Row<2> = Year(NextDate) : Fmt(Month(NextDate), "R%2")
   Row<3> = DayCount

   Write Row On GregHash, ((RefId * 1000) + RowID) Else
      Print "Unable to write"
   End

   Ans = RowID

   NextDate = NextDate + DayCount
Repeat

Close GregHash
As you can see, I calculate the days used in the month, and if necessary, increment the start date to the next month and repeat the process.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

NEVER use a STOP statement in a Routine.

DataStage routines must always return, so that the "cleanup" can be managed gracefully even though something has gone wrong.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bchau
Charter Member
Charter Member
Posts: 46
Joined: Tue Jan 18, 2005 7:39 am
Location: Japan
Contact:

Post by bchau »

Is a subroutine the only way to go for generating variable output records per input record?

Can this be done using stages only? I have a similar problem as the original poster. I have two columns:

Code: Select all

Company      Count
ABC          100
DDD          50
So I would like to make 100 records of ABC and 50 of DDD. The Count is variable.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's actually quite a different problem, but solved fairly easily using an intermediate text file and generating the same number of line terminators as there are in your count column.
You need to be careful to specify no formatting in the Inputs tab for the Sequential File stage; when you read from the text file, the terminator characters that you wrote will become extra lines.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bchau
Charter Member
Charter Member
Posts: 46
Joined: Tue Jan 18, 2005 7:39 am
Location: Japan
Contact:

Post by bchau »

ray.wurlod wrote:intermediate text file and generating the same number of line terminators as there are in your count column.
Can I use a transformer to read the two fields and write to 1 field with the line terminators? I'm stuck on how to write the number of terminators based on the count field. I don't recall any for loops can be done in the transformer.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes, you use a Transformer, but almost certainly use a Routine to generate the single output column based on the input columns, which are arguments to the routines.

If you just wanted empty rows you could use Str(Stuff:Char(10),Counter) but for anything much more complex a Routine is indicated. The return value from that routine becomes the derived value for the output column.
Last edited by ray.wurlod on Wed Apr 06, 2005 12:03 am, edited 2 times in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Code: Select all

FUNCTION GenerateOutputLine(Arg1, Arg2, Arg3, Arg4, Counter)
* Generates as many tab-delimited output lines as Counter indicates

EQUATE TAB To Char(9)
If System(91)
Then
   Terminator = Char(10):Char(13)  ; * DOS
End
Else
   Terminator = Char(10)  ; * UNIX
End

If Num(Counter)
Then

   Ans = ""
   For LineNo = 1 To Counter
      Line = Arg1 : TAB : Arg2 : TAB : Arg3 : TAB : Arg4 : Terminator
      Ans := Line
   Next LineNo

End
Else

   Ans = @NULL
   Call DSLogWarn("Non-numeric Counter argument", "GenerateOutputLine")

End

RETURN(Ans)
Last edited by ray.wurlod on Wed Apr 06, 2005 12:05 am, edited 2 times in total.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
bchau
Charter Member
Charter Member
Posts: 46
Joined: Tue Jan 18, 2005 7:39 am
Location: Japan
Contact:

Post by bchau »

Oh well, I was hoping to do this in the transformer without calling a routine.

I'll try the str(char(13),Count) technique. Maybe I can somehow insert the data after I've created the empty rows in another stage.

Thanks again Ray.
bchau
Charter Member
Charter Member
Posts: 46
Joined: Tue Jan 18, 2005 7:39 am
Location: Japan
Contact:

Post by bchau »

Ok I got it with the str(inputlink_field:char(13):char(10),Count):inputlink_field.
No Subroutine call, this is great!

It works perfectly, thanks again Ray for the guidance.
Post Reply