Page 1 of 2

Data Manipulation in a Flat file

Posted: Wed Mar 08, 2006 12:28 pm
by narasimha
Is there a way we could convert a file without writing a Routine
From the way it looks

Code: Select all

222,333,4444		
235, 5644,453               
423,8576,764,353	
764,431,533 
to this

Code: Select all

222,333,4444$222
222,333,4444$333
222,333,4444$444
235,5644,453$235
235,5644,453$5644
235,5644,453$453
423,8576,764,353$423
423,8576,764,353$58576
423,8576,764,353$764
423,8576,764,353$353
764,431,533$764
764,431,533$431
764,431,533$533
The $ could be any delimiter other than ","

Re: Data Manipulation in a Flat file

Posted: Wed Mar 08, 2006 12:44 pm
by kwwilliams
Have you tried using a job to do it? Run your file through a transformer and have a transformation on the final column be col3:'|':col1.

Posted: Wed Mar 08, 2006 12:57 pm
by narasimha
Oops What is a job? :shock:
Have you tried using a job to do it?
If you have observed the input file it does not have 3 columns all the time,
Run your file through a transformer and have a transformation on the final column be col3:'|':col1.
I guess this wont work :roll:

Posted: Wed Mar 08, 2006 1:00 pm
by DSguru2B
what is the max number of columns you have? why i am asking is because, if i were you i would do as follows,
* have the input file going into a transformer
* have n number of links coming out of your transformer (n = num. of columns)
* inside the transformer apply your logic
eg: lets take your first row "colA = 222, colB = 333, colC=4444"
in the out put link(1st link), do this

Code: Select all

 colA = colA
              colB = colB
              colC = colC:"$":colA
in second link

Code: Select all

     colA = colA
              colB = colB
              colC = colC:"$":colB
and third link
[/code] colA = colA
colB = colB
colC = colC:"$":colC[/code]

*then have your output links collected by a link collector, into a seq. file.

thats one of the solutions, not the best i admit, but on of the ways you could go about achieving your goal
cheers.

Posted: Wed Mar 08, 2006 1:33 pm
by djm
An awk script is one way. e.g.

Code: Select all

awk -F, '
{
  for ( i = 1; ( i <= NF ); i++ )
  {
    print $0 "$" $i
  }
}' yourfile > new file
Testing the script

Code: Select all

cat |
awk -F, '
{
  for ( i = 1; ( i <= NF ); i++ )
  {
    print $0 "$" $i;
  }
}' << EOD
222,333,4444      
235, 5644,453
423,8576,764,353   
764,431,533
EOD
yields

Code: Select all

222,333,4444$222
222,333,4444$333
222,333,4444$4444
235, 5644,453$235
235, 5644,453$ 5644
235, 5644,453$453
423,8576,764,353$423
423,8576,764,353$8576
423,8576,764,353$764
423,8576,764,353$353
764,431,533$764
764,431,533$431
764,431,533$533
Hope this helps.

David

(previously known as "D" - thanks WebMaster)

Posted: Wed Mar 08, 2006 2:15 pm
by narasimha
Kool 8)
Again David it works like magic
I could make this into a script with accepts my file as a parameter and give me an output file.
But just curious to know a way to do this in wholly in DS

Posted: Wed Mar 08, 2006 4:05 pm
by kwwilliams
narasimha wrote:Oops What is a job? :shock:
Have you tried using a job to do it?
If you have observed the input file it does not have 3 columns all the time,
Run your file through a transformer and have a transformation on the final column be col3:'|':col1.
I guess this wont work :roll:
Sarchasm and insults will generate little future help :evil:

The amount of skill you have was unknown to me and the amount of details that you provide aid in the answer you will receive. A list of columns without stating that the number of columns is dynamic expects me to take the time to expect your data set instead of looking at the first three lines and making an assumption on what the rest of your data looks like.

Try to keep your posts polite. This is a resource that most of us do not get paid for and I really don't appreciate the tone of your post in response to my free but incorrect reply.

Posted: Wed Mar 08, 2006 4:14 pm
by narasimha
Keith
I am sorry if I have offended you
I thought your quote
Have you tried using a job to do it?
was a little sarcastic

PEACE :oops:

Posted: Wed Mar 08, 2006 4:20 pm
by kwwilliams
Wow that was a fast response. I appreciate and accept the apology.

Written form like this is really hard to interpret. I glanced at your post and quickly replied, so the brevity of the post may have led you to believe that it was sarchastic.

Posted: Wed Mar 08, 2006 4:34 pm
by kcbland
Use a Sequential stage with 1 column defined to read the entire row as a single column and then use a combination of FIELD and concatenation statements to assemble the changed row to output to another Sequential file again as a single column.

Posted: Wed Mar 08, 2006 6:24 pm
by narasimha
Ken,

Can we take this approach if we have dynamic number of values coming in? If yes can you please elaborate.

Posted: Wed Mar 08, 2006 7:38 pm
by kcbland
The whole row is a single text string value, there's only one column defined. Simple design.

Posted: Wed Mar 08, 2006 9:34 pm
by I_Server_Whale
As Ken said, you should grab(or more appropriately 'read') all the columns of the source as one column.

But it helps if you know the maximum no. of columns that the source can have. Calculate (add-up) the sizes of these columns. Now this result will be the length of the ONE column that you'll be using.

thanks,
Naveen.

Posted: Thu Mar 09, 2006 12:07 am
by ameyvaidya
The "DCOUNT" function will give the Index of the Last delimited Source Field to pick using "FIELD".

Posted: Thu Mar 09, 2006 1:36 am
by Sunshine2323
Hi,

I have achieved your requirement in the following way

Job Design:

Code: Select all

InputFile----->TransformerA----->InterMediateFile------->TransformerB----->OutputFile
InputFile:
I am reading the input data in a Single Column say Field1

TransformerA
Declaring 2 stage variables in TransformerA
StageVar=Count(DSLink2.Field1,',')
StageVar1=Str(DSLink2.Field1:char(13):char(10), StageVar+1)

Output Derivation for Intermediate File:
I am writing only one field to the InterMediateFile
Field1=Trim(StageVar1,char(10),"T")

TransformerB
Declaring 3 stage variables in TransformerA
StageVar=RowProcCompareWithPreviousValue(Field(DSLink3.Field1,',',1))
Initial Value of StageVar1=1
StageVar1=If StageVar=1 then StageVar1+1 else 1
StageVar2=Field(DSLink3.Field1,',', StageVar1)

Output Derivation for Output File:
Field1=DSLink3.Field1:'$': StageVar2

Testing with Input:

Code: Select all

222,333,4444
235, 5644,453
423,8576,764,353
764,431,533


IntermediateFile:

Code: Select all

222,333,4444
222,333,4444
222,333,4444
235, 5644,453
235, 5644,453
235, 5644,453
423,8576,764,353
423,8576,764,353
423,8576,764,353
423,8576,764,353
764,431,533
764,431,533
764,431,533
Output File:

Code: Select all

222,333,4444$222
222,333,4444$333
222,333,4444$4444
235, 5644,453$235
235, 5644,453$ 5644
235, 5644,453$453
423,8576,764,353$423
423,8576,764,353$8576
423,8576,764,353$764
423,8576,764,353$353
764,431,533$764
764,431,533$431
764,431,533$533
Hope this helps :)