New line character issue for sequential file stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

austin_316
Participant
Posts: 80
Joined: Fri Aug 21, 2009 7:49 am
Location: India

New line character issue for sequential file stage

Post by austin_316 »

Hi,
we are facing issue when we used a sequential file stage in on of our jobs.
what we actually doing in the job is we fetch data from a oracle stage and use remove duplicates stage and then store the data in the sequential file stage

Code: Select all

OracleStage1------------>Funnel----->RemoveDuplicates----->Copy----->SequentialFile
                         ^
                         | 
                         |
                         |
                 Oraclestage2
There are some columns in the table which have \n newline charater in the data. So when iam storing the value in sequential file it is taking the record as 2 records instead of 1.
Data in sequential file

Col1||Col2||Col3||Col4
rec1||abd||def||ghi
rec2||abc
asd||kol||llp
rec3||jghg||jghr||kjgng
We are storing this data in sequential file as it is an input for a jar file which we are calling in after job subroutine.
Can anyone please tell me how can we replace this newline character with come combination of characters like ';:;:;:'.
we dont have any transformer stage in the job and we wanted to do this without using the transformer stage.

any help is highly appreciated.

thanks.
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

One thing you can do is , in the source itself you can correct it by using the below query

Code: Select all

UPDATE table_name SET cl_name=REPLACE(cl_name,chr(10),'') where INSTR(cl_name,chr(10))>0
pandeeswaran
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

While writing to Sequential file use the below command at fileter option.

tr -d '\n'
Cheers
Ravi K
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Ravi.K wrote:While writing to Sequential file use the below command at fileter option.

tr -d '\n'
If you use this, all the records in the source table are loaded into the target as single record.
because '\n' between one record and another record also got removed.so everything comes in a single line
pandeeswaran
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

i want to remove the '\n' character in particular field and load into file.
How to achieve this in transformer stage?
i have tried Exchange() and tr -d in filter command.
It's not giving expected result.

Any other suggestions welcome!

Thanks
pandeeswaran
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

An old trick I learned from an assembler programmer that seems good regardless of the envirnoment:

1) Use some other special character for the end-of-record mark, one you are sure will not appear in the rest of the data.

2) Go ahead and remove all '\n' from every record.

3) Convert the end-of-record mark to '\n'.

Good luck.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

FranklinE wrote:An old trick I learned from an assembler programmer that seems good regardless of the envirnoment:

1) Use some other special character for the end-of-record mark, one you are
.
How to do this?
in which stage we need to do this?

Thanks
pandeeswaran
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

It depends on where the '\n' is causing you the trouble. Without seeing your code, I would expect one of two possibilities:

1) If you don't control this until the file stage, that is where you set the alternate value for end-of-record, in Format/Record level/Record delimiter string. After that, you use another job or a Unix script to strip the bad '\n' and finish with converting the end-of-record mark.

2) When you read from Oracle, use a transformer to remove the character. When you get to the file stage, the only place it should appear is at the end of each record because your file stage will put it there.

There are details that you have to figure out in between. Good luck.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

Thanks franklin.

I have planned to remove \n in source oracle stage itself by using replace(col_name,chr(10),'').
pandeeswaran
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Have you checked with your client whether they want to keep those newline characters? Or is yours a site with no data governance processes whatsoever?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
pandeesh
Premium Member
Premium Member
Posts: 1399
Joined: Sun Oct 24, 2010 5:15 am
Location: CHENNAI, TAMIL NADU

Post by pandeesh »

ray.wurlod wrote:Have you checked with your client whether they want to keep those newline characters? Or is yours a site with no data governance processes whatsoever?
Now a days, we are manually updating the particular column and then we are running the job.
it'll be fine if we handle this in source stage itself.
however there ll not be any impact in original table.
Just it ll write in a sequential file without \n
pandeeswaran
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Wait... suddenly this is your thread? You really need to stop doing that. :?
-craig

"You can never have too many knives" -- Logan Nine Fingers
austin_316
Participant
Posts: 80
Joined: Fri Aug 21, 2009 7:49 am
Location: India

New line character issue for sequential file stage

Post by austin_316 »

ray.wurlod wrote:Have you checked with your client whether they want to keep those newline characters? Or is yours a site with no data governance processes whatsoever?
We are actually having this problem. I want to convert those \n to some combination of characters when iam passing it to the sequential file and after the jar file is done working on it i want to convert those combination of characters back to \n. Also our problem is we are not sure in which columns this newline character might appear. so we want to do this conversion for all columns. any way to do this other than transformer? even in transformer i think i have to mention the convert() function for all the column derivations.
samyamkrishna
Premium Member
Premium Member
Posts: 258
Joined: Tue Jul 04, 2006 10:35 pm
Location: Toronto

Re: New line character issue for sequential file stage

Post by samyamkrishna »

Hi,

You can create a wrapper stage

having this tr "\n" "abc" and pass all the column through the wrapper.

Regards,
Samyam
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

That's not how tr works!
:cry:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply