anil411
Premium Member
Posts: 53 Joined: Thu Aug 11, 2005 8:34 am
by anil411 » Thu Oct 29, 2015 8:34 am
We are reading the XML file below. The last column (CondoProjectName) contains a special character.
Code: Select all
<?xml version="1.0" encoding="UTF-8"?>
<typ:UCDPReceiveAppraisalRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:typ="http://receiveappraisal.company.com/schema/types">
<typ:CurrentFileSequenceNumber>1</typ:CurrentFileSequenceNumber>
<typ:LastFileSequenceNumber>0</typ:LastFileSequenceNumber>
<typ:DocumentFileIDRecordCount>1</typ:DocumentFileIDRecordCount>
<typ:SyntheticTestIndicator>true</typ:SyntheticTestIndicator>
<typ:RequestSubmitDateTime>2008-09-18T21:49:45</typ:RequestSubmitDateTime>
<typ:RequestBeginDateTime>2014-09- 18T19:18:33</typ:RequestBeginDateTime>
<typ:RequestEndDateTime>2006-08-19T13:27:14-04:00</typ:RequestEndDateTime>
<typ:VendorName>XYZ Support</typ:VendorName>
<typ:DocumentFiles>
<typ:DocumentFile>
<typ:DocumentFileID>FILEID-1</typ:DocumentFileID>
<typ:DocumentFileStatus>Successful</typ:DocumentFileStatus>
<typ:AppraisalRecordCount>1</typ:AppraisalRecordCount>
<typ:Appraisals>
<typ:Appraisal>
<typ:DocumentID>DOCID-1</typ:DocumentID>
<typ:DocumentType>2</typ:DocumentType>
<typ:SubmissionStatus>In Progress</typ:SubmissionStatus>
<typ:RawPropertyStreetAddress>1234 Any Street Drive</typ:RawPropertyStreetAddress>
<typ:RawPropertyUnitNumber>11</typ:RawPropertyUnitNumber>
<typ:RawPropertyCity>Vienna</typ:RawPropertyCity>
<typ:RawPropertyState>VA</typ:RawPropertyState>
<typ:RawPropertyZipCode>22102</typ:RawPropertyZipCode>
<typ:RawAppraiserLicenseNumber>string</typ:RawAppraiserLicenseNumber>
<typ:RawAppraiserStateNumber>string</typ:RawAppraiserStateNumber>
<typ:RawAppraiserCertificationNumber>string</typ:RawAppraiserCertificationNumber>
<typ:ScrubbedAppraiserLicenseNumber>string</typ:ScrubbedAppraiserLicenseNumber>
<typ:RawSupervisorAppraiserLicenseNumber>string</typ:RawSupervisorAppraiserLicenseNumber>
<typ:RawSupervisorCertificationNumber>string</typ:RawSupervisorCertificationNumber>
<typ:ScrubbedSupervisorAppraiserLicenseNumber>string</typ:ScrubbedSupervisorAppraiserLicenseNumber>
<typ:AppraisedValueOfSubjectProperty>1000.00000000000</typ:AppraisedValueOfSubjectProperty>
<typ:EffectiveDateOfAppraisal>2002-11-05-05:00</typ:EffectiveDateOfAppraisal>
<typ:AppraisalFormNumberType>Small Residential Income Property Appraisal Report</typ:AppraisalFormNumberType>
<typ:AssignmentType>Purchase</typ:AssignmentType>
<typ:AssignmentTypeOther>string</typ:AssignmentTypeOther>
<typ:PropertyRightsAppraised>Fee Simple</typ:PropertyRightsAppraised>
<typ:PropertyRightsAppraisedOther>string</typ:PropertyRightsAppraisedOther>
<typ:CondoProjectName>‘Foothills Addition</typ:CondoProjectName>
</typ:Appraisal>
</typ:Appraisals>
</typ:DocumentFile>
</typ:DocumentFiles>
</typ:UCDPReceiveAppraisalRequest>
After reading the XML input file, we write it to a Dataset. In the output, the value of CondoProjectName is written as
"^ZFoothills Addition"
If I change the encoding from UTF-8 to ISO-8859-1, the data matches, but we receive the file from external agencies.
We want to continue using UTF-8 and have the data match between the XML and the Dataset. Please let me know if anybody has faced a similar issue.
We appreciate your help.
ray.wurlod
Participant
Posts: 54607 Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
by ray.wurlod » Thu Oct 29, 2015 8:49 am
Are you certain that there is no ^Z character in the source data? (Note, too, that this is the DOS end-of-file marker.)
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
by anil411 » Thu Oct 29, 2015 11:38 am
Ray,
We don't have a ^Z character in the source data.
The source data is as below.
<typ:CondoProjectName>‘Foothills Addition </typ:CondoProjectName>
The output in the Dataset is as below.
"^ZFoothills Addition "
Please advise.
by ray.wurlod » Thu Oct 29, 2015 9:21 pm
Can you please advise what the actual values (hex codes) are for the first three source characters after the tag? ^Z is 0x1A (which may help you).
It appears that you are not using a compatible character map between source and target.
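One way to answer the hex-code question is to dump the raw bytes that follow the opening tag. The sketch below is only an illustration in Python, not part of any DataStage job; the helper names `bytes_after_tag`, `hex_dump` and `can_encode` are hypothetical. It assumes the special character is U+2018 (left single quotation mark), as it appears in the posted sample. Note that 0x1A is the ASCII SUB (substitute) control character, which some converters emit when a character has no mapping in the target character set; whether that is what happens here is an assumption, not something the thread confirms.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def bytes_after_tag(raw: bytes, tag: bytes, count: int = 3) -> bytes:
    """Return the first `count` raw bytes that follow the first occurrence of `tag`."""
    start = raw.index(tag) + len(tag)
    return raw[start:start + count]


def can_encode(text: str, codec: str) -> bool:
    """Report whether every character of `text` has a mapping in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Assumption: the character before "Foothills" is U+2018, as in the posted XML.
quote = "\u2018"
sample = ("<typ:CondoProjectName>" + quote + "Foothills Addition").encode("utf-8")

print("after tag (UTF-8):", hex_dump(bytes_after_tag(sample, b"<typ:CondoProjectName>")))
print("as cp1252        :", hex_dump(quote.encode("cp1252")))
print("fits ISO-8859-1  :", can_encode(quote, "iso-8859-1"))
```

In a real file, read the bytes with `open(path, "rb")` and pass them to `bytes_after_tag`. Three bytes `E2 80 98` would indicate genuine UTF-8, while a single `91` would suggest the file is really Windows-1252 with a UTF-8 label.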
chulett
Charter Member
Posts: 43085 Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO
by chulett » Thu Oct 29, 2015 9:35 pm
anil411 wrote: If i change encoding from UTF-8 to ISO-8859-1 , The Data matches but we receive the file from external agencies.
I don't understand the "but" here. Are you saying that it works if you change the encoding in the job? Or if you change the XML declaration on the first line of the file?
-craig
"You can never have too many knives" -- Logan Nine Fingers
by anil411 » Fri Oct 30, 2015 6:27 am
Chulett,
If encoding="UTF-8" is in the first line of the XML, the data has the issue.
If encoding="ISO-8859-1" is in the first line of the XML, the data matches between source and target.
I can't change the NLS settings, as they are disabled in our project.
Is there any function to resolve this issue?
Thank you,
by chulett » Fri Oct 30, 2015 6:51 am
Then it seems to me you have two options. One is to ask the vendor to make the change. The second is to pre-process the file and change it yourself using something like awk / sed / perl. This could be built in as a before-job process or accomplished via a Sequence job.
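The pre-processing option could look roughly like the sketch below. It uses Python rather than awk / sed / perl purely for illustration; the function names `relabel_encoding` and `preprocess` and the default target encoding are assumptions. It rewrites only the encoding label in the XML declaration and leaves the payload bytes untouched, so it is appropriate only if the bytes really are in the encoding you relabel to. If the file is genuine UTF-8, relabelling it would make multi-byte characters be read as several single-byte ones.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very start
# of the data. A leading byte-order mark would prevent the match.
_DECL = re.compile(rb'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def relabel_encoding(raw: bytes, new_encoding: str = "ISO-8859-1") -> bytes:
    """Rewrite only the encoding label in the XML declaration, if one is present."""
    def _swap(match: "re.Match[bytes]") -> bytes:
        quote = match.group(2)
        return match.group(1) + quote + new_encoding.encode("ascii") + quote

    return _DECL.sub(_swap, raw, count=1)


def preprocess(src_path: str, dst_path: str, new_encoding: str = "ISO-8859-1") -> None:
    """Copy src_path to dst_path with the declared encoding relabelled."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(relabel_encoding(raw, new_encoding))
```

A script like this could be called from a before-job subroutine or from a command activity in a Sequence job, with the job then reading the rewritten copy instead of the vendor's original.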
eostic
Premium Member
Posts: 3838 Joined: Mon Oct 17, 2005 9:34 am
by eostic » Mon Nov 02, 2015 6:51 am
I am fairly surprised that the data isn't being escaped as a numeric character reference such as &#xNNNN;, where NNNN is the hex code point of the character in question. That might be an option for you as you consider editing the overall file as Craig suggests.
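XML numeric character references identify a character by its Unicode code point, either in decimal (`&#8216;`) or in hex (`&#x2018;`). As a hedged illustration of the escaping idea, the Python sketch below replaces every non-ASCII character in a string with such a reference; the helper names are hypothetical, and the sample value assumes the special character is U+2018 as in the posted XML.

```python
def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def escape_non_ascii_hex(text: str) -> str:
    """Same idea, but emit hexadecimal references of the form &#xNNNN;."""
    return "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text)


value = "\u2018Foothills Addition"
print(escape_non_ascii(value))      # &#8216;Foothills Addition
print(escape_non_ascii_hex(value))  # &#x2018;Foothills Addition
```

Because the escaped text is pure ASCII, it reads the same under UTF-8 and ISO-8859-1, so the declared encoding no longer affects how the parser sees this character.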
Ernie
ds_developer
Premium Member
Posts: 224 Joined: Tue Sep 24, 2002 7:32 am
Location: Denver, CO USA
by ds_developer » Thu Nov 12, 2015 10:05 am
What datatype (in DS) are you using for this field? I just did this successfully without changing the encoding="UTF-8" designation by using the NVarChar datatype. No changes to the NLS settings were needed either.