Hello everyone!
We need to process large XML files (10 MB to 50 MB).
We receive these files from MQ-Series.
Because the files are too large to handle as they are, we compress them with gzip, segment them through MQ-Series, and then retrieve them through DataStage.
But in DataStage, when I use the gunzip utility, it doesn't recognise the format.
To make sure that segmentation is not adding any extra information, I tried segmenting a plain XML file (without compressing it) and passed it to DataStage.
But when I try to process this XML, it is not well formed.
The reason is that the first segment contains half of a tag (something like '<sapven') and the second segment contains the other half ('dorNum>'), and the two halves end up on different lines.
So how can I handle or work around this in DataStage? (According to our MQ experts, it is not possible to control where MQ segments the message.)
handling segmented xml data.
Let me put my question this way.
Say my file looks something like this:
<lostDemand>
<extractDate>2006-01-20</extractDate>
<dcLocation>D985</dcLo
cation>
<bnqCode>24766418</bnqCode>
<e
an>5014957128637</ean>
<orderQty>00120</orderQty>
<issueQty>00090</issueQty>
<numLines>0001</numLines>
<totalOrderQty>00120</totalOrderQty>
</lostDemand>
I am thinking of tricking the data like this:
if trim(Data)[1] = '<' and (Count(trim(Data), '<') <> Count(trim(Data), '>'))
then
concatenate the next line to the present line
else
trim(Data)
How can I achieve this?
Thanks,
Kalpna
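The rule described above (keep appending the next line while the '<' and '>' counts differ) can be sketched outside DataStage as follows. This is only an illustration in Python, not a DataStage routine, and the function name `join_split_tags` is hypothetical. It assumes breaks only ever fall inside a tag, and that the element text contains no literal '<' or '>' characters.

```python
def join_split_tags(lines):
    """Rejoin lines that were cut in the middle of a tag."""
    joined = []
    buffer = ""
    for raw in lines:
        # Append the trimmed fragment to whatever is still pending.
        buffer += raw.strip()
        # Treat the line as complete once every '<' has a matching '>'.
        if buffer.count("<") == buffer.count(">"):
            if buffer:
                joined.append(buffer)
            buffer = ""
    if buffer:
        # A trailing fragment that never balanced is kept as-is.
        joined.append(buffer)
    return joined
```

With the sample above, `<dcLocation>D985</dcLo` and `cation>` come back as one line, as do `<e` and `an>5014957128637</ean>`. A break that falls inside the element text rather than inside a tag would not be detected by this check.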
You are gzipping it and then 'segmenting'? If so, do you know when you have all the pieces? You'll need to put the segments back together and then gunzip the file before you can parse the XML in DataStage. And it seems like the 'putting back together' part could simply be concatenation done by your operating system...
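As a rough sketch of the "put the pieces back together, then gunzip" idea, here is a Python version that works on the raw bytes of each segment. The function name `reassemble_and_gunzip` is hypothetical, and it assumes you already hold all segments, in order, as unmodified binary data.

```python
import gzip


def reassemble_and_gunzip(segments):
    """Concatenate the segments in order, then decompress the result.

    `segments` is a list of bytes objects, one per MQ segment,
    exactly as they were received (no text conversion applied).
    """
    compressed = b"".join(segments)
    # Raises an error if the joined bytes are not a valid gzip stream.
    return gzip.decompress(compressed)
```

If the segments were written to files, they would need to be read in binary mode (for example with `open(path, "rb")`) before being passed in, so that no bytes are altered on the way.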
-craig
"You can never have too many knives" -- Logan Nine Fingers
Not being all that familiar with MQ - if all segments are read as a single message, why segment? When you write out the results to a file, do you get one single file that contains all the segments in the proper order?
Is the problem the fact that there are extra 'new lines' in the file between segments? Could you not strip those out using sed so that you end up with one long record when all is said and done? Then you could process it like 'normal' in DataStage.
-craig
"You can never have too many knives" -- Logan Nine Fingers
chulett wrote: Not being all that familiar with MQ - if all segments are read as a single message, why segment? When you write out the results to a file, do you get one single file that contains all the segments in the proper order?
Because MQ has a limit on the size of a message, we are segmenting the message into 3 or 4 messages. Yes, the file contains the segments in the proper order.
Craig, how can we strip the newline characters from the file using sed?
Thanks in advance,
Kalpna
Talk to one of your UNIX or scripting gurus there and have them help you. sed is a 'stream editor' that supports regular expressions, so you would basically tell it to replace all line-feed characters with an empty string, i.e. remove them.
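For illustration only, the same "replace every line feed with nothing" operation can be sketched in Python instead of sed. The function name `strip_line_breaks` is hypothetical, and the sketch assumes the input is the uncompressed XML text, not the gzipped data.

```python
def strip_line_breaks(text):
    """Remove all carriage returns and line feeds from a string."""
    return text.replace("\r", "").replace("\n", "")
```

Applied to the whole file content, this yields one long record, so a tag that was split across two lines becomes whole again.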
-craig
"You can never have too many knives" -- Logan Nine Fingers
Hello everyone!
Has anyone worked with segmentation before?
Which properties do we need to set in the MQ stage apart from the usual parameters?
Please refer to my first post.
Has anyone tried this before?
I got rid of the newlines in the compressed message retrieved from MQ and tried to unzip it, but the gzip utility does not recognise it.
I also tried without removing the newlines, but that didn't work either.
Any help would be greatly appreciated.
Thanks,
Kalpna
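One thing that may be worth checking, offered as an assumption and not something confirmed in this thread: a gzip stream is binary, so a byte that happens to equal a line feed is part of the compressed data, and deleting "newlines" from it changes the stream. A quick sanity check is whether the reassembled message still begins with the gzip magic bytes 0x1f 0x8b. The helper name `looks_like_gzip` below is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(data):
    """Return True if the bytes begin with the gzip magic number."""
    return data[:2] == GZIP_MAGIC
```

If this check fails on the data as retrieved, before any newline removal, the bytes were probably altered somewhere between the sender and the file, and that would be the place to look first. Passing the check does not prove the rest of the stream is intact.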