Simple XML Read

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Simple XML Read

Post by dsdoubt »

Hi,

I am having difficulty reading a simple XML file.
Here is the data:

Code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<EMP>
<Record sino="1" name="Mark" sal="100.00" />
</EMP>

The metadata was imported properly.

Job design is as follows.

A Sequential File stage with a single Varchar(255) column reads the XML.
It passes the data to an XML Input stage, with column content set to XML document. The Output tab is loaded with the required table definition.
The output goes to a sequential file in .txt format.

Are any further settings needed?
It gives the following warning and error.

Code:

XML_Input_20,0: Error: Cannot find mandatory property "xml_source_column".
XML_Input_20,0: Error occurred in call to ORPHCallActivePluginInitialize().

I found in a previous post that XML is tricky to use the first time. Is there a one-time setup, like the one we do for ODBC or other databases?
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

OK, the above error occurs when I mark the field (sino) as Key. Otherwise I get the following error.

Code:

XML_Input_20,0: Error: No repetition element specified for link "DSLink22". Please make sure a column that contains the repetition path is set as key in the output link table definition.
Cannot find mandatory property "xml_source_column".
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Any inputs, gentlemen?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Dump the Sequential stage and use an External Source stage to capture just the filename, then point the XML Input stage to that URL/File path. Ernie Ostic has a nice little write-up for this on his blog site.
Last edited by chulett on Fri Mar 28, 2008 5:13 pm, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
pavans
Participant
Posts: 116
Joined: Sun Sep 10, 2006 7:33 am
Location: bangalore, india

Post by pavans »

dsdoubt wrote: Any inputs, gentlemen?
You can have a job design like:

External Source----XML Input------Sequential File/Data set.

Pass the XML File to the File Name field in External source.

Try this.

Good luck.
Thanks,
Pavan
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

dsdoubt wrote: Any inputs, gentlemen?
Be patient, particularly on weekends. Remember that the rest of the world begins their weekend before those in the Americas do.

If it's any consolation, they begin their work week earlier too.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

I used the External Source stage.
I passed the file name to the Source Program property in the External Source stage.
I also tried passing ls <subdirectory>*.xml | sort, as mentioned in Ernie's blog, but I am still getting the following warning.

Code:

XML_Input_20,0: Error: No repetition element specified for link "DSLink22". Please make sure a column that contains the repetition path is set as key in the output link table definition.
Cannot find mandatory property "xml_source_column".
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You need to specify a repetition key - what happens when you do that?
-craig
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Hello Craig,
As I am new to XML, please bear with me. Is it the Key option in the XML Input stage that you are talking about? As you can see in the input file, there are no keys as such. Still, if I mark the first field or all the fields as key in the stage, I get the following error.

Code:

XML_Input_20,0: Error: Cannot find mandatory property "xml_source_column".
XML_Input_20,0: Error occurred in call to ORPHCallActivePluginInitialize().

And by the way, is it just 'cat' that you use in the External Source stage to read the files, or a script?
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You mark one field as the 'repetition key', not all, and that is generally the 'deepest' one, so try 'sal'. And no, while you can cat the contents of the file, the suggestion was to simply use the stage to get the filename. Check Ernie's blog post on the subject:

http://dsrealtime.wordpress.com/2007/12 ... -a-source/
-craig
eostic
Premium Member
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

You're almost there. The repetition element issue can be confusing, but as Craig says, it's the "deepest" element that you find useful in the particular node that you are retrieving on a given link. All of your real data is in attributes, so make the element itself the repetition element.

Here's how you might set up your metadata on the output link:


Record   key=yes   /EMP/Record/
sino               /EMP/Record/@sino
name               /EMP/Record/@name
sal                /EMP/Record/@sal

Later in the job you can drop the "Record" column, as it probably won't be useful any longer except to get the repeats retrieved correctly.
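
For readers who want to see the idea outside DataStage, here is a minimal Python sketch of the mapping above using the standard library's xml.etree.ElementTree. It is only an illustration, not how the XML Input stage is implemented; the names REPEAT_PATH, COLUMNS and extract_rows are made up for this example. The point is that the repetition element (/EMP/Record) produces one output row per match, and each @attribute path fills one column of that row.

```python
import xml.etree.ElementTree as ET

# The sample document from the first post, as bytes so the
# ISO-8859-1 declaration is honoured by the parser.
XML_DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<EMP>
<Record sino="1" name="Mark" sal="100.00" />
</EMP>
"""

# Hypothetical names for this sketch. ElementTree paths are relative to
# the root element, so "/EMP/Record" becomes "Record" once EMP is the root.
REPEAT_PATH = "Record"            # plays the role of the key=yes column
COLUMNS = ["sino", "name", "sal"]  # one attribute per output column


def extract_rows(xml_bytes):
    """Return one dict per repetition element, keyed by column name."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for record in root.findall(REPEAT_PATH):
        # record.get() returns None when an attribute is missing
        rows.append({col: record.get(col) for col in COLUMNS})
    return rows


rows = extract_rows(XML_DOC)
print(rows)
```

With a second <Record ... /> element in the file, the same code would return two rows, which is the behaviour the repetition element is meant to give on the output link.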

Ernie
Ernie Ostic
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Hi Craig,
I was trying ls, as mentioned in my previous post. Since it was giving an error, I tried cat instead. Anyway, I don't need to keep changing that from now on.

Hi Ernie,
I tried that too. It still gives the same error:
Cannot find mandatory property "xml_source_column".

Are there any settings in the XML stage that I need to look at?

Ernie, I am just curious: if you have any simple jobs with XML stages that you created for testing, could you share them?
dsdoubt
Participant
Posts: 106
Joined: Sat Jul 15, 2006 12:17 am

Post by dsdoubt »

Until now I had never selected a value in the XML Source Column drop-down in the XML Input stage. At last it works!!!
Thanks a lot guys!!!
When I cat the file from the External Source stage and use the "XML Document" option instead of "URL/File Path", I expect it to work.
It reads all the lines from the input but doesn't give any output from the XML stage. No warnings either.
Is there a way to capture rejects from the XML stage, so that I can check what the actual issue could be?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

All in but nothing out = bad / improper XPath Expressions.
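
One way to sanity-check path expressions before putting them in the stage is to try them against the sample file in a small script. The sketch below uses Python's xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element, so it is an approximation of what the XML Input stage does, not a replica. The helper name count_matches and the list of candidate paths are made up for this example. Note that a path that matches nothing raises no error; it simply returns an empty result, which is the same "all in, nothing out" symptom.

```python
import xml.etree.ElementTree as ET

XML_DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<EMP>
<Record sino="1" name="Mark" sal="100.00" />
</EMP>
"""


def count_matches(root, path):
    """Number of elements under root selected by an ElementTree path."""
    return len(root.findall(path))


root = ET.fromstring(XML_DOC)

# Paths are relative to the root element <EMP>.
candidates = [
    "Record",      # correct: child element of EMP
    "record",      # wrong case: XML names are case-sensitive
    "EMP/Record",  # wrong: repeats the root, so it looks for EMP inside EMP
    "Record/sal",  # wrong: sal is an attribute, not a child element
]

counts = {path: count_matches(root, path) for path in candidates}
for path, n in counts.items():
    print(f"{path!r}: {n} match(es)")
```

Only the first path finds the record; the other three silently match nothing.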
-craig