XML source File -> EXTRACT DATA

Pikty
Participant
Posts: 13
Joined: Mon Nov 12, 2012 8:32 am

Post by Pikty »

Hi experts,
I need your help.
I'm currently working on DS 8.1 installed on a Unix server, with the client running on Windows.

In a parallel job I would like to load an XML file that uses the structure below:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" generated="2012-11-08T16:01:23">
	<T0041DIST>
		<Kantoor>499</Kantoor>
		<KantoorTaalregime>3</KantoorTaalregime>
		<Directie>2</Directie>
	</T0041DIST>
	<T0041DIST>
		<Kantoor>500</Kantoor>
		<KantoorTaalregime>2</KantoorTaalregime>
		<Directie>0</Directie>
	</T0041DIST>
	<T0041DIST>
		<Kantoor>501</Kantoor>
		<KantoorTaalregime>2</KantoorTaalregime>
		<Directie>3</Directie>
	</T0041DIST>
        .....
</dataroot>

I would like to extract all the data from the XML file and save it in a flat file, a database table, or any other target, laid out as below:

Code:

Kantoor    KantoorTaalregime    Directie
499        3                    2
500        2                    0
501        2                    3
....

Currently I'm able to load the file only when it contains a single <T0041DIST> node (see the XML structure below), using the following stages:

Sequential File -> XML Input -> Sequential File

Code:

<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" generated="2012-11-08T16:01:23">
	<T0041DIST>
		<Kantoor>499</Kantoor>
		<KantoorTaalregime>3</KantoorTaalregime>
		<Directie>2</Directie>
	</T0041DIST>
</dataroot>

But how do I read the file when the main node is repeated, and extract all the data?

Thanks in advance for your help.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

First off, do not use a Sequential File stage to read your XML; use the External Source stage, as noted here. I'd also be curious whether you marked one element as the 'repetition key' by flagging it as a key column. If not, try making one of the lowest-level elements the key, say "Directie". Once set up correctly, it should all just happen automagically.
-craig
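
For anyone who wants to sanity-check what the job should produce, the "one output row per repeating element" idea can be sketched outside DataStage. This is only an illustration using Python's standard library, not how the XML Input stage works internally. The function names `extract_rows` and `to_delimited` and the pipe delimiter are assumptions made for the example, and the sample data is just the three records shown in the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample copied from the question (only the three records shown there).
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<dataroot xmlns:od="urn:schemas-microsoft-com:officedata" generated="2012-11-08T16:01:23">
  <T0041DIST>
    <Kantoor>499</Kantoor>
    <KantoorTaalregime>3</KantoorTaalregime>
    <Directie>2</Directie>
  </T0041DIST>
  <T0041DIST>
    <Kantoor>500</Kantoor>
    <KantoorTaalregime>2</KantoorTaalregime>
    <Directie>0</Directie>
  </T0041DIST>
  <T0041DIST>
    <Kantoor>501</Kantoor>
    <KantoorTaalregime>2</KantoorTaalregime>
    <Directie>3</Directie>
  </T0041DIST>
</dataroot>
"""

COLUMNS = ["Kantoor", "KantoorTaalregime", "Directie"]


def extract_rows(xml_bytes, record_tag="T0041DIST", columns=COLUMNS):
    """Return one list of column values per occurrence of record_tag."""
    root = ET.fromstring(xml_bytes)
    rows = []
    # Each repeated <T0041DIST> child of the root becomes one output row.
    for record in root.findall(record_tag):
        rows.append([record.findtext(col, default="") for col in columns])
    return rows


def to_delimited(rows, columns=COLUMNS, delimiter="|"):
    """Render a header line plus the rows as delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_delimited(extract_rows(SAMPLE_XML)), end="")
```

Comparing this output with the target file from the job is a quick way to confirm that every repeated node was picked up rather than only the first one.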

"You can never have too many knives" -- Logan Nine Fingers