Export -- xml vs .dsx format


Moderators: chulett, rschirm, roy

nkreddy
Premium Member
Posts: 23
Joined: Mon Jun 21, 2004 7:12 am
Location: New York


Post by nkreddy »

Hello,

I was wondering what the advantage is of exporting a project in xml format rather than the proprietary .dsx format. Any clarification would be appreciated.

Thanks
kduke
Charter Member
Posts: 5227
Joined: Thu May 29, 2003 9:47 am
Location: Dallas, TX

Post by kduke »

I think you said it. One is available to any person who knows xml. That gives some flexibility, maybe using it to create metadata. I don't know of anyone who has done that yet.

I think that dsx never has a problem importing and I am just more comfortable using it. It has been around a lot longer.

Most of us use these only to back up our jobs, so I always use dsx. Until I become better at XML, I think I will stick to dsx.
Mamu Kim
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Ditto what Kim said. I use the dsx format for backups and import/export. On occasion, I'll pull a full xml export and stash it away somewhere on my UNIX server and use it to search for metadata, typically grepping for tablenames or routines when I want to find out which jobs they are used in.
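For anyone who wants more than a raw grep, here is a minimal Python sketch of the same idea: report which exported objects mention a given table or routine name. It assumes the export has one top-level element per exported object and that each carries an `Identifier` attribute; the sample document, the `TableName` property and the function name `jobs_mentioning` are made up for illustration, so check them against a real export before relying on this.

```python
import xml.etree.ElementTree as ET

def jobs_mentioning(xml_text, needle):
    """Return identifiers of top-level objects whose content mentions `needle`."""
    root = ET.fromstring(xml_text)
    needle = needle.lower()
    hits = []
    for obj in root:  # assumption: each direct child is one exported object
        # Collect every text node and attribute value under this object.
        parts = []
        for el in obj.iter():
            parts.append(el.text or "")
            parts.extend(el.attrib.values())
        if needle in " ".join(parts).lower():
            # Fall back to the tag name if there is no Identifier attribute.
            hits.append(obj.get("Identifier", obj.tag))
    return hits

# Hypothetical sample, not taken from a real export.
SAMPLE = """<DSExport>
  <Job Identifier="LoadCustomers">
    <Record Identifier="V0S1" Type="CustomStage">
      <Property Name="TableName">CUSTOMER_DIM</Property>
    </Record>
  </Job>
  <Job Identifier="LoadOrders">
    <Record Identifier="V0S1" Type="CustomStage">
      <Property Name="TableName">ORDER_FACT</Property>
    </Record>
  </Job>
</DSExport>"""

print(jobs_mentioning(SAMPLE, "customer_dim"))  # prints ['LoadCustomers']
```

Unlike grep, this returns the name of the enclosing job rather than the matching line, which is usually what you are after.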

The first time you take a large xml export and try to re-import it, you'll realize why you want to stick with .dsx for something like that. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ogmios
Participant
Posts: 659
Joined: Tue Mar 11, 2003 3:40 pm

Post by ogmios »

Just a little footnote: on the client side you have the application xml2dsx which will allow you to convert DataStage xml to dsx files.

If you want to do your own processing on jobs, the XML format is probably easier to handle in Perl/Java/... than the dsx format. For all the rest, stay with dsx.
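As a rough illustration of that kind of processing, here is a small Python sketch that streams through an export and lists each job with its record count, without loading the whole file into memory. The element and attribute names (`Job`, `Record`, `Identifier`) are assumptions about the export layout, and the sample data and the function name `list_jobs` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def list_jobs(source):
    """Yield (job identifier, number of records) for each Job element."""
    # iterparse accepts a filename or a binary file object.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Job":  # assumption: jobs are exported as <Job> elements
            yield elem.get("Identifier"), len(elem.findall("Record"))
            elem.clear()  # drop the finished job's children to keep memory down

# Hypothetical sample, not taken from a real export.
SAMPLE = """<DSExport>
  <Job Identifier="LoadCustomers">
    <Record Identifier="ROOT" Type="JobDefn"/>
    <Record Identifier="V0S1" Type="CustomStage"/>
  </Job>
  <Job Identifier="LoadOrders">
    <Record Identifier="ROOT" Type="JobDefn"/>
  </Job>
</DSExport>"""

print(list(list_jobs(io.BytesIO(SAMPLE.encode("utf-8")))))
# prints [('LoadCustomers', 2), ('LoadOrders', 1)]
```

For a real export you would pass the file path instead of the `BytesIO` object.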

Ogmios
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

ogmios wrote: on the client side you have the application xml2dsx which will allow you to convert DataStage xml to dsx files.
Which is run automatically when you import an xml export. This step can be rather painful for your PC to run, hence my comment.
-craig

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

DSX usually gives a smaller export file than XML.

Since I often have to email these, that's an important consideration for me, as some of my interlocutors use dial-up connections.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
clshore
Charter Member
Posts: 115
Joined: Tue Oct 21, 2003 11:45 am

Post by clshore »

At my client site, they use ClearCase. It doesn't play nicely with a *.dsx file, but is OK with *.xml, hence that's what we export for checkin/checkout.

I also prefer *.dsx for my own uses.

Carter
Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

Can someone please expand on why ClearCase doesn't play nicely with .dsx files?

We are about to install ClearCase to control things like unix scripts and java code, and would like to also use it for Datastage jobs (we are using Datastage PX, v7.1).

We will need the ability to merge changes made by different developers to the same datastage job.

Version Control doesn't seem to do it.
ClearCase can do a merge, but I'm worried about whether it can handle .dsx (or even .xml) files, especially since the "elements" of the code can be in any order within the file.
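To make the ordering concern concrete, here is a minimal Python sketch that compares two xml exports object by object, so the position of an object within the file does not matter. This is not a merge and it is not anything ClearCase does; it only reports which objects were added, removed or changed. It assumes each top-level element has an `Identifier` attribute, and the sample data and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def index_export(xml_text):
    """Map (tag, Identifier) of each top-level object to its serialized XML."""
    root = ET.fromstring(xml_text)
    return {
        (obj.tag, obj.get("Identifier")): ET.tostring(obj, encoding="unicode").strip()
        for obj in root
        if obj.get("Identifier") is not None  # skip elements such as a header
    }

def compare_exports(old_xml, new_xml):
    """Report objects added, removed or changed, ignoring their order in the file."""
    old, new = index_export(old_xml), index_export(new_xml)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }

# Hypothetical samples: same jobs in a different order, one edited, one new.
OLD = ('<DSExport>'
       '<Job Identifier="JobA"><Property Name="Description">v1</Property></Job>'
       '<Job Identifier="JobB"><Property Name="Description">v1</Property></Job>'
       '</DSExport>')
NEW = ('<DSExport>'
       '<Job Identifier="JobB"><Property Name="Description">v2</Property></Job>'
       '<Job Identifier="JobA"><Property Name="Description">v1</Property></Job>'
       '<Job Identifier="JobC"/>'
       '</DSExport>')

print(compare_exports(OLD, NEW))
# prints {'added': [('Job', 'JobC')], 'removed': [], 'changed': [('Job', 'JobB')]}
```

The comparison is on the serialized text of each object, so differences in whitespace or in modification timestamps inside an object would also show up as "changed".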

Words of Wisdom would be much appreciated!

Thanks,

- g
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Gazelle wrote: We will need the ability to merge changes made by different developers to the same datastage job.
Curious how you are managing this. Version Control doesn't support this functionality because DataStage doesn't either. :? Only one developer can have a job open at any given time, so unless you are going out of your way with export/import... there won't be anything to 'merge'.
-craig

Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

The actual structure is still being debated, but there may be multiple Datastage Projects, with some "common" jobs. If the common jobs are changed, then the changes will need to be consolidated before they are released to the production environment.

But you are right; if we cannot easily consolidate changes, then we may need to keep one Project, or be very disciplined with who changes "common" jobs.

Has anyone worked in such a "parallel development" project?
How were changes "merged"?

- g
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Best practice in this case is to make the jobs as atomic as possible, so that there's never any need to merge except at the control (job sequence) level. IMHO.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Agreed. You really shouldn't have any jobs so large that they require multiple developers to work on different parts of them.

We manage many projects as well, currently around 15, primarily divided by subject area. We also have common jobs and routines, but there are rules in place for where they are modified. We've settled on a main 'home' project for common objects, which is the only place changes are allowed. Any changes made are then propagated out to the other projects, where they are (typically) made read only. Version Control helps make this process fairly painless.
-craig

dsxdev
Participant
Posts: 92
Joined: Mon Sep 20, 2004 8:37 am

Post by dsxdev »

Hi
Though a .xml file is larger than a .dsx file, it is much easier to read and go through. A .xml export of a DataStage job can be easily formatted and is more readable.

It also has the advantage that the code and metadata can be integrated into some other code for parsing. This is not possible with a .dsx file.
Happy DataStaging
Gazelle
Premium Member
Posts: 108
Joined: Mon Nov 24, 2003 11:36 pm
Location: Australia (Melbourne)

Post by Gazelle »

It is not that the jobs are so large that they need multiple developers, but that the development will be split into separate projects.
With some jobs, since we are using PX, we may deliberately "combine" jobs into one large job to minimise the number of times the data lands to disk... but I imagine that this will be done by a single developer.

I like the idea of creating a separate project for "common" jobs, and sending "read-only" copies of the job to the other projects. I'll have to hit the manuals and work out how to do it! Thank you all for your comments.

Regarding the use of xml:
Has anyone experienced problems with using ClearCase, with either *.dsx files or *.xml files?
It looks like xml is preferred, since it traps metadata changes. If we export the routines, do they also get included in the xml file?

Thanks,

- g
jzparad
Charter Member
Posts: 151
Joined: Thu Apr 01, 2004 9:37 pm

Post by jzparad »

At the site I'm currently working at, there are a whole lot of QA standards used with DS jobs (e.g. all stages must be commented, all stage variables must be commented, short and long job descriptions must be filled in.)

Normally you would have to open various DS objects to check all these when doing a review. I've found that the XML export makes this much easier, especially since it allows you to define a reference to an XSLT document.

I wrote an XSLT document that looks for all the requirements and highlights any contraventions to the standard.

It's a lot quicker and easier.
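The XSLT itself is not reproduced here. Purely to illustrate the kind of check involved, here is a minimal sketch in Python (not XSLT) that flags jobs whose short or long description is missing or blank. The layout is an assumption: a `Record` with `Identifier="ROOT"` inside each `Job`, holding `Property` elements named `Description` and `FullDescription`. The sample data and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed property names for the short and long job descriptions.
REQUIRED = ("Description", "FullDescription")

def missing_descriptions(xml_text):
    """Return (job identifier, property name) pairs that are absent or blank."""
    root = ET.fromstring(xml_text)
    problems = []
    for job in root.iter("Job"):
        # Assumption: job-level properties live in the record identified as ROOT.
        job_root = job.find("Record[@Identifier='ROOT']")
        for name in REQUIRED:
            prop = None
            if job_root is not None:
                prop = job_root.find(f"Property[@Name='{name}']")
            if prop is None or not (prop.text or "").strip():
                problems.append((job.get("Identifier"), name))
    return problems

# Hypothetical sample, not taken from a real export.
SAMPLE = """<DSExport>
  <Job Identifier="LoadCustomers">
    <Record Identifier="ROOT" Type="JobDefn">
      <Property Name="Description">Loads customers</Property>
      <Property Name="FullDescription">Nightly load of the customer dimension.</Property>
    </Record>
  </Job>
  <Job Identifier="LoadOrders">
    <Record Identifier="ROOT" Type="JobDefn">
      <Property Name="Description">Loads orders</Property>
      <Property Name="FullDescription"></Property>
    </Record>
  </Job>
</DSExport>"""

print(missing_descriptions(SAMPLE))  # prints [('LoadOrders', 'FullDescription')]
```

Checks for stage and stage-variable comments would follow the same pattern once you know which properties hold them in your export.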
Jim Paradies