Doubt in - Sequence of records processed by DataStage
Moderators: chulett, rschirm, roy
Doubt in - Sequence of records processed by DataStage
Hi all,
I have a doubt in the way the records are processed in DataStage. I have about 100 records coming from the source and there is no filter or other such operations, so I want to have all the 100 records, which I am getting now, but I want to have the same order, how it is present in the source, ie., the first record from the source should be stored in the target table as the first record and so on, till the 100th record.
Can anyone please advice me how to get this in DataStage?
Thanks!
I have a doubt in the way the records are processed in DataStage. I have about 100 records coming from the source and there is no filter or other such operations, so I want to have all the 100 records, which I am getting now, but I want to have the same order, how it is present in the source, ie., the first record from the source should be stored in the target table as the first record and so on, till the 100th record.
Can anyone please advice me how to get this in DataStage?
Thanks!
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
What you want is exactly what DataStage does.
The target database table, however, stores the records in whatever order it wants to. This is a fundamental property of relational databases; that storage order is irrelevant.
It's not a DataStage issue.
To get sorted data out of a table, the rules state that you specify an ORDER BY clause in your query.
The target database table, however, stores the records in whatever order it wants to. This is a fundamental property of relational databases; that storage order is irrelevant.
It's not a DataStage issue.
To get sorted data out of a table, the rules state that you specify an ORDER BY clause in your query.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 301
- Joined: Thu Jul 14, 2005 10:27 am
- Location: Melbourne, Australia
- Contact:
... and if you're writing a sequential file as output (or some other stage where output order might be important) then use a sort Stage to sort by a key value.
If you need to preserve the input order in your output you can generate an appropriate key for the input data. Do a search on @INROWNUM.
J.
If you need to preserve the input order in your output you can generate an appropriate key for the input data. Do a search on @INROWNUM.
J.
<b>John McKeever</b>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
Data Migrators
<b><a href="https://www.mettleci.com">MettleCI</a> - DevOps for DataStage</b>
<a href="http://www.datamigrators.com/"><img src="https://www.datamigrators.com/assets/im ... l.png"></a>
vij - why would you want to maintain any particular order when writing to this SQL server table? Sometimes the order of keys can make a performance difference when loading, but that doesn't sound like your issue.
Basically no matter what the order is of the records you are inserting to SQL Server the same query to the database will return the same results.
Basically no matter what the order is of the records you are inserting to SQL Server the same query to the database will return the same results.
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Since you are using PX edition, the order is not guaranteed. Unless you are processing in sequential mode. Since the records are processed in different CPU's, based on the resource of each CPU the records will be processed, and the order will be changed.
If you strictly need the order to be maintained, you can load based on some sequence key generated during select. It can be even the rowid/rownum.
Interestingly, there is no guaranty that two simple select query will return in the rows in same order.
If you strictly need the order to be maintained, you can load based on some sequence key generated during select. It can be even the rowid/rownum.
Interestingly, there is no guaranty that two simple select query will return in the rows in same order.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Cant he stick a transformer before the database stage and then choose combinability mode as "combinable" from the transforme stage properties ?
But as noted by others, this will still not ensure how the record is stored in the database. But if you are loading to a file then by setting the combinability mode, it should work. Still need confirmation on that from others.
But as noted by others, this will still not ensure how the record is stored in the database. But if you are loading to a file then by setting the combinability mode, it should work. Still need confirmation on that from others.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
The issue here is simply a lack of understanding on how relational databases work. As noted, there is no 'order' to be 'preserved'. If you want to get records back out of a table in the same order you inserted them, you'll need to add additional columns to the table to allow that.vij wrote:I have SQL server table as the target, but even then, the same order is not preserved. Should I enable any maintain order option?
Something like a batch number + record number in the batch would work. Then you can ORDER BY both fields and get things out like you put them in. Or a surrogate key - anything that allows you to ORDER the output properly.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Though the combinable option is enabled, its not guaranteed that the operators will be combined. Unless Dump score is overviewed. And more over this option will not affect the order of the row processing across nodes.
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
Hi all,
Let me explain the query again: It is related to loading data into a SQL SERVER DB table from the sequential file, not retriving the data from the table.
Actually I use a Sequential file to load the table and this sequential file was inturn an ouput from a job, which runs in parallel and here is where the sequence got missed out.
In the job where I am loading the table, the same sequence what I am looking for is maintained.
And now the problem is resolved. Thanks for all your help.
Let me explain the query again: It is related to loading data into a SQL SERVER DB table from the sequential file, not retriving the data from the table.
Actually I use a Sequential file to load the table and this sequential file was inturn an ouput from a job, which runs in parallel and here is where the sequence got missed out.
In the job where I am loading the table, the same sequence what I am looking for is maintained.
And now the problem is resolved. Thanks for all your help.