Hi,
I have a huge chunk of data that I need to write into different files based on a key. Which stage does this?
The primary aim is to sort the entire data, but Unix sort fails because the file is huge. Hence the plan: divide the data into different files, apply the transformation, then concatenate the files.
I need to hash the incoming data into different sequential files. How?
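Outside DataStage, the hash-into-files step described above could be sketched as follows. This is a minimal sketch, not a definitive implementation; the bucket count, the pipe delimiter, and the key being the first column are all assumptions you would adjust to your data.

```python
import zlib

NUM_BUCKETS = 4   # assumed bucket count; match it to your file/node layout
KEY_COLUMN = 0    # assumed: the key is the first pipe-delimited column

def bucket_of(key: str) -> int:
    # crc32 is a stable hash, so equal keys always land in the same file
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS

def split_by_key(in_path: str) -> None:
    """Stream the big file once, appending each record to its bucket file."""
    outs = [open(f"bucket_{i}.seq", "w") for i in range(NUM_BUCKETS)]
    try:
        with open(in_path) as src:
            for line in src:
                key = line.rstrip("\n").split("|")[KEY_COLUMN]
                outs[bucket_of(key)].write(line)
    finally:
        for f in outs:
            f.close()
```

Because the hash is stable, every record with the same key ends up in the same sequential file, so each file can then be transformed independently.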
What is the purpose of the Unix sort? Are you doing it outside DataStage?
I would avoid 'split', as it divides by size and line count rather than by content.
Create a new configuration file with resources matching your current job's needs and use it in your job.
As ArndW suggested, use PX partitioning. If you know the values you want to split by, you can run the job as multi-instance, with each instance handling one value.
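If the goal really is one globally sorted output, note that concatenating independently sorted hash buckets does not give a global order; the sorted chunks must be *merged*. A hedged sketch of that chunk-sort-then-merge idea (file names and delimiter are assumptions, not from the thread):

```python
import heapq

def sort_chunk(in_path: str, out_path: str) -> None:
    """Sort one manageable chunk entirely in memory."""
    with open(in_path) as f:
        lines = f.readlines()
    lines.sort()
    with open(out_path, "w") as f:
        f.writelines(lines)

def merge_sorted(chunk_paths, out_path: str) -> None:
    """K-way merge of already-sorted chunk files (like Unix `sort -m`)."""
    files = [open(p) for p in chunk_paths]
    try:
        with open(out_path, "w") as out:
            # heapq.merge streams the inputs, so memory stays small
            for line in heapq.merge(*files):
                out.write(line)
    finally:
        for f in files:
            f.close()
```

The same effect is available from GNU sort itself: sort each chunk separately, then combine them with `sort -m`, which merges pre-sorted files without re-sorting.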