Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.
Moderators: chulett, rschirm, roy
SURA
Premium Member
Posts: 1229 Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney
Post
by SURA » Thu Aug 18, 2016 10:58 pm
Hello All
I would like to understand why the DataStage Sequential File stage is not able to read a file larger than x GB.
Several links suggest the limit is at the OS level and not with DataStage, but my question is slightly different.
I have a 6 GB text file / 18 million records.
OS Windows 2008 R2 / 64 Bit
Code: Select all
SEQ_FILE --> ODBC
DataStage loads only up to 12 million records / about 4 GB of data (supposedly due to the OS limitation).
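Before blaming either tool, it can help to profile the file independently of both. A minimal sketch (the path, delimiter, and chunk size here are assumptions, not details from the post) that counts bytes and records without loading the file into memory:

```python
# A minimal sketch (not from the original post): profile a large delimited
# file outside DataStage to confirm its real byte size and record count.
# The delimiter and chunk size are assumptions; adjust them for the actual file.
import os

def profile_file(path, delimiter=b"\n", chunk_size=8 * 1024 * 1024):
    """Return (total_bytes, record_count) for a delimiter-terminated file."""
    total_bytes = os.path.getsize(path)
    records = 0
    with open(path, "rb") as f:  # binary mode avoids any decoding surprises
        for chunk in iter(lambda: f.read(chunk_size), b""):
            records += chunk.count(delimiter)
    return total_bytes, records
```

If this reports 18 million records but the job stops at 12 million, the limit is in the job or the driver, not in the file itself.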
Same OS with SQL Server 2008.
Code: Select all
SQLSERVER Import wizard to Table.
I can use the import wizard to load the same data file without any limitation issues! (I haven't tried SSIS.)
At the end of the day, I am trying to load the file into a table.
What is the difference in how the file is read by the import wizard vs. DataStage?
Please shed some light.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
ray.wurlod
Participant
Posts: 54607 Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:
Post
by ray.wurlod » Fri Aug 19, 2016 3:53 pm
Have you tried using multiple readers?
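If memory serves, this maps to the Sequential File stage's "Number Of Readers Per Node" option, which appears under Options when the Read Method names a specific file rather than a file pattern; something like:

Code: Select all
Sequential File stage --> Output --> Properties --> Options
    Number Of Readers Per Node = 4

(The value 4 is only an example; tune it to the hardware.)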
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
SURA
Post
by SURA » Sun Aug 21, 2016 1:27 pm
Hadn't thought of that one, Ray.
Thank you so much for your direction; let me try it and get back to you.
Thanks
Ram
Mike
Premium Member
Posts: 1021 Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL
Post
by Mike » Sun Aug 21, 2016 2:53 pm
First thing to do is isolate the issue. I doubt that reading from the sequential file is a problem.
Create a copy of your job that looks like this:
Code: Select all
SEQ_FILE --> COPY
I would suspect the ODBC driver or the ODBC DSN set up before I would suspect the sequential file read.
Mike
SURA
Post
by SURA » Sun Aug 21, 2016 5:39 pm
Thanks Mike
I will check this too.
Thanks
Ram
SURA
Post
by SURA » Sun Aug 21, 2016 11:50 pm
Thanks Mike.
It could be the ODBC limit; as suggested, the job with the Copy stage pulled all the rows.
Thanks Ray
You are 100% right; when I increased the number of readers as suggested, it solved the problem.
Thank you so much guys for your valuable input.
Another good lesson for me. :D
Thanks
Ram
SURA
Post
by SURA » Mon Aug 29, 2016 6:49 pm
Sorry guys
I can't believe it; all of a sudden, the same job is failing again.
This time I am getting the error below.
SRC_TRANS,0: Error reading on import.
SRC_TRANS,0: Consumed more than 100000 bytes looking for record delimiter; aborting
Any changes?
No changes to the server, job, or OS...
Then how did this job run successfully and load the same file before?
I have no clue!
I found the IBM suggestion below, but it didn't work.
http://www-01.ibm.com/support/docview.w ... wg21651999
On the day Mike suggested loading the data through a Copy stage, that job ran successfully, but today it is also failing with the same error.
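Since the abort happens while scanning for the record delimiter, one quick check (a sketch; the path is a placeholder, not from the post) is to see which delimiter bytes actually occur near the start of the file:

```python
# A minimal sketch (path is a placeholder): look at the first megabyte of a
# file and count candidate record delimiters. Many NUL bytes often mean a
# UTF-16 export; zero delimiters in the sample would explain an abort while
# "looking for record delimiter".
def inspect_delimiters(path, sample_bytes=1024 * 1024):
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)
    crlf = sample.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf_only": sample.count(b"\n") - crlf,   # bare LF (UNIX style)
        "cr_only": sample.count(b"\r") - crlf,   # bare CR (old Mac style)
        "nul_bytes": sample.count(b"\x00"),      # hints at UTF-16 encoding
    }
```

If the counts disagree with the record delimiter configured in the stage's format settings, that mismatch would produce exactly this error.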
Thanks
Ram
chulett
Charter Member
Posts: 43085 Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO
Post
by chulett » Mon Aug 29, 2016 7:07 pm
Then it is no longer the 'same file' if suddenly it can't find the record delimiter.
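One way to test the "same file" assumption (a sketch; the path is a placeholder) is to checksum the file in chunks and compare the digest against one taken when the job last succeeded:

```python
# A minimal sketch (path is a placeholder): MD5-checksum a large file in
# chunks so today's copy can be compared byte-for-byte with an earlier one.
import hashlib

def file_md5(path, chunk_size=8 * 1024 * 1024):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Two identical digests mean the bytes have not changed and the difference lies elsewhere (job, driver, environment); different digests settle the argument immediately.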
-craig
"You can never have too many knives" -- Logan Nine Fingers
SURA
Post
by SURA » Mon Aug 29, 2016 7:12 pm
Thanks craig.
But I am 100% sure there are no changes in the file, because the same file I used before is still sitting in the project folder on my PC.
I was doing end-to-end testing and didn't use the large file, to minimize the load time. Now that my testing is done, I tried to use the same large file again.
Anyhow, let's see.
Thanks
Ram
chulett
Post
by chulett » Tue Aug 30, 2016 5:19 am
Something is different, obviously. I'd be curious what you end up tracking down.
SURA
Post
by SURA » Tue Aug 30, 2016 5:41 pm
Yes, that's right.
If I recall correctly, as I mentioned in my initial post, there are 18 million+ records in total. On the first load, the ETL successfully loaded 12 million records but didn't give any warning or error about not loading the rest of the records.
Is that how the ETL tool should behave? I don't know!
Anyhow, I will track this down.
Thanks
Ram