How to Split the Huge Data?
Hi,
I have a 4GB source file that I split into two 2GB halves. My filename is CALL_HIST_DETAILS.txt, and this is what I ran:

Code: Select all
du -sh CALL_HIST_DETAILS.txt    # reports the file size (4GB)
wc -l CALL_HIST_DETAILS.txt     # total line count is 9868002
head -4934001 CALL_HIST_DETAILS.txt > test.txt
tail -4934001 CALL_HIST_DETAILS.txt > test1.txt

The tail half (test1.txt) works fine and the job can read its data. But the head half (test.txt) fails: the job cannot read the data and throws an error.

Can anyone tell me why the head half (test.txt) is failing? Is this the correct way to split the data?
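For reference, a minimal sanity check, assuming GNU coreutils (the -n forms below are just the modern spellings of the same head/tail commands):

Code: Select all
# First half: lines 1..4934001
head -n 4934001 CALL_HIST_DETAILS.txt > test.txt
# Second half: everything from line 4934002 onward
tail -n +4934002 CALL_HIST_DETAILS.txt > test1.txt
# The two halves should add back up to 9868002 lines
wc -l test.txt test1.txt CALL_HIST_DETAILS.txt

If the counts add up, the split itself is fine and the error is more likely about what is inside the first records of test.txt.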
Hi crystal,

I am getting the error below:

Sequential_File_0,0: Error reading on import.
Sequential_File_0,0: Consumed more than 100,000 bytes looking for record delimiter; aborting
Sequential_File_0,0: Import error at record 0.
Sequential_File_0,0: The runLocally() of the operator failed.

The tail half still works fine. Is there an alternative way to split the data?
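One quick way to see what the import is choking on is to dump the first bytes of the failing half with control characters made visible; a small sketch, assuming a Unix-like shell:

Code: Select all
# Show the first 200 bytes of test.txt; \n, \r, \0 etc. are printed visibly
head -c 200 test.txt | od -c

If no \n shows up where the first record should end, the record delimiter configured in the Sequential File stage does not match what is actually in the file.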
Use the split command to split the file:

Code: Select all
split -l 4934001 CALL_HIST_DETAILS.txt test
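split names its output pieces testaa, testab, ... in the current directory. Assuming GNU split defaults, the result can be verified with:

Code: Select all
wc -l testaa testab    # each piece should show 4934001 lines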
You are the creator of your destiny - Swami Vivekananda
Hi,
You can try the sed command:
Code: Select all
# Grab the line count (first field of wc's output)
END=`wc -l CALL_HIST_DETAILS.txt | awk '{print $1}'`
# Print lines 1 through 4934001 into the first half
sed -n '1,4934001p' CALL_HIST_DETAILS.txt > test.txt
# Print lines 4934002 through the last line into the second half
sed -n '4934002,'$END'p' CALL_HIST_DETAILS.txt > test1.txt
Thanks
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
A few changes to prasson_ibm's code:
Code: Select all
# Redirecting into wc leaves the filename out of its output, so awk is no longer needed
END=`wc -l < CALL_HIST_DETAILS.txt`
# Print the first half, then quit so sed does not read the rest of the file
sed -n '1,4934001{p;4934001q;}' CALL_HIST_DETAILS.txt > test.txt
# $ addresses the last line, so the precomputed END is not needed here
sed -n '4934002,$p' CALL_HIST_DETAILS.txt > test1.txt
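On a file this size the 4934001q matters: without it, the first sed command keeps scanning the remaining ~4.9 million lines after test.txt is already complete.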
You are the creator of your destiny - Swami Vivekananda
I am only guessing here, but a few things to check:
Sequential_File_0,0: Consumed more than 100,000 bytes looking for record delimiter; aborting
Sequential_File_0,0: Import error at record 0

What is the first record in your head file? Does it have a column header? Does it have more than 100,000 characters? What record delimiter have you specified on the Format tab of the Sequential File stage? And do you have a newline character in the first record?

Remove the first line and run again.
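If the first line does turn out to be a header or an unterminated record, one way to drop it before the import (test_fixed.txt is just an illustrative name):

Code: Select all
# Copy everything from line 2 onward, leaving out the suspect first line
tail -n +2 test.txt > test_fixed.txt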
Thanks,
Prasanna