Getting the warning and dropping those records

Sridhar Sivakoti
Participant
Posts: 35
Joined: Tue Feb 13, 2007 5:30 am

Getting the warning and dropping those records

Post by Sridhar Sivakoti »

Hi,

We have a job that reads data from a .CSV file (containing 30 million records) through a Sequential File stage. When we run this job, we get the warning below for some records, and those records are dropped.

seqSVCRegCustomers,0: Field "AFFIL_EMAIL_CNTCT_IND" with 'delim=end' did not consume entire input, at offset: 162
seqSVCRegCustomers,0: Import warning at record 732966.
seqSVCRegCustomers,0: Import unsuccessful at record 732966.


I want to eliminate these warnings and stop the records from being dropped.

Please let me know how we can do this.

Thanks
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Your data and metadata need to match. You have rows which contain more data than your metadata allows for.
Sridhar Sivakoti
Participant
Posts: 35
Joined: Tue Feb 13, 2007 5:30 am

Post by Sridhar Sivakoti »

Thanks ArndW for your response.

The number of columns in my metadata exactly matches the number of columns in the file data.

Please let me know whether I am getting this warning because of bad data in the file.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Your data is different in row 732966. I suggest you try

"head -732967 {SourceFile} | tail -3" to see one line before and one line after the incorrect one. The line length is short enough so that you should be able to detect the anomaly.
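
If head and tail are not available, the same check can be sketched in Python. This is only an illustration: lines_around is a hypothetical helper name, and it assumes one record per physical line with 1-based numbering, just as the head/tail command does. If the record number in the log turns out to be counted differently, widen the context argument.

```python
from itertools import islice


def lines_around(path, record_no, context=1):
    """Return (line_no, text) pairs for record_no plus `context` lines on either side.

    Assumes one record per physical line and 1-based line numbers.
    """
    first = max(1, record_no - context)
    last = record_no + context
    # newline="" keeps the original line endings so they are not silently translated
    with open(path, newline="") as f:
        window = islice(f, first - 1, last)
        return [(first + i, line.rstrip("\r\n")) for i, line in enumerate(window)]


def show(path, record_no, context=1):
    """Print the selected lines with repr() so stray quotes or control characters stand out."""
    for line_no, text in lines_around(path, record_no, context):
        print(line_no, repr(text))
```

Calling show(source_file, 732966) on the source file would print the suspect line together with the line before and the line after it.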
Sridhar Sivakoti
Participant
Posts: 35
Joined: Tue Feb 13, 2007 5:30 am

Post by Sridhar Sivakoti »

Hi ArndW,

I do not see any difference. Please see the data below.
"{CCF998D0-F60C-4211-8E4A-07C420BD64C1}",1,2006-11-09 13:45:31,"erinerinmc@hotmail.com","Erin McElroy","329 Spruce Street 1B","","Philadelphia","PA","19106","US"," "
,"1","1"," "
"{C300763E-EF7C-46F4-9A08-20989958AB67}",1,2005-01-05 06:44:10,"chiaradonna19@hotmail.com","Lauren Manze","631 Maryland Ave.","#5","Pittsburgh","PA","15232","US"," "
,"1"," "," "
"{98F48041-F9FA-4057-9763-E4390727FF13}",2,2005-01-11 09:24:39.780000000,"","Diane McKenzie","""Little Wood"" RR#1","","Hillsburgh","ON","N0B1Z0","CA"," "," ","1","
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

a) could you randomize that data a bit by editing your post and making the names and email addresses illegible?
b) did you cut off the last double quote by mistake?
c) which column is "AFFIL_EMAIL_CNTCT_IND"?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Search the forum for discussion about parallel jobs not being able to handle the conventional "" meaning a single double-quote character within a double-quote quoted string. The string "Little Wood" in your data fits into this category.
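
To find every record affected by this before running the job, a small script outside DataStage can help. The following is a rough sketch, not a DataStage feature: find_embedded_quotes is a hypothetical helper name, and it relies on Python's standard csv module, which does understand the "" convention. It reports each field whose parsed value still contains a double-quote character. The sample rows are made up for illustration.

```python
import csv


def find_embedded_quotes(lines):
    """Yield (record_no, field_no, value) for every field that contains a literal
    double quote once the conventional "" escaping has been decoded.

    `lines` is any iterable of text lines, e.g. a file opened with newline="".
    Record and field numbers are 1-based.
    """
    reader = csv.reader(lines, delimiter=",", quotechar='"', doublequote=True)
    for record_no, fields in enumerate(reader, start=1):
        for field_no, value in enumerate(fields, start=1):
            if '"' in value:
                yield record_no, field_no, value


# Made-up sample rows; the second one uses "" inside a quoted string
sample = [
    '"id-1",1,"329 Example Street","PA"\n',
    '"id-2",2,"""Little Wood"" RR#1","ON"\n',
]

for hit in find_embedded_quotes(sample):
    print(hit)
```

For the real file, pass an open file object instead of the sample list, for example with open(path, newline="") as f. How the flagged rows should then be repaired (changing the embedded quotes to another character, or fixing the extract at the source) depends on what the downstream columns need.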