Header Record Validation in a fixed width file
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 12
- Joined: Fri Apr 12, 2013 7:12 am
Header Record Validation in a fixed width file
Hi
I have a fixed width file with header record and detail record. Header record layout is different from that of detail record.
For example, I have the data:
00Employee_Details2013-05-242 -------------->HeaderRecord
012012-01-01Ram29842013-05-24 23:23:45 --------------->Detail Records
012012-02-02Sita39842013-13-24 23:23:45
The header record has the layout as:
Headerrecord_type file_description Date_created record_count
The Detail record has the layout as:
Detailrecord_type join_date Name Emp_Id RecordCreate_date
I have to check if the header "Date_created" is equal to the detail "Record Create date" . Any idea on how to store the header record separately for processing?
Thanks
I have a fixed width file with header record and detail record. Header record layout is different from that of detail record.
For example, I have the data:
00Employee_Details2013-05-242 -------------->HeaderRecord
012012-01-01Ram29842013-05-24 23:23:45 --------------->Detail Records
012012-02-02Sita39842013-13-24 23:23:45
The header record has the layout as:
Headerrecord_type file_description Date_created record_count
The Detail record has the layout as:
Detailrecord_type join_date Name Emp_Id RecordCreate_date
I have to check if the header "Date_created" is equal to the detail "Record Create date" . Any idea on how to store the header record separately for processing?
Thanks
-
- Premium Member
- Posts: 536
- Joined: Thu Oct 11, 2007 1:48 am
- Location: Bangalore
Hi,
Before processing a file,design file validation job.Use two external source stages,your job design will be like this:-
External source1---->Col Gen dummy-->Join On dummy----Transf---Copy
External Source2---->Col Gen Dummy--->
In External Source1 Stage Write below unix command:-
In External Source2 Stage write below unix command:-
Inner join on dummy column.In transformer validate external src1 and external src2 output,if it is matching your file validation is successful otherwise failed.
Before processing a file,design file validation job.Use two external source stages,your job design will be like this:-
External source1---->Col Gen dummy-->Join On dummy----Transf---Copy
External Source2---->Col Gen Dummy--->
In External Source1 Stage Write below unix command:-
Code: Select all
head -1 File Name|cut -c19-27
Code: Select all
head -2 File Name|tail -1|cut -c20-28
Inner join on dummy column.In transformer validate external src1 and external src2 output,if it is matching your file validation is successful otherwise failed.
Thanks
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 536
- Joined: Thu Oct 11, 2007 1:48 am
- Location: Bangalore
Hi Ray,
I saw they are running datastage on unix platform,thats why i suggested.
I saw they are running datastage on unix platform,thats why i suggested.
Thanks
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
Why don't we start by suggesting DataStage solutions and only fall back to the O/S heavy stuff if there's no other choice or the "in job" solution is... unwieldy.prasson_ibm wrote:I saw they are running datastage on unix platform,thats why i suggested.
Last edited by chulett on Tue May 28, 2013 6:52 am, edited 1 time in total.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
Several ways to approach this.
Read the file's record as one long string and check the first two bytes for either "00" or "01" to determine header v. detail... or if there's only one header record per file you could use record #1 v. all other records. Store the substring'd header value in a stage variable and compare it to the detail records. Pass out any detail records where it doesn't match. Target records > 0 = bad file.
Read the file's record as one long string and check the first two bytes for either "00" or "01" to determine header v. detail... or if there's only one header record per file you could use record #1 v. all other records. Store the substring'd header value in a stage variable and compare it to the detail records. Pass out any detail records where it doesn't match. Target records > 0 = bad file.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Premium Member
- Posts: 536
- Joined: Thu Oct 11, 2007 1:48 am
- Location: Bangalore
Hi Chullet,
Sure,you are saying is logical,thanks for your suggestion.
Sure,you are saying is logical,thanks for your suggestion.
Thanks
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
Prasoon
ETL Consultant
LinkedIn :- http://www.linkedin.com/profile/view?id ... ab_pro_top
Blog:- http://dsshar.blogspot.com/
I suggest using Complex Flat File for the initial input. The first two bytes look to be the record type, and you can set up CFF with a separate output link for each type. You can then attempt a simple Join and have it abort on the date not matching. The advantage there is if you have some details with the correct date and you want the job to process them, you also have that option.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
For this scenario, below job design works fine. the properties are in verison 8.5.
seq file stage1---->Col Gen dummy-->Join On dummy(inner join) ---->Transf---->(seq file output stage1,seq file output stage2)
seq file stage2---->Col Gen Dummy--->
Seq file stage1 : to read the detail records ( specify detail record layout)
Read all detail records except the header by using SEQ file stage ( set property: first line in column names= True)
Seq file stage2 : to read the header record ( specify header record layout)
Read only the first line by using SEQ file stage ( set property: read first rows =1 and first line in column names= False)
In both the flows , in column generator ,create a new column called dummy and set the properties to Type= cycle, initial value = 1 increment =0
In transformer create a stage var with a derivation to compare the date fields.
stagevardate: IF ( Lnk_Match_date.Date_created = Lnk_Match_date.Record Create date ) THEN 'MATCH' ELSE 'NOMATCH'
set the constraint in the transformer to get the stagevardate = "MATCH"
and another link constraint to stagevardate = "NOMATCH"
to get matching records in 1 file and not matching records to the another sequential files ouput
seq file stage1---->Col Gen dummy-->Join On dummy(inner join) ---->Transf---->(seq file output stage1,seq file output stage2)
seq file stage2---->Col Gen Dummy--->
Seq file stage1 : to read the detail records ( specify detail record layout)
Read all detail records except the header by using SEQ file stage ( set property: first line in column names= True)
Seq file stage2 : to read the header record ( specify header record layout)
Read only the first line by using SEQ file stage ( set property: read first rows =1 and first line in column names= False)
In both the flows , in column generator ,create a new column called dummy and set the properties to Type= cycle, initial value = 1 increment =0
In transformer create a stage var with a derivation to compare the date fields.
stagevardate: IF ( Lnk_Match_date.Date_created = Lnk_Match_date.Record Create date ) THEN 'MATCH' ELSE 'NOMATCH'
set the constraint in the transformer to get the stagevardate = "MATCH"
and another link constraint to stagevardate = "NOMATCH"
to get matching records in 1 file and not matching records to the another sequential files ouput
-
- Participant
- Posts: 12
- Joined: Fri Apr 12, 2013 7:12 am