Short Read in Seq File

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Short Read in Seq File

Post by hamzaqk »

Hi all, I am trying to read a fixed-width seq file but I am getting a message which says "short read encountered". I know one of the causes can be the file not actually being fixed width, but I don't think that is the case here, as I have managed to read the same file in the Server edition without any trouble.

I am setting the fixed-width property in each column definition equal to the width of that field in the file definition.
File Definition:
 #  field_name                field_length  field_type
 1  account_id                12            char
 2  account_status_cd         1             char
 3  credit_card_type_cd       1             char
 4  Cc_holder_nm              45            char
 5  Cc_holder_ssn             9             char
 6  Language_preference cd    1             integer
 7  Work phone                10            char
 8  Home phone                10            char
 9  street address            50            char
10  city nm                   25            char
11  state cd                  2             char
12  zip cd                    5             integer
13  Balance amt               9             integer
14  open dt                   8             char
15  expire dt                 8             char
16  credit limit amt          9             integer
17  last payment dt           8             char
18  last--payment amt         9             integer
19  annual--percentage rate   6             integer
20  annual fee amt            7             integer
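
As a quick cross-check of the definition, the field lengths listed above can be summed to get the record length that every line of the file would have to match. The sketch below is only an illustration in Python; the names `FIELD_WIDTHS`, `field_offsets` and `RECORD_LENGTH` are made up for this example and are not part of DataStage.

```python
# Field lengths copied, in order, from the file definition above.
FIELD_WIDTHS = [12, 1, 1, 45, 9, 1, 10, 10, 50, 25, 2, 5, 9, 8, 8, 9, 8, 9, 6, 7]


def field_offsets(widths):
    """Return a (start, end) character slice for each field, in order."""
    offsets = []
    start = 0
    for width in widths:
        offsets.append((start, start + width))
        start += width
    return offsets


# Total number of characters a record must contain, excluding any line terminator.
RECORD_LENGTH = sum(FIELD_WIDTHS)

print(RECORD_LENGTH)                   # 235
print(field_offsets(FIELD_WIDTHS)[3])  # (14, 59), the slice for Cc_holder_nm
```

If the lines in the file are longer or shorter than this total, the column definitions and the data do not agree.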

Dummy Data:
699991267250AV Ivan Ge388425101143953563022008435087 41 Eighth Avenue AlexandriaVA10020 01999010720040106 3600019990401 12 12 20 14 847110
863449393669CV Buren Euarchukiati683547240232312662436718056826 995 Second Street ClevelandOH10016 01999050520040423 4400019990502 48 20 20 163505100
743059146348AM Alcus Draband523371327124460956331357733655 771 Varick Street ArlingtonVA10032 01999011620040304 1000019990203 61 13 20 151937000
788397786674AM Manuel Bukhina582202775312333733057697443492 253 Ninth Avenue BerkeleyCA10024 01999052920040530 2000019990102 108 12 20 122533100
572236865104AV Aboelray Esaco300768423161248085726882073687 288 Seventh Avenue ArlingtonTX10012 01999010920040306 2200019990203 101 14 20 112947100
108512631528AM Krysta Ilabouni275469519168968805341678411702 629 Third Avenue ChesapeakeVA10014 01999042720040213 4900019990510 173 13 20 183021100


cheerz !
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

cc_holder_nm is definitely short. Get better data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: Short Read in Seq File

Post by chulett »

Forum software removes all of the 'extra' spaces from posts unless you wrap things in code tags, so your 'dummy' data doesn't even look close to fixed width data unless you do what Ray must have and 'Reply with Quote' to see the raw data.

Code: Select all

699991267250AV                                      Ivan Ge388425101143953563022008435087                                 41 Eighth Avenue                AlexandriaVA10020        01999010720040106    3600019990401       12    12     20     14 847110
863449393669CV                           Buren Euarchukiati683547240232312662436718056826                                995 Second Street                 ClevelandOH10016        01999050520040423    4400019990502       48    20     20     163505100
743059146348AM                                Alcus Draband523371327124460956331357733655                                771 Varick Street                 ArlingtonVA10032        01999011620040304    1000019990203       61    13     20     151937000
788397786674AM                               Manuel Bukhina582202775312333733057697443492                                 253 Ninth Avenue                  BerkeleyCA10024        01999052920040530    2000019990102      108    12     20     122533100
572236865104AV                               Aboelray Esaco300768423161248085726882073687                               288 Seventh Avenue                 ArlingtonTX10012        01999010920040306    2200019990203      101    14     20     112947100
108512631528AM                              Krysta Ilabouni275469519168968805341678411702                                 629 Third Avenue                ChesapeakeVA10014        01999042720040213    4900019990510      173    13     20     183021100
Each of the columns should line up exactly with each other and you can see your file as shown here doesn't pass that simple sniff test. How about posting some real data?
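
The same sniff test can be done mechanically by checking that every record has the same length. Here is a minimal sketch in Python, assuming one record per line and a single-byte character set (so character count equals byte count); `find_bad_records` and the sample records are hypothetical and only there to show the idea.

```python
def find_bad_records(lines, expected_length):
    """Return (line_number, actual_length) for each record whose length differs."""
    bad = []
    for line_number, line in enumerate(lines, start=1):
        # Drop only the line terminator; padding spaces are part of the record.
        record = line.rstrip("\r\n")
        if len(record) != expected_length:
            bad.append((line_number, len(record)))
    return bad


# Hypothetical 10-character records: the second one has lost its padding.
sample = ["AAAAAAAAAA\n", "BBBBBBB\n", "CCCCCCCCCC\n"]
print(find_bad_records(sample, 10))  # [(2, 7)]
```

Against a real file, the lines could come from `open(path, newline="")`, which leaves the line endings untouched, with `expected_length` set to the sum of the field lengths in the file definition.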
-craig

"You can never have too many knives" -- Logan Nine Fingers
hamzaqk
Participant
Posts: 249
Joined: Tue Apr 17, 2007 5:50 am
Location: islamabad

Post by hamzaqk »

OK, I did not know that! Pasting the dummy data again in code tags as you mentioned. As I said before, there is no problem with the data. It is fixed width, and I do not have a problem when I use the seq file in the Server edition with the fixed-width format property.

Code: Select all

208924886190AM                                Aizhu Khaalid945960253193111449703997237086                                  66 Essex Street                   AbileneTX10010     29581999040220040102    1100020010225       61    19     20     14 500010
684510097979AV                                      Alan Iv670919526113420077693060404110                                     676 Avenue C                   BuffaloNY10026     15611999042920040425    4600020010419       23    13     20     15 407100
443352178984AM                                      Jim Gri261761996139202135052320007741                                     676 Avenue C                   BuffaloNY10026     24231999041520040123    4000020010404       46    14     20     122693110
649358006797AV                               Carolyn Harper235299847170053710987022137447                                 651 Tenth Avenue                 ClevelandOH10023     28981999050720040119    3200020010224       65    16     20     182906100
656483892550AV                                Boersma Etand729618574171058192561481388597                               85 Eleventh Avenue                BridgeportCT10036      2781999051120040610     600020010501       75    12     20     142577010
534474066690AV                                    Gero Kaka878581281123257426158780314115                                755 Second Avenue                BirminghamAL10012     39041999012820040202     300020010620       64    12     20     172343100
934349443220AM                                    Alicia Ho470892629229703976313918162114                                277 Second Street                ChesapeakeVA10035     35341999052120040218    3100020010106       70    20     20     141826110
169334758342CV                                    Alicia Ho470892629229703976313918162114                                277 Second Street                ChesapeakeVA10035      1401999062520040518    2200020010422       50    14     20     19 757100
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Yeah? Well THOSE data (that you posted) certainly aren't fixed width.