
Hash File passing "?" as values

Posted: Mon Dec 04, 2006 11:55 am
by paddu
Hi All

Can anyone explain why a Hash file accepts NULL as a valid value in a key field when populating it, yet passes "?" for that field when the data is read back from the Hash file?

I have two jobs designed as below.

1) Reading from a Sequential File stage (with no keys) and populating a Hash file (which has keys).


2) Reading from the Hash file and creating a flat file (the problem is in job 2, which passes "?").

I know I need to change the logic in the design to handle NULLs before sending data to the Hash file, but...

Please let me know why the Hash file does this, and help me understand.


Any ideas appreciated

Thanks
Paddu

Posted: Mon Dec 04, 2006 4:20 pm
by ray.wurlod
A hashed file (note: it's not "hash file") will never accept NULL as its key value. It will, however, accept "" (a zero length string). Of course, since keys must be unique, there can only be one record with this key value in the hashed file; multiple writes will overwrite.
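As an aside, here is a rough sketch of that rule in plain Python (an analogy only, not DataStage or UniVerse code): a hashed-file-style store refuses a NULL key, accepts a zero-length string as a key, and silently overwrites on a duplicate key.

    # Illustrative model of the key behaviour described above (not how the
    # DataStage engine is actually implemented).
    class HashedFileLike:
        def __init__(self):
            self._records = {}

        def write(self, key, record):
            if key is None:                     # NULL key is never accepted
                raise ValueError("NULL is not a valid key value")
            self._records[key] = record         # "" is accepted; a duplicate key overwrites

        def read(self, key):
            return self._records.get(key)

    store = HashedFileLike()
    store.write("", ["first"])     # zero-length string key is allowed
    store.write("", ["second"])    # same key again: the first record is overwritten
    print(store.read(""))          # -> ['second']
    # store.write(None, ["x"])     # would raise ValueError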

"?" may mean that NLS has detected a character that it can not map, or there may be something in your job design that generates the "?" characters. You have not provided sufficient information to enable accurate diagnosis.

It may also be that the "?" characters are an artifact of the tool that you are using to view the data. How are you viewing the data in the hashed file?
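As a loose analogy for the first of those possibilities (ordinary Python character-set handling, not the DataStage NLS layer), a character that has no equivalent in the target character map is commonly substituted with "?" during conversion:

    # Illustrative only: 'é' has no ASCII equivalent, so it is replaced with '?'
    # when the text is converted - the same general idea as an NLS map
    # substituting '?' for a character it cannot represent.
    text = "caf\u00e9"
    print(text.encode("ascii", errors="replace"))   # b'caf?'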

Posted: Tue Dec 05, 2006 11:50 am
by paddu
Hi Ray,

I'm extremely sorry for saying "Hash file" :oops: .

I truly did not understand the following:
"It may also be that the "?" characters are an artifact of the tool that you are using to view the data. How are you viewing the data in the hashed file? "

We have NLS MS1252 as the project default.

Please let me know what kind of information to provide for an accurate diagnosis.

Thanks
Paddu

Posted: Tue Dec 05, 2006 12:09 pm
by jdmiceli
Hi Paddu,

To add to what Ray is saying (possibly clarification, depending on whether I'm understanding correctly or not), you may need to look at how the database engine you are using treats a question mark.

In DB2 UDB, the question mark can denote a placeholder for a value being passed into a query, so I'm wondering if DataStage won't allow this because it has an intended purpose elsewhere.

Since you aren't dealing with a database engine directly, according to your description of your jobs, is it possible the question mark has a meaning to the OS that DataStage is trying to interpret?

Bear in mind I'm kind of thinking out loud here to spark any thoughts in your mind or someone else's.

I'm tending to lean toward DataStage thinking this is a placeholder, but I am not an expert and I've never played one on TV. :lol:

Bestest!

Posted: Tue Dec 05, 2006 2:13 pm
by ArndW
A "?" in view data from the designer can denote a character that can't be displayed. Where are you seeing this question mark?

Posted: Tue Dec 05, 2006 4:52 pm
by paddu
I am seeing the "?" mark in the flat file that I am populating from the hashed file, basically job 2, which I mentioned in my earlier post.

Re: Hash File passing "?" as values

Posted: Tue Dec 05, 2006 5:08 pm
by I_Server_Whale
paddu wrote:Hi All

1) Reading from a Sequential File stage (with no keys) and populating a Hash file (which has keys).

Do you have "?" as part of the data in your source sequential file?

Whale.

Posted: Tue Dec 05, 2006 5:41 pm
by paddu
No, the source file does not have "?".

Of course, the source file does have a blank record.

The fields could have a zero-length string, a space, or NULLs in them; no idea, we did not check that. Maybe it had a zero-length string, and that's why the hashed file picked up that record, as Ray mentioned.

Sorry, all, for asking why the hashed file accepts NULLs as a valid value in a key field without first analyzing what the data could be.

I know the solution for my jobs: I need to reject blank records before populating the hashed file (a rough sketch of that kind of filter is below).
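As a sketch of that pre-filter in plain Python (illustrative only; the key column positions are just assumptions, and in a Server job the same idea would sit in a Transformer constraint):

    # Illustrative pre-filter: keep only records whose key columns carry a real value.
    def has_valid_keys(record, key_positions=(0, 1)):
        for pos in key_positions:
            value = record[pos]
            if value is None or value.strip() == "":
                return False
        return True

    rows = [
        ["A001", "B1", "keep me"],
        ["", "", ""],                  # blank record that would have gone into the hashed file
        ["A002", "B2", "keep me too"],
    ]
    clean_rows = [row for row in rows if has_valid_keys(row)]
    print(clean_rows)                  # the blank record is rejected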

I hope this makes sense

Thanks
paddu

Posted: Tue Dec 05, 2006 7:22 pm
by ray.wurlod
That doesn't solve the mystery of the "?" character. Can you post a couple of source records that illustrate this issue? Also tell us your exact DataStage version. Maybe someone can try to replicate.

Posted: Tue Dec 05, 2006 9:56 pm
by paddu
Hello Ray,

The version we have is DataStage 7.5x2 on a Windows 2003 server, with Windows 2002 clients.
We designed the project with server jobs only. It seems my team had issues with parallel jobs, and they ended up using server jobs exclusively.

Unfortunately, I cannot provide a couple of source records. The file is too big to open in TextPad (30 million records). Since we have keys on the hashed file (and are capturing just 3 fields), we get filtered data, so I cannot give the exact data.


BUT I created a sample comma-delimited file with 3 columns (a single record of zero-length strings).
1) Loaded the file into a hashed file (first column as key), which definitely got the record populated (of course, with zero-length string fields).

2) Now, when I use this hashed file to create another flat file, it passes "?" in the key field.

I think anyone can try this to see if this happens in their system.

Any suggestions appreciated
Thanks
Paddu

Posted: Tue Dec 05, 2006 11:22 pm
by ray.wurlod
Are these default hashed files (dynamic)? Is there anything unusual about the output stage?

What I'm thinking of here is a behaviour of UniVerse, when constructing file names, of substituting the "?" character for "". But I've not seen this occur with data.

Posted: Wed Dec 06, 2006 11:44 am
by paddu
The hashed files are default Type 30 (dynamic).

Nothing unusual in the output stage.

Did anyone try replicating my situation in their environment?

Please let me know if you have faced a situation similar to mine, or if it's only me having this situation.

Posted: Wed Dec 06, 2006 12:42 pm
by I_Server_Whale
paddu wrote:Hello Ray,


BUT I created a sample comma-delimited file with 3 columns (a single record of zero-length strings).
Paddu
How are the 3 columns defined?

Can you provide the datatype and length that you used in the stages?

Posted: Wed Dec 06, 2006 1:51 pm
by paddu
I_Server_Whale - I just used Char, length 3, in the sample job.

Note: below is what is happening in the original job.

Ray - Below is the value I see when I open the source file. We get zipped files from the mainframe and then we unzip them. We got a small file this time. I opened it in TextPad and see only the character below at the end of the file (fixed-width file).


""
But I am surprised this record got populated in the hashed file, which basically has 3 columns (2 key fields): a completely empty record in the hashed file.


When I create a flat file out of this hashed file, it passes "?" in the two key fields.


I need to mention one more important thing.
My sample job does not produce "?" today. Yesterday we re-installed DataStage server version 7.5.2; we had to do this because of a BW plug-in incompatibility issue :( . There are a lot of moving parts going on. I am surprised that the sample job did not give any question marks when I used the same zero-length-string record today.

Hmm, anyway, my original job still has the problem, and I see that it is because of the "" at the end of the file.

Posted: Wed Dec 06, 2006 3:09 pm
by I_Server_Whale
paddu wrote: I opened it in TextPad and see only the character below at the end of the file (fixed-width file).


""
Is this character part of the last column in your fixed-width file? If so, and it is not required, then you can filter it out in the Transformer before populating the hashed file. A rough sketch of that kind of filter is below.
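As an illustration only (plain Python rather than a Transformer expression; the control character used here is only a guess at what that unprintable character might be):

    # Illustrative only: drop records that are empty or contain nothing printable
    # (e.g. a stray end-of-file marker left over from the mainframe transfer -
    # an assumption, since the actual byte in the file is unknown).
    def is_real_record(line):
        stripped = line.rstrip("\r\n")
        return any(not ch.isspace() and ch.isprintable() for ch in stripped)

    sample = ["FIELD1FIELD2FIELD3\n", "\x1a\n", "   \n"]
    kept = [line for line in sample if is_real_record(line)]
    print(kept)     # only the first, genuine record survives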

Whale.