Can SSELECT command be used in the "Hashed File" S

olgc · Post by **olgc** » Thu Dec 16, 2004 3:29 pm

Hi there,

Can a hashed file be read from in the sorted (key) order and How (Not in BASIC CODE, but from the visual stage "Hashed File")?

Or can SSELECT command be used in the "Hashed File" Stage to read contents from it in the sorted order?

Thanks,

mhester · Post by **mhester** » Thu Dec 16, 2004 4:27 pm

The Hash stage issues nothing more than a SELECT against the file. You can add either BY {fieldname} to sort in ascending order or BY.DSND {fieldname} to sort in descending order.

fieldname can be any valid field that exists in the file and if you want to sort by the key then you can use @ID or the actual field name if it exists in the dictionary of the file.

Regards,

olgc · Post by **olgc** » Thu Dec 16, 2004 4:45 pm

Thanks, I do not understand "You can add either BY {fieldname} to sort in ascending order or BY.DSND {fieldname} to sort in descending order. "

Add "BY {fieldname}" to where? To the "Hashed File" stage, or in the BASIC CODE? Is it possible to use the "Hashed File" stage to create a hashed file. When using "Hashed FIle" stage read from the file, it will be in the order of the key.

Regards,

mhester · Post by **mhester** » Thu Dec 16, 2004 4:52 pm

My apologies for not clearly describing what I meant. If you are reading from a Hash (not a reference link) and you choose the Selection tab on the Output page you will notice that DS already supplies the base select statement as -

SELECT {FILENAME} WITH: - {FILENAME} will reflect what you enter as a filename.

You now have the opportunity to add to this statement and one possible way would be to select the file by ascending or descending key and it would look something like -

@ID BY @ID - sort ascending

OR

@ID BY.DSND @ID - sort descending

@ID exists in every Hash file created so this is safe if you want to sort on the key of the file.

I hope this clarifies what I had meant.

Regards,

kcbland · Post by **kcbland** » Thu Dec 16, 2004 5:39 pm

The SELECT statement is not available on externally pathed hash files, you must use a project-based hash file. So, externally pathed hash files have to have a VOC entry in the project (search forum for SETFILE), and you no longer use the fully qualified path in the front tab of the hash file stage, use the project/account radio button and leave the field blank (defaults to the project the job is running in). The hash file can exist outside the project directory but requires the VOC entry to use the ordering feature Mike has demonstrated.

ray.wurlod · Post by **ray.wurlod** » Thu Dec 16, 2004 5:58 pm

If the hashed file stage has been created in an account, or if you have created a pointer to it in the VOC (using SETFILE perhaps), then specify "in account" in the Hashed File stage.

This will enable the Selection tab.

The first thing needed there is something to go in the WITH phrase (for example 1 = 1). You can then follow it with your BY phrase. For example, the Selection tab may contain

Code: Select all

1 = 1 BY @ID

You may find it easier (more intuitive) to use a UV to select rows from the hashed file. In this case, the ORDER BY clause of SQL (on the Selection tab) specifies sorting.

olgc · Post by **olgc** » Tue Dec 21, 2004 9:51 am

I try all above suggestions as the following, but not one works:

First:

SELECT H_RetDevLoc WITH
1 = 1 BY @ID

Error: LoadLotWagerCancelTActivity..H_cLookup.DSLink21: DSD.UVOpen Retrieve: syntax error. Unexpected explicit item id. Token was "1".
Scanned command was SELECT H_RetDevLoc WITH '1'
.

Second:

SELECT H_RetDevLoc WITH
BY @ID

Error:

LoadLotWagerCancelTActivity..H_cLookup.DSLink21: DSD.UVOpen Retrieve: syntax error. Unexpected symbol. Token was "BY".
Scanned command was SELECT H_RetDevLoc WITH BY
.

Third:

SELECT H_RetDevLoc WITH
@ID BY @ID

The output file is not sorted by the key.

Any idea?

Thanks,

mhester · Post by **mhester** » Tue Dec 21, 2004 9:55 am

SELECT {filename} WITH @ID BY @ID does sort the incoming recordset, but of course the output file will not be sorted if you view the data because that view of the world is not sorted!. All the SELECT accomplished was to sort the incoming recordset by ascending key.

What is your target file type?

Regards,

olgc · Post by **olgc** » Tue Dec 21, 2004 10:04 am

Hi Michael,

The target file is a sequence file. I run the job, and view from the file. It is not sorted.

All three are run from Director and get almost same error messages as I posted.

I try

@ID = @ID BY @ID.

too. But it is the same: not sorted in output file.

Thanks,

kcbland · Post by **kcbland** » Tue Dec 21, 2004 10:12 am

Make sure the hash file has been created using the EXACT metadata of the job. If your hash file has a "dictionary" listing other metadata as the primary key because it was created once previously and not removed and recreate under the newer, correct metadata, you will not get the expected results.

In addition, if the dictionary has a column listed as one datatype versus the actual, you could end up doing a left-justified sort rather than a right-justified sort, which causes numbers to not sort correctly.

olgc · Post by **olgc** » Tue Dec 21, 2004 12:48 pm

Yeah, I save the metadata and load it into in the same Hashed file stage. There are only two columns in the file. There is no chance to make mistake. There is no "dictionary" listing anyplace in the stage.

Thanks,

kcbland · Post by **kcbland** » Tue Dec 21, 2004 12:59 pm

The dictionary is a "D_yourhashfilename" file created when a hash file is created. It is what the SELECT command uses to parse the hash file. So, if the hash file has a bad dictionary, it's not used by the hash file stage. But, when you use the SELECT feature with clauses, it DOES use the dictionary, not the job metadata. If you only have 2 columns the odds that this is the problem go way down.

You're also probably better off writing the hash file as-is to a sequential file and then sorting it using the Unix sort statement. It will be much faster.

mhester · Post by **mhester** » Tue Dec 21, 2004 1:06 pm

I'm not sure why yours is giving you grief, but I created a job with the source being a hash table, one transformer and a sequential output. I added the following to the selection tab -

Code: Select all

@ID BY @ID

So essentially my select statement is -

Code: Select all

SELECT FILE WITH @ID BY @ID

after I run this and view the results I can verify that my sequential file is sorted. I can also view the input hash and see that it too is sorted.

I agree with what Ken outlined, but believe that, unless you deleted the dictionary, that the dict item @ID would be present and is always created by default.

Check to make sure that on your input column page that the correct position and type are entered for the column. If you are using a key column then its position should be 0 and type should be S. You can either enter this manually if it is not there or reimport the metadata after a validation of the job that creates the hash and reload the link.

Regards,

dsrules · Post by **dsrules** » Tue Dec 21, 2004 1:43 pm

Just browsing through the forum, I tried what u said, works perfectly fine. Thanks!!!

mhester · Post by **mhester** » Tue Dec 21, 2004 1:49 pm

Would you please elaborate on what "worked fine"? There were a couple of different solutions.

Regards,