Overriding Collate Conventions
Posted: Thu Nov 20, 2008 1:23 pm
I have a problem with a question mark (?) not sorting the way I would like. I would like to fix this by using a custom collation file.
So I made a test job:
Row Generator -> Sort stage (descending by sort_key) -> Data Set
sort_key values:
A, ?, ' ',4, 2
I set every stage to run sequentially.
I edit a file /directory/collate_test to contain the following:
&a<4<?<' '
I also make its inverse, in collate_test1, just for fun:
&a>4>?>' '
I run the job with our project default collating map to establish a baseline:
output is
a,?,4,2,' '
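For comparison, here is a rough Python sketch of the descending order I would expect the first rule file to produce. It only emulates the ordering I intend; it does not parse collation rule syntax or call DataStage. The helper name make_sort_key, the lowercasing of "A", and the placement of the unlisted "2" ahead of the listed characters are all my own assumptions.

```python
# Hypothetical helper: emulates the intended ordering only.
# It is not the tsort operator and does not read the collation file.
def make_sort_key(sequence):
    # Position of each listed character in the custom sequence, lowest first.
    rank = {ch: i for i, ch in enumerate(sequence)}

    def key(value):
        # Assumption: characters not listed in the sequence sort before the
        # listed ones, by code point.
        return [(1, rank[ch]) if ch in rank else (0, ord(ch)) for ch in value]

    return key


values = ["A", "?", " ", "4", "2"]
# Order I intend &a<4<?<' ' to express, lowest first.
intended = ["a", "4", "?", " "]

key = make_sort_key(intended)
# Assumption: case is folded before comparison.
result = sorted((v.lower() for v in values), key=key, reverse=True)
print(result)  # [' ', '?', '4', 'a', '2']
```

Under those assumptions the descending output would be ' ', ?, 4, a, 2, which is not what the job returns.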
I run the job with each of the two override files and get the same results as the baseline. I dumped the score, and the sort operator shows the file name, but I can't see the actual contents of my collation file anywhere, translated or otherwise.
I also tried a deliberately malformed override file; it produced no errors and nothing changed in the OSH.
Can anyone help shed some light on how to use this feature, or point me to a PTR, an environment variable, a developerWorks article, anything?
Thanks.
Bryan.
Dump of the sort operator:
text="\n\ntsort\n-stable\n-stats\n-key 'sort_key'\n-desc\n-collation_sequence '\\/directory\\/collate_test'\n\n[ident('Sort_3'); jobmon_ident('Sort_3'); seq]\n0< 'Row_Generator_0:DSLink2.v'\n0> [modify(keep sort_key;)] 'Sort_3:DSLink5.v'",
line=12, column=1, name=tsort, qualname=Sort_3,
wrapout={},
wrapperfile=tsort, kind=non_wrapper_cdi_op, exec_mode=seq,
args="'-stats'-key'sort_key'-desc'-collation_sequence'/directory/collate_test'",
input={ text="\n0< 'Row_Generator_0:DSLink2.v'", line=20, column=1,
name="", qualname="Sort_3[i0]",
data="Row_Generator_0:DSLink2.v"