Sequential file

Posted: Tue Dec 30, 2008 11:11 am
by kittu.raja
Hi,

I have a flat file with 2 columns. It looks like this:
Col1 Col2
1 adam
1 adam
2 adam
3 michael
3 michael

I want to find the count of unique values in Col1.

Can anybody help me do that?

Thanks,

Posted: Tue Dec 30, 2008 11:39 am
by Nagaraj
You can use a Unix command something like this (since the file is on Unix):

nawk -F'|' '!x[$1]++' chck.txt |wc -l

I hope this is what you are looking for.
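
For readers unfamiliar with the idiom: `!x[$1]++` is true only the first time a given value of the first field appears, so awk prints one line per distinct value and `wc -l` counts those lines. Note that `-F'|'` assumes a pipe-delimited file. The sketch below instead assumes the file is whitespace-delimited with a one-line header, as in the sample in the first post. The function name `distinct_col1` is made up for illustration, and plain `awk` stands in for `nawk`.

```shell
#!/bin/sh
# Hypothetical helper: count distinct values in the first field of file $1.
# Assumes whitespace-delimited fields (awk's default) and one header line,
# which NR > 1 skips. tr strips the padding some wc versions add.
distinct_col1() {
    awk 'NR > 1 && !x[$1]++' "$1" | wc -l | tr -d ' '
}

# Sample data from the first post, written to a temporary file.
sample=$(mktemp)
printf 'Col1 Col2\n1 adam\n1 adam\n2 adam\n3 michael\n3 michael\n' > "$sample"

distinct_col1 "$sample"   # prints 3
rm -f "$sample"
```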

Posted: Tue Dec 30, 2008 12:00 pm
by metadata1
Within a job you could use an Aggregator stage to group and count records. Have you thought about using that option?

I'm not sure what your exact requirements are.

Posted: Tue Dec 30, 2008 12:22 pm
by nsm
Simply use: sort -n -u test | wc -l
and subtract 1 from the result, since the first line of your file holds the column names.
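
The command above sorts whole lines. A variant that extracts the column first makes it explicit which column is being counted. This is only a sketch, assuming a whitespace-delimited file with one header line. The function name `distinct_by_sort` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: distinct count of one column using sort -u.
# $1 = file, $2 = 1-based column number.
# Steps: drop the header line, keep only the chosen column,
# reduce to one line per distinct value, then count the lines.
distinct_by_sort() {
    tail -n +2 "$1" |
        awk -v c="$2" '{ print $c }' |
        sort -u |
        wc -l | tr -d ' '
}
```

With the sample data from the first post, `distinct_by_sort file 1` gives 3 and `distinct_by_sort file 2` gives 2. Because the header is dropped up front, no subtraction is needed afterwards.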

Posted: Tue Dec 30, 2008 12:49 pm
by kittu.raja
metadata1 wrote:Within a job you could use an Aggregator stage to group and count records. Have you thought about using that option?

I'm not sure what your exact requirements are.
I have used it, but I am getting counts for each group. I want the distinct count of the second column.
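
The per-group counts and the distinct count are closely related: the distinct count is simply the number of groups. The sketch below shows that relationship in awk rather than as Aggregator stage configuration. It assumes a whitespace-delimited file with one header line and groups on the second column. The function name `group_and_distinct` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for column 2 of file $1, print each group's row count,
# then the number of groups (which is the distinct count).
group_and_distinct() {
    awk 'NR > 1 { rows[$2]++ }
         END {
             n = 0
             for (k in rows) { print k, rows[k]; n++ }
             print "distinct", n
         }' "$1"
}
```

For the sample data the groups are adam (3 rows) and michael (2 rows), so the final line reads `distinct 2`. The group lines may come out in any order, since awk does not define the iteration order of `for (k in rows)`.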

Posted: Tue Dec 30, 2008 12:52 pm
by kittu.raja
nsm wrote:Simply use: sort -n -u test | wc -l
and subtract 1 from the result, since the first line of your file holds the column names.
I want only the distinct count of the second column. Where are you specifying the second column?

Posted: Tue Dec 30, 2008 12:53 pm
by kittu.raja
Nagaraj wrote:You can use a Unix command something like this (since the file is on Unix):

nawk -F'|' '!x[$1]++' chck.txt |wc -l

I hope this is what you are looking for.
Where are you specifying the column name?

Posted: Tue Dec 30, 2008 6:28 pm
by Nagaraj
$1 is the first field, $2 is the second, and so on.
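
So switching columns only means changing the field number. A minimal sketch that takes the column as a parameter, assuming a whitespace-delimited file with one header line. The names `distinct_count`, `col`, and `seen` are illustrative and do not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper: $1 = file, $2 = 1-based column number.
# Skips the header (NR > 1), counts each value of the chosen field
# the first time it is seen, and prints 0 if there are no data rows.
distinct_count() {
    awk -v col="$2" 'NR > 1 && !seen[$col]++ { n++ } END { print n + 0 }' "$1"
}
```

With the sample data from the first post, `distinct_count file 2` prints 2 (adam, michael) and `distinct_count file 1` prints 3.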

Posted: Tue Dec 30, 2008 10:01 pm
by dr.murthy
kittu.raja wrote:I have used it, but I am getting counts for each group. I want the distinct count of the second column.

Tell me what your output should look like. Do you need the distinct count of the second column or the first column?

Posted: Tue Dec 30, 2008 10:03 pm
by Nagaraj
The output is just a number. Why don't you try the command I gave you in Unix?

Posted: Wed Dec 31, 2008 12:03 am
by kishore2456
You can use an Aggregator: just aggregate and count on the same column (either the first or the second, whichever you want).

Posted: Wed Dec 31, 2008 6:59 am
by Nagaraj
kishore2456 wrote:You can use an Aggregator: just aggregate and count on the same column (either the first or the second, whichever you want).
I think he is right; you can do it this way as well.