How to count records in unix



chrisjones
Participant
Posts: 194
Joined: Thu May 11, 2006 9:42 am

How to count records in unix

Post by chrisjones »

Hi,

I have an XML file in which every record is separated by the delimiter @@#, and I want to count the number of records based on that delimiter.

Can you help me with the syntax for counting the records?
Thanks,
Chris Jones
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I don't think I've ever seen XML with delimiters like that, but regardless...

Clarify exactly what you've got. Is it one long 'record' that you need to break apart around this three-character delimiter or something else? If you do a 'wc -l' on the file, does it return a 1?

I don't recall any built-in function that can handle a delimiter like that. You may need to build something like an awk script, or a C program, to read the file and handle the records based on those delimiters. How are you parsing this now?

What's your ultimate goal with all of these 'in UNIX' questions? Perhaps if you said what you were attempting to do at a more strategic (not tactical) level you could get some better advice on how to approach it...
-craig

"You can never have too many knives" -- Logan Nine Fingers
chrisjones
Participant
Posts: 194
Joined: Thu May 11, 2006 9:42 am


Post by chrisjones »

Thanks, chulett, for your quick reply.
I am generating an XML file by combining DAT files with a Unix script. That single XML file contains multiple records separated by the delimiter @@#. I want to count the records in the XML file and compare them with the records in the flag file, to make sure that we received the correct records from the source.

Thanks,
Chris



Thanks,
Chris Jones
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Clear as mud. How does one 'generate an XML file by combining DAT files'?

Perhaps it would help if you posted examples of these oddly delimited records.
-craig

DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

A piece of advice: these questions are more appropriate for a Unix forum than a DataStage one. I am not saying we won't help you, don't get me wrong. What I am trying to say is that you will get answers more quickly there, or even find them much faster by googling.
Regards,
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
kumar_s
Charter Member
Posts: 5245
Joined: Thu Jun 16, 2005 11:00 pm

Post by kumar_s »

Why only in unix and not in datastage?
Impossible doesn't mean 'it is not possible' actually means... 'NOBODY HAS DONE IT SO FAR'
ameyvaidya
Charter Member
Posts: 166
Joined: Wed Mar 16, 2005 6:52 am
Location: Mumbai, India

Post by ameyvaidya »

With the limited detail available, this is what I'd try:

If the XML file has only one delimiter per line:
grep -c displays only a count of matching lines.
For more information, check "man grep" in Unix.
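
To make the "one delimiter per line" condition concrete: grep -c counts the lines that contain the pattern, not the occurrences of it. The sketch below uses a made-up sample file (its contents and the temporary file are assumptions, not the OP's data) and compares grep -c with grep -o, an option found in GNU and BSD grep that may be missing on older Unix systems.

```shell
# Hypothetical sample: three delimiters on the first line, one on the second.
sample=$(mktemp)
printf 'a@@#b@@#c@@#\nd@@#\n' > "$sample"

# grep -c counts LINES that contain the delimiter.
line_count=$(grep -c '@@#' "$sample")

# grep -o prints every match on its own line, so wc -l counts occurrences.
# tr strips the padding that some wc implementations add.
match_count=$(grep -o '@@#' "$sample" | wc -l | tr -d ' ')

echo "lines containing the delimiter: $line_count"
echo "delimiter occurrences: $match_count"

rm -f "$sample"
```

On this sample the two numbers differ, which is why grep -c alone is only safe when each line holds exactly one delimiter.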

Using DataStage:
Read in the entire file (Folder Stage). Use the DCount() function. Search the BASIC guide for more details.
Amey Vaidya
I am rarely happier than when spending an entire day programming my computer to perform automatically a task that it would otherwise take me a good ten seconds to do by hand.
- Douglas Adams
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

grep -c will return only 1 if there are no line terminators. The OP still has not confirmed whether it is a single huge line or multiple lines. For that, he needs to perform a sed operation to replace the Martian delimiter with a LF and then simply do a wc -l. That should do it.
Something like:

sed 's/@@#/\n/g' yourFile.xml | wc -l
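
Two caveats on the split-and-count idea, sketched under assumptions: the \n escape in a sed replacement is understood by GNU sed but not by every Unix sed, and if each record ends with the delimiter (rather than the delimiter sitting between records), the split leaves one empty line at the end. The sample data and temporary file below are made up for illustration. The backslash followed by a real newline is the portable way to write the replacement.

```shell
# Hypothetical sample: three records on one line, each ENDING with the delimiter.
sample=$(mktemp)
printf 'rec1@@#rec2@@#rec3@@#\n' > "$sample"

# Portable replacement: a backslash followed by an actual newline.
total_lines=$(sed 's/@@#/\
/g' "$sample" | wc -l | tr -d ' ')

# Same split, but count only non-empty lines to drop the trailing empty one.
nonempty_lines=$(sed 's/@@#/\
/g' "$sample" | grep -c .)

echo "lines after split: $total_lines"
echo "non-empty lines:   $nonempty_lines"

rm -f "$sample"
```

Which of the two numbers is the record count depends on whether the delimiter terminates or separates records, so it is worth checking against a small file whose count is known.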
clshore
Charter Member
Charter Member
Posts: 115
Joined: Tue Oct 21, 2003 11:45 am

Post by clshore »

Or,
try nawk:

nawk -F'@@#' '{print NF}' yourFile.xml

The -F option specifies a regular expression for the field separator, and NF is the number of fields on each input line, so this prints one count per line.

I don't think '@' or '#' are ERE metacharacters, but if so, just backslash escape them.
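
If the file spans several lines, the per-line NF values can be summed into one total. This is a minimal sketch that assumes plain awk behaves like nawk here (a multi-character -F value is treated as a regular expression); the sample file is made up. Each non-empty line contributes NF - 1 delimiters.

```shell
# Hypothetical sample: delimiters spread over several lines, one line empty.
sample=$(mktemp)
printf 'a@@#b@@#c@@#\n\nd@@#e\n' > "$sample"

# The NF pattern skips empty lines (NF is 0 there); every other line
# holds NF - 1 delimiters. "total + 0" prints 0 if no line matched.
delims=$(awk -F'@@#' 'NF { total += NF - 1 } END { print total + 0 }' "$sample")

echo "delimiter occurrences: $delims"

rm -f "$sample"
```

This counts delimiters, not records, so add 1 if the delimiter sits between records instead of ending each one.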

Carter