DSXchange

Ultramundane

I tested with values, and the transformer and filter performed about the same. The switch was slower than both the transformer and the filter. However, all were fast.

Ultramundane

As was mentioned earlier, there are multiple solutions. I am just offering another solution (on Unix) for those who might be interested. Use an external filter stage with the code below for MIN or MAX: ## Author: Ryan Putnam ## Intent: Get MAX awk 'BEGIN {getline; MAX=$0;} {if (( $0 > MAX )) { MAX=$...
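For anyone who wants to try the idea outside DataStage first, here is a minimal sketch of a MAX filter in awk. It is not the script from the post above (that one is cut off); the function name max_of_stream is made up for illustration, and it assumes one numeric value per input line.

```shell
# Hypothetical helper (not from the original post): print the largest
# value read from stdin, assuming one numeric value per line.
max_of_stream() {
    awk 'NR == 1        { max = $0 + 0; next }   # seed MAX with the first row
         ($0 + 0) > max { max = $0 + 0 }         # keep the larger value
         END            { if (NR > 0) print max } # print nothing on empty input'
}
```

Something like `printf '3\n10\n7\n' | max_of_stream` should print the largest of the three values. For MIN, flip the comparison. Adding 0 forces a numeric compare; drop it if you want a string compare on char data instead.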

Ultramundane

You could also get your dsadm to download patch 61040 from Ascential, which allows you to perform min and max on char and varchar columns. If you get the patch and apply it, a new option called preserveType will become available when choosing the columns to aggregate. Select this property and set it t...

Ultramundane

I tested the same examples used with the transformer and switch with the filter stage. The filter stage performed exactly the same as the transformer. For the examples I tested, the transformer and filter performed better than the switch.

Ultramundane

I ran some tests with a row generator (10,000,000 records) and 6 values to switch on. The transformer ran 100,000 rows/sec faster than the switch stage: 400,000 rows/sec vs. 300,000 rows/sec. I ran a test where I dropped the values c,d,e,f using the switch (drop) by only mapping a,b and used a con...

Ultramundane

I had asked this question of the vendor. Ascential said that datasets allocate nearly the full amount of space that could be used by column even when no data for that column exists. Thus, we have to use sequential files or the workaround they provided. They said this was a performance enhancement ma...

Ultramundane

Thanks for the tip. Vertical pivoting in the transformer stage has a small overhead in using the compiled stage. It is extremely efficient in that it only collects and holds a small amount of information in memory and refreshes that information with each key change. The pre-sorting does the hard wo...

Ultramundane

I think one can reasonably deduce that the statement below did imply or claim that DataStage is faster.

"I did this once using awk,but its too slow for huge volumes of data. So i want this to be done in datastage."

Ultramundane

I believe it is a bug. I don't think my chair or keyboard has anything to do with the issue. Maybe my brain or this product has the issue. But not my keyboard or chair as they are inanimate objects. Anyways, the job is simple, straight database to database. DB -> DB. I created the tables with the ex...

Ultramundane

It might be faster with the vertical pivot, but with the horizontal pivot (even in parallel), awk is still 20 times faster. I have a case open with Ascential on the terrible performance of this stage. But this is a different issue: horizontal vs. vertical. Or maybe it isn't. Since the stage is call...

Ultramundane

I believe Pavan originally stated that an awk script he wrote was too slow and that he wanted to code this in DataStage because it is faster?

Ultramundane

Forgot that you could set ORS to an empty string. This one is much faster and does not slow down with more frequent values. It just keeps streaming the input out until a change occurs. #!/bin/ksh ################################################################################################# ## ## ...
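To show what the ORS trick looks like in isolation, here is a minimal sketch. It is not the script from the post above (that one is cut off). The function name pivot_by_key, the pipe delimiter, and the two-column key|value layout are all assumptions for illustration, and the input is assumed to be pre-sorted by key.

```shell
# Hypothetical helper (not from the original post): collapse consecutive
# rows that share a key (field 1) into a single output row.
# Assumes pipe-delimited "key|value" input, already sorted by key.
pivot_by_key() {
    awk -F'|' 'BEGIN      { ORS = "" }                      # print adds no newline
               NR == 1    { print $1; prev = $1 }           # start the first row
               $1 != prev { print "\n" $1; prev = $1 }      # key change: new row
                          { print "|" $2 }                  # stream the value out
               END        { if (NR > 0) print "\n" }        # close the last row'
}
```

With input rows A|1, A|2, B|3 this should give A|1|2 on one line and B|3 on the next. Nothing is buffered per key; each value is written as it is read and a newline is only emitted when the key changes, which is why the run length of a key should not matter.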

Ultramundane

I wrote a solution to the specific problem at hand using awk and it is very fast. It can process 2.5 million records at 712 Bytes/Record in 140 seconds on my system with A changing every 10 values. That is, process a 1,780,000,000 Byte file with 2.5 million records and A changing every 10 values in ...

Ultramundane

That is what I thought, but my DataStage jobs blow up when timestamp, char, and integer columns are configured as NULLABLE and I do not specify an alternate isnull value. I am just testing at the moment and both tables have the same schema. So, just a straight download and upload of records. However...

Ultramundane

Posting on EE forum with Server job type in post is confusing! Yeah, I made a mistake and posted twice. Shouldn't have done that. Thanks for your input; it makes sense. I still don't understand why I don't have the capability from the source input columns to specify this when the source stage is a datab...

Search found 402 matches

Re: Problem with Aggregator

Re: TRANSFORMER cf SWITCH speed benchmark