Passing multiple groups of recs to a shared container

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

tbtcust
Premium Member
Posts: 230
Joined: Tue Mar 04, 2008 9:07 am

Passing multiple groups of recs to a shared container

Post by tbtcust »

Hello All,

- How can I pass multiple groups of records to a shared container for processing at the group level?

- Is there a way to have a shared container wait/pause while a group of records is being processed?


I need to pass groups of records to a shared container to be transformed as groups, because there are inter-dependencies among the records within each group.

I have worked out how to pass the groups into the shared container. When I send a single group through the shared container it works fine, but when I send multiple groups, the groups get mixed together.

Thanks in advance for any help.
bart12872
Participant
Posts: 82
Joined: Fri Jan 19, 2007 5:38 pm

Post by bart12872 »

Well, a shared container should be seen as a simple DataStage job. A shared container just lets you factor out processing that will be reused in multiple jobs.

Said another way, a shared container can't do more than a job can do.

You can't use a shared container to perform a process that you couldn't do in a job.

In your case, I don't understand your requirement, with group-level processing, records in each group, and dependencies.
Can you give an example?
tbtcust
Premium Member
Posts: 230
Joined: Tue Mar 04, 2008 9:07 am

Post by tbtcust »

Thank you for your reply bart12872

In the example below the key field is what I am using to group the data.

Once grouped in the calling job I pass the groups to the shared container.

In the shared container I'm evaluating fld_1 and fld_2. For a group where fld_1 is "1", there must be an "x", "z", and "r" in fld_2 across that group, and a single record is sent back to the calling job. There is similar logic when fld_1 is "2" or "3".

When I have one group in the input file of the calling job, it works fine. When I have multiple groups in the input file, all the records are evaluated together in the container instead of one group at a time.

Thanks.

Key, fld_1, fld_2
=-=-=-=-=-=-=-=-=
AAA, 1, x
AAA, 1, y
AAA, 1, z
AAA, 1, r

BBB, 2, d
BBB, 2, f
BBB, 2, g

CCC, 3, h
CCC, 3, j
CCC, 3, k
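
(For illustration only, a minimal Python sketch of the group-level check described above. The rule for fld_1 = "1" comes from the post; the rules for "2" and "3", the function name evaluate_group, and the output status values are invented.)

# Hypothetical sketch of the group-level evaluation: for fld_1 = "1",
# the group must contain "x", "z" and "r" in fld_2, and one summary
# record per group goes back to the calling job.
required_values = {
    "1": {"x", "z", "r"},
    # similar (unspecified here) rules would exist for "2" and "3"
}

def evaluate_group(key, rows):
    """rows holds the (fld_1, fld_2) pairs that share the same key."""
    fld_1 = rows[0][0]                      # fld_1 is constant within a group
    seen = {fld_2 for _, fld_2 in rows}     # every fld_2 value in the group
    needed = required_values.get(fld_1, set())
    status = "OK" if needed and needed <= seen else "INCOMPLETE"
    return (key, fld_1, status)             # one record back to the calling job

print(evaluate_group("AAA", [("1", "x"), ("1", "y"), ("1", "z"), ("1", "r")]))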
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Describe the contents of the shared container - what stages are there and what functions are they performing? Off the top of my head, this just looks like a Transformer using stage variables to do 'group change detection', perhaps supported by a Sort stage adding a Key Change column. Each time the group changes or when you hit EOD pass out your group result... whatever that is.
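
(As a rough illustration of that pattern outside DataStage: the stage-variable approach amounts to a classic control break. The Python below is only a sketch; the sample rows and the emit() helper are invented.)

# Control-break sketch: assumes the input is already sorted by Key,
# as a Sort stage in front of the Transformer would guarantee.
rows = [
    ("AAA", "1", "x"), ("AAA", "1", "y"), ("AAA", "1", "z"), ("AAA", "1", "r"),
    ("BBB", "2", "d"), ("BBB", "2", "f"), ("BBB", "2", "g"),
]

def emit(key, group_rows):
    # stand-in for "pass out your group result"
    print(key, "->", [fld_2 for _, _, fld_2 in group_rows])

current_key = None
group = []                                  # rows of the group being built

for key, fld_1, fld_2 in rows:
    if current_key is not None and key != current_key:
        emit(current_key, group)            # the key changed: flush the group
        group = []
    current_key = key
    group.append((key, fld_1, fld_2))

if group:                                   # end of data: flush the last group
    emit(current_key, group)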
-craig

"You can never have too many knives" -- Logan Nine Fingers
tbtcust
Premium Member
Posts: 230
Joined: Tue Mar 04, 2008 9:07 am

Post by tbtcust »

Hello chulett,

There is a Transformer that receives the groups and performs the evaluations, and a couple of Joins to create the output.
pavi
Premium Member
Posts: 34
Joined: Mon Jun 03, 2013 2:34 pm

Post by pavi »

I believe this is a partitioning issue. When the data is sent to the shared container it is grouped as per your need, but once it gets into the shared container it is being re-partitioned, which breaks your group logic. If that is the case, it is better to use Same partitioning on the input link of the shared container so that re-partitioning doesn't happen. But keep in mind that it all depends on what operations you are performing in the shared container; if there is an operation that involves a key change, then re-partitioning is unavoidable.
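
(A toy Python sketch of why the partitioning method matters: hash partitioning on the grouping key keeps every row of a group on the same node, while a round-robin style re-partition scatters the group. The two-node count and the data are arbitrary.)

# Why partitioning matters for group logic (toy example, 2 "nodes").
keys = ["AAA", "AAA", "AAA", "AAA", "BBB", "BBB", "BBB", "CCC", "CCC", "CCC"]
nodes = 2

# Hash partitioning on the key: all rows sharing a key land on one node.
hash_parts = [hash(k) % nodes for k in keys]

# Round-robin: rows of the same group end up spread across nodes.
rr_parts = [i % nodes for i in range(len(keys))]

print("hash       :", list(zip(keys, hash_parts)))
print("round-robin:", list(zip(keys, rr_parts)))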
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It would help if you explained your 'grouping evaluation' logic in the transformer.

Assuming this shared container is meant to be used in multiple jobs, I would suggest you make no assumptions about how the incoming data arrives. Sort it by group first, add a Key Change column to ease your grouping logic in the transformer, and use hash partitioning so that your groups stay together... unless you force the transformer to run sequentially or the job always runs on a single node.
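
(To show what the Key Change column buys you, a small Python sketch of "sort by group, then flag the first row of each group", which is roughly what the Sort stage option produces. The sample data and the position of the flag are assumptions.)

# Sort by the grouping key, then mark the first row of each group with 1.
rows = [("BBB", "d"), ("AAA", "x"), ("AAA", "y"), ("CCC", "h"), ("BBB", "f")]
rows.sort(key=lambda r: r[0])               # sort by group first

flagged = []
prev_key = None
for key, value in rows:
    key_change = 1 if key != prev_key else 0    # 1 only on a group's first row
    flagged.append((key, value, key_change))
    prev_key = key

for row in flagged:
    print(row)       # the flag is what drives the grouping stage variables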
-craig

"You can never have too many knives" -- Logan Nine Fingers
tbtcust
Premium Member
Posts: 230
Joined: Tue Mar 04, 2008 9:07 am

Post by tbtcust »

Thank you all.

I thought through what bart12872 and pavi wrote and redesigned the approach to the shared container.

I am treating the container as a job: I am using the key change column in the Sort stages, LastRowInGroup(), and control-break logic.
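
(For completeness, a short Python sketch of the "emit on the last row of the group" variant that LastRowInGroup() supports in the Transformer; here it is mimicked by peeking at the next row. The sample data is invented.)

# Emit the group result on the *last* row of each group.
rows = [("AAA", "x"), ("AAA", "y"), ("AAA", "z"), ("BBB", "d"), ("BBB", "f")]

acc = []
for i, (key, value) in enumerate(rows):
    acc.append(value)
    next_key = rows[i + 1][0] if i + 1 < len(rows) else None
    if key != next_key:                     # this is the group's last row
        print(key, "->", acc)               # pass the group result downstream
        acc = []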
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Glad I could help. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers