Hi,
I need to dynamically pick columns (headers) from a particular file based on some condition. The file name is also dynamic ("xyz" + ID + MMDDYY), where ID and MMDDYY keep changing. I have to pick the ID from the file name and attach it to the selected columns.
For example :
File Name : xyz_123_102803 (ID = 123 and date 10/28/03)
Columns : abc_I, sdf_I, sde_D (headers are dynamic i.e. in the next file we will have different headers)
I have to select abc_I and sdf_I columns.
Final output will be a file with following header names :
123abc_I, 123sdf_I (123 being the ID)
I was wondering if this is possible in DataStage PX. If yes, how can we achieve it? There are several questions here. How do we pick up a file with a dynamic name (the file path is fixed)? How can we specify dynamic column names in the file format? How can we select the columns based on the column-name format? And how can we pick the ID from the file name and attach it to the columns?
Any help in this regard is highly appreciated.
Thanks a Lot in advance
Best Regards
Pinkesh
Dynamic file dynamic column
Hi,
I might have an idea, but you'll have to try whether it works.
Since you're talking dynamic names, you might get it to work if the number of columns is fixed, or at least a finite number of options (preferably a small number).
1. Use a parameter for the file name.
2. Build a routine that builds the target field names depending on your file name, in the format "abc_I, sdf_I, sde_D".
* The above is done in a separate job, which passes the string of column names as a parameter to the next one.
3. Build a standard table definition with col1, col2, ... to correspond to the number of columns you have, and use a user-defined query that does the insert itself, using the parameter column names for the insert,
i.e. insert into #TNAME#(#COLLIST#) values (?,?....)
Since there is no real need for the fixed table definition to have the final column names, this should work.
The only downside is that you need a job for every number of columns you might have, and you must use the proper one for each run (this could be done with a control job using DS BASIC).
You did mention parallel, so worst case you might need to enclose the job in a shared container.
There might be some other issues, but at least I gave you an option (not simple, but an option).
IHTH (I Hope This Helps)
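As a rough sketch of step 2 above, assuming the incoming file carries a comma-separated header row and the "keep only the _I columns" rule from the question: a small shell routine (standing in for a DS BASIC one, with hypothetical names) could build the #COLLIST# parameter like this:

```shell
# Hypothetical sketch: derive the ID from the file name and build a
# comma-separated column list from the file's header row.
FILE="xyz_123_102803"             # assumed naming: xyz_ID_MMDDYY
ID=$(echo "$FILE" | cut -d_ -f2)  # ID = 123

# Stand-in for the file's header row (abc_I, sdf_I, sde_D)
HEADER="abc_I,sdf_I,sde_D"

# Keep only the columns ending in _I, prefix each with the ID,
# then join them back into one comma-separated list.
COLLIST=$(echo "$HEADER" | tr ',' '\n' | grep '_I$' \
          | sed "s/^/${ID}/" | paste -sd, -)
echo "$COLLIST"   # 123abc_I,123sdf_I
```

The resulting string would then be passed to the loading job as the #COLLIST# parameter.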
Roy R.
Time is money but when you don't have money time is all you can afford.
Search before posting:)
Join the DataStagers team effort at:
http://www.worldcommunitygrid.org
Another option is to use an external UNIX script to pick out the ID and generate an Orchestrate schema file before record processing starts. Orchestrate schema files are PX-specific metadata files that specify record- and field-level layout and type properties. Schema files can be imported into DataStage as Table Definitions. Also, I believe you can still call PX stages by the old Orchestrate name/property style, similar to what you see in score dumps and the generated osh.
E.g.
Code: Select all
import
-schemafile GeneratedFile
-file NameOfInput
|
..downstream processing..
-BP
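A minimal sketch of such a generator script, assuming the input file carries a comma-separated header in its first row and treating every field as a bounded string (real type inference, and the exact schema syntax accepted by your PX version, would need checking). All file and field names here are the example values from the question:

```shell
# Create a tiny demo input file (stands in for the real extract).
printf 'abc_I,sdf_I,sde_D\n1,2,3\n' > xyz_123_102803

# Hypothetical sketch: build an Orchestrate schema file from the
# header row, prefixing each field name with the ID taken from
# the file name.  Every field is assumed to be a string here.
IN="xyz_123_102803"
ID=$(echo "$IN" | cut -d_ -f2)

printf 'record (\n' > generated.schema
head -1 "$IN" | tr ',' '\n' | while read COL; do
    printf '  %s%s: string[max=255];\n' "$ID" "$COL" >> generated.schema
done
printf ')\n' >> generated.schema
```

The generated file would then be handed to the import operator via -schemafile, as in the osh fragment above.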
I suggest you start with a sequence job that runs a routine that searches the directory for a file and returns the file name (as Ans) back to the sequence job. The sequence job then calls a server job and passes it three job parameters, derived as in the example further down.
The server job uses #FileName# as the input file. You can build any output file name you want using a combination of job parameters.
Now when it comes to column headings, you can just treat the first row, the heading row, as a normal record and output it in the transformer with the ID added. E.g.
IF @INROWNUM=1 THEN FileID:FieldName ELSE FieldName
This derivation will set a field to 123abc_I if it is the first row being processed (a heading), or leave it unchanged if it is not a heading row. This approach could get a bit tricky depending on what format your data is in. You may need to process your headings separately; it really depends what your data looks like.
If you are reading and writing text files and you turn off "First row is column headings", you do not need to define accurate column names within the DataStage column definitions. You can call your columns Larry, Curly and Moe (although abc_I and sdf_I may make more sense). You can then transform your column names.
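Outside DataStage, the same "treat the heading row as a record" idea can be sketched with awk, which plays the role of the transformer derivation above (file name and ID are the example values from the question; out_123 is a hypothetical output name):

```shell
# Demo input: header row followed by two data rows.
printf 'abc_I,sdf_I\n1,2\n3,4\n' > xyz_123_102803

# ID taken from the file name, as in the sequence-job routine.
ID=$(echo "xyz_123_102803" | cut -d_ -f2)

# NR==1 plays the role of @INROWNUM=1: prefix every field of the
# heading row with the ID, pass all other rows through unchanged.
awk -v id="$ID" -F, 'NR==1 { for (i=1; i<=NF; i++) $i = id $i } 1' \
    OFS=, xyz_123_102803 > out_123
head -1 out_123   # 123abc_I,123sdf_I
```

Data rows are untouched, so the downstream format stays exactly as it was.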
Code: Select all
PARAMETER  DERIVATION                                EXAMPLE
FileName   GetFileName.ReturnedValue                 xyz_123_102803
FileID     FIELD(GetFileName.ReturnedValue, "_", 2)  123
FileDate   FIELD(GetFileName.ReturnedValue, "_", 3)  102803
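For what it's worth, the FIELD() derivations in the table map directly onto the shell, which is handy if you prefer to resolve the parameters in a before-job script instead of a DS BASIC routine (a sketch, not part of the original suggestion):

```shell
# DataStage BASIC FIELD(str, "_", n) is cut -d_ -fn in the shell.
FILENAME="xyz_123_102803"
FILEID=$(echo "$FILENAME" | cut -d_ -f2)    # 123
FILEDATE=$(echo "$FILENAME" | cut -d_ -f3)  # 102803
echo "$FILEID $FILEDATE"   # 123 102803
```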
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn:Vincent McBurney LinkedIn