
Dataset Management utility shows different results

Posted: Sat Jan 07, 2006 6:23 am
by bala_135
Hello All,

I am using some transformations and saving the result in a dataset.
I am mapping only a few rows to the dataset (e.g. 25 rows, with only 10 going to the dataset). When I view the data of the dataset it shows me only 10 rows, but when I view the same in the Dataset Management utility it shows all the columns (25 columns). Kindly let me know what the problem is.

Thank you in advance,
Bala.

Posted: Sat Jan 07, 2006 7:03 am
by ArndW
Bala,

did you mean to say "rows" instead of "columns" in your post? The View Data facility in the Designer only shows a certain (configurable) number of rows, so you might not be seeing all of them in that view. The Dataset Management utility (which can also be invoked from the command line via "orchadmin") will show the correct number of rows, as will the log entries from the job that writes to the dataset.
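
To see why the two views can disagree, here is a toy sketch in plain Python (not DataStage; the names and the 944-record count are just illustrative): a viewer capped at N rows reports fewer records than a utility that scans the whole dataset.

```python
# Toy model of a capped data viewer vs. a full scan.
def view_data(records, row_limit=100):
    """Mimics a View Data window: returns at most row_limit rows."""
    return records[:row_limit]

# Pretend the dataset holds 944 records (as in the descriptor output below).
dataset = [{"id": i} for i in range(944)]

shown = view_data(dataset, row_limit=10)
print(len(shown))    # the viewer shows only 10 rows...
print(len(dataset))  # ...but a full scan finds all 944
```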

Posted: Sat Jan 07, 2006 3:22 pm
by ray.wurlod
Can you please post this question more clearly? It's perfectly possible to have ten rows and 25 columns.

Posted: Sun Jan 08, 2006 4:03 am
by bala_135
Hello All,

Sorry for the confusion. It's columns, not rows, i.e. 10 columns in View Data, but the Dataset Management utility shows 25 columns.

Regards,
Bala

Posted: Sun Jan 08, 2006 1:46 pm
by ray.wurlod
At least two possibilities then.

(1) You are not looking at the same Data Set. Check that the control file pathname is the same, and that APT_CONFIG_FILE has the same value.

(2) You have the column filter on in the data browser. I don't think it's this because it's not possible to open the data browser with the filter in effect (as far as I know).

Posted: Mon Jan 09, 2006 1:04 am
by bala_135
Hello All,

Kindly let me know where to find the control file pathname. I have checked APT_CONFIG_FILE and it is properly configured.

I have copied the output generated by the Dataset Management utility on clicking the Output tab.

##I TFSC 000001 12:13:22(000) <main_program> APT configuration file: /tmp/aptoa103255f44e17
##I USER 000059 12:13:26(000) <APT_RealFileExportOperator in APT_FileExportOperator,0> Export complete. 30 records exported successfully, 0 rejected.
Name: D:\target\NewFinance.ds
Version: ORCHESTRATE V7.5.0 DM Block Format 6.
Time of Creation: 01/09/2004 12:04:03
Number of Partitions: 1
Number of Segments: 1
Valid Segments: 1
Preserve Partitioning: false
Segment Creation Time:
0: 01/09/2004 12:04:03

Partition 0
node : node1
records: 944
blocks : 7
bytes : 857152
files :
Segment 0 :
/C=/Ascential/DataStage/Datasets/NewFinance.ds.Administ.PROXY-2.0000.0000.0000.950.c1c7965b.0000.35a96aa2 917504 bytes
total : 917504 bytes

Totals:
records : 944
blocks : 7
bytes : 857152
filesize: 917504
min part: 917504
max part: 917504

Schema:
record
( ORIG_LOAN: decimal[8,2];
CLIENT: ustring[max=4];
ACCOUNT: ustring[max=30];
SEQ_NAME: ustring[max=60];
SEQ_ADDRESS: ustring[max=30];
SEQ_ADDRESS2: nullable ustring[max=30];
SEQ_CITY: ustring[max=30];
SEQ_STATE: ustring[2];
SEQ_ZIP: ustring[max=10];
SEQ_HOME_PHONE: nullable ustring[max=10];
SEQ_WORK_PHONE: nullable ustring[max=10];
ORIGINAL_LOAN: decimal[8,2];
PRINCIPAL_AMT: decimal[8,2];
FIRST_DUE_DATE: date;
PAYMENTS: int32;
PAYMENT_AMT: decimal[8,2];
BALANCE: decimal[8,2];
NEXT_DUE_DATE: date;
STATUS: ustring[1];
NAME: nullable ustring[max=60];
ADDRESS: nullable ustring[max=30];
ADDRESS2: nullable ustring[max=30];
CITY: nullable ustring[max=30];
STATE: nullable ustring[max=2];
ZIP: nullable ustring[max=10];
HOME_PHONE: nullable string[max=10];
WORK_PHONE: nullable string[max=10];
INSERTED: nullable timestamp;
UPDATED: nullable timestamp;
)
##I TFSR 000010 12:13:27(000) <main_program> Step execution finished with status = OK.

The schema I got above comes from looking up a table; the schema I want includes only the metadata mentioned below.

ORIG_LOAN: decimal[8,2];
CLIENT: ustring[max=4];
ACCOUNT: ustring[max=30];
ORIGINAL_LOAN: decimal[8,2];
PRINCIPAL_AMT: decimal[8,2];
FIRST_DUE_DATE: date;
PAYMENTS: int32;
PAYMENT_AMT: decimal[8,2];
BALANCE: decimal[8,2];
NEXT_DUE_DATE: date;
STATUS: ustring[1];


Any more suggestions? I am finding it difficult when I use a Funnel later, as many warnings arise, like "dropping component SEQ_ADDRESS".


Thank you,
Bala.

Posted: Mon Jan 09, 2006 1:28 am
by ray.wurlod
The control file is the file whose name ends in ".ds" that you enter into the Data Set Management utility. In your case, this appears to be D:\target\NewFinance.ds

Its schema has the 29 columns shown in the descriptor output you posted. That's what's in the Data Set; find the job that populated the Data Set to verify this.

Your second, shorter schema is a subset of eleven of these columns.

It's the same as if eleven columns were being selected from a table that has 29 columns in it.

View Data on the Data Set will show all 29 columns initially, because it uses the Data Set's schema (which is stored in the control file).
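
As a rough analogy in plain Python (not DataStage; the record values here are made up), selecting a shorter schema from the stored one is just picking a subset of keys from a wider record, like a SQL SELECT of a few columns from a wide table:

```python
# A wide record, as stored in the dataset's schema (only a few of the
# 29 columns shown here, with made-up values).
full_record = {
    "ORIG_LOAN": 1000.00, "CLIENT": "AB12", "ACCOUNT": "A-1",
    "SEQ_NAME": "J. Doe", "SEQ_ADDRESS": "1 Main St",
    "STATUS": "A",
}

# The shorter schema: the columns we actually want.
wanted = ["ORIG_LOAN", "CLIENT", "ACCOUNT", "STATUS"]

# Project the wide record down to the wanted columns.
subset = {col: full_record[col] for col in wanted}
print(sorted(subset))  # ['ACCOUNT', 'CLIENT', 'ORIG_LOAN', 'STATUS']
```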

Posted: Mon Jan 09, 2006 3:54 am
by ameyvaidya
Hi Bala,

What is the status of RCP on the input of the dataset stage?

If RCP is on, try this:
Switch off RCP at the output of the previous stage.
Delete the dataset and run the job again.

Check the data in the dataset through the dataset management tool.

You should then have only the columns you actually mapped in there.

Posted: Mon Jan 09, 2006 3:59 am
by thebird
Bala,

If you are mapping only certain columns from the Transformer to the Data Set, you need to switch off RCP on this link. If RCP is on, the metadata for the dataset created will consist of all 25 columns (as they are propagated to the next stage), even if only 10 columns are mapped to the dataset.

And if this dataset is then used as a source with only 10 columns, it will throw "dropping component" warnings.
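
A toy model of this in plain Python (not DataStage; the column names and values are made up): with RCP on, every upstream column is carried through to the target, so the dataset's schema ends up wider than the columns you mapped explicitly.

```python
# Toy model of Runtime Column Propagation at a target stage.
def write_dataset(record, mapped_columns, rcp=True):
    if rcp:
        # RCP propagates every column it received, mapped or not.
        return dict(record)
    # RCP off: only the explicitly mapped columns reach the dataset.
    return {c: record[c] for c in mapped_columns}

upstream = {"ORIG_LOAN": 1000.0, "CLIENT": "AB12", "SEQ_NAME": "J. Doe"}
mapped = ["ORIG_LOAN", "CLIENT"]

print(len(write_dataset(upstream, mapped, rcp=True)))   # 3: all columns land in the dataset
print(len(write_dataset(upstream, mapped, rcp=False)))  # 2: only the mapped columns
```

Reading the RCP-written (3-column) dataset back with a 2-column schema is what produces the "dropping component" warnings for the extra columns.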

Hope this helps.

Regards,

The Bird.

Posted: Mon Jan 09, 2006 7:23 am
by bala_135
Hello AmeyVidhya/thebird,

It's working fine. Thank you very much. Kindly correct me if my understanding is wrong: is this also the case if we keep a Sequential File as the target, since the data on the links are virtual datasets, even though there is no utility to view them?

Regards,
Bala

Posted: Mon Jan 09, 2006 9:50 pm
by ameyvaidya
Unlike a dataset, a sequential file contains only the columns defined as being written to it; the RCP'ed columns are silently dropped.
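
In the same plain-Python spirit as above (not DataStage; names and values are made up), a sequential file target formats out only the defined columns, so extra propagated columns simply never reach the file:

```python
# Toy model of a sequential file target: the record is flattened to text
# using only the defined column list; anything else is silently dropped.
def write_sequential(record, defined_columns):
    return ",".join(str(record[c]) for c in defined_columns)

row = {"ORIG_LOAN": 1000.0, "CLIENT": "AB12", "SEQ_NAME": "J. Doe"}

# SEQ_NAME was propagated by RCP but is not in the defined columns,
# so it never appears in the flat file.
print(write_sequential(row, ["ORIG_LOAN", "CLIENT"]))  # 1000.0,AB12
```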

Posted: Mon Jan 09, 2006 11:35 pm
by bala_135
Thank you very much. I got it.

Regards,
Bala