Information Analyzer: Analysis of Files and Database Tables

Posted: Tue Apr 08, 2008 3:54 am
by dsusr
Hi All,

I have a requirement to do complete profiling (Column Analysis, Table Analysis, Primary Key Analysis, Foreign Key Analysis and Cross Domain Analysis) of customer data, which is provided in flat files, using Information Analyzer. To start with, I have the following questions:

1) Can we perform all of these analyses directly on flat files, or do we have to load the data into tables first?
2) After reading the Information Analyzer documentation, it seems we have to create a DSN for the flat files. Can anyone please provide the steps to create a DSN that connects to flat files?

Any help would be highly appreciated.

Also, if anyone can provide the steps to start profiling data, especially the set-up information, that would be of great help to me.

Regards,
dsusr

Posted: Tue Apr 08, 2008 4:21 am
by ray.wurlod
1. You can do it all from flat files.

2. Use the ODBC driver for text files. This is fairly easy to configure - for a given DSN you point it at the directory that contains the file and optionally specify a file name suffix (for example "*.txt" or "*.csv") to filter the files that can be processed.
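
To preview which files a DSN pointed at a directory with a suffix filter would pick up, a small script can list them. This is only a sketch of the filtering idea in Python, not part of the ODBC driver or Information Analyzer; the function name `visible_files` and the default patterns are illustrative assumptions.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def visible_files(directory, patterns=("*.txt", "*.csv")):
    """Return the sorted names of regular files in `directory` whose names
    match any of the suffix patterns (compared case-insensitively).

    Hypothetical helper: it only mimics the "directory plus suffix filter"
    idea described above and does not talk to any ODBC driver.
    """
    lowered = [p.lower() for p in patterns]
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file()
        and any(fnmatchcase(entry.name.lower(), p) for p in lowered)
    )


if __name__ == "__main__":
    import sys

    # Usage: python list_files.py /path/to/data/directory
    for name in visible_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(name)
```

Running it against the directory you intend to use for the DSN gives a quick list of candidate files before any driver configuration is done.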

Posted: Tue Apr 08, 2008 4:48 am
by dsusr
Thanks, Ray, for providing this information, but I have some questions on DSN creation.

Currently my analysis server is on HP-UX and the files are on my Windows XP machine, which hosts the Information Server Console. At one point the manual says that the DSN for the source database (which in my case would be flat files) needs to be created on the machine that is running the Metadata server (which in my case is HP-UX).

My concern is: how can I create a DSN on HP-UX to connect to files that are on my Windows machine? Or do I need to move the files to the HP-UX machine?

TIA
Regards,
dsusr

Posted: Tue Apr 08, 2008 4:57 am
by ray.wurlod
It would certainly be easier to move the files.

You can leave them on the Windows machine, but you would need to make the folder in which they live visible to the HP-UX machine, for example using old technology such as Samba, or newer technology such as LDAP.

Just out of curiosity, where in this mix is your ProfileStage database (PSDB)?

Posted: Tue Apr 08, 2008 5:23 am
by Dev_India
Sorry to jump in, Ray, but I would like to know one small thing:

1. If the files are on the same Unix machine that hosts the analysis server and the analysis DB, is there any need to create a DSN to connect to the files? If yes, can you please explain how to create a DSN on a Unix machine?

Thanks ---- dev

Posted: Tue Apr 08, 2008 5:34 am
by dsusr
Hi Ray, I would prefer to move the files to a Unix machine instead of using Samba or LDAP, as I don't have a strong understanding of either technology.

Currently my PSDB (or IADB) is on the same machine as the analysis server, i.e. my HP-UX machine.

I was just going through one of the previous posts and got confused about adding entries in odbc.ini and QETXT.INI. Can I assume that I can start the analysis without adding these entries, or is it mandatory to have an entry for each and every file in the QETXT.INI file?

TIA
Regards,
dsusr

Posted: Tue Apr 08, 2008 11:13 am
by lstsaur
dsusr,
Yes, you must define each file in your QETXT.ini file. Also, you have to apply the Rollup10 patch.
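
Since every file needs its own definition, a quick cross-check of the data directory against QETXT.INI can catch files that were missed. The sketch below is a heuristic in Python, not a vendor tool: it assumes QETXT.INI is a plain INI file and treats a data file as "defined" if its name appears as a section name or as a key anywhere in the file. The exact layout the driver expects is not shown in this thread, so verify it against the driver documentation. The name `undefined_files` is hypothetical.

```python
import configparser
from pathlib import Path


def undefined_files(data_dir, ini_path, suffixes=(".txt", ".csv")):
    """Return sorted names of data files in `data_dir` that are not
    mentioned in the INI file at `ini_path`.

    Assumption (not confirmed by the thread): a file is "mentioned" if its
    name is a section name or a key in any section, compared
    case-insensitively.
    """
    parser = configparser.RawConfigParser(
        strict=False,            # tolerate duplicate sections or keys
        allow_no_value=True,     # tolerate keys written without "=value"
        delimiters=("=",),       # do not split keys on ":"
    )
    parser.optionxform = str     # keep keys as written; lowercased below
    with open(ini_path) as handle:
        parser.read_file(handle)

    mentioned = set()
    for section in parser.sections():
        mentioned.add(section.lower())
        mentioned.update(key.lower() for key in parser.options(section))

    return sorted(
        entry.name
        for entry in Path(data_dir).iterdir()
        if entry.is_file()
        and entry.suffix.lower() in suffixes
        and entry.name.lower() not in mentioned
    )
```

An empty result only means that every file name shows up somewhere in the INI file; it does not validate the column definitions themselves.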

Posted: Tue Apr 08, 2008 1:02 pm
by WDWolf
One additional question/thought on the Samba vs. copy issue. I am familiar with older ProfileStage versions. When we used database connections or ODBC to profile data, we also set up the same connections on the client side to enable the drill-through functions to query or sample the data through the product. Does this functionality still work the same way in the latest version of Information Server? Is there still a need to have the connections on both the information analysis server and the client?

I left that gig before we upgraded and started doing sequential files, but the theory was that the Samba route would allow connection to a single copy of the data without persisting it in multiple locations. What are the thoughts on this slight twist?

Posted: Wed Apr 09, 2008 2:29 am
by dsusr
lstsaur wrote:
dsusr,
Yes, you must define each file in your QETXT.ini file. Also, you have to apply the Rollup10 patch.

Thanks for the reply. But can you please tell me what this Rollup10 patch is?

Regards,
dsusr

Posted: Wed Apr 09, 2008 3:16 am
by ray.wurlod
It's a patch (set of patches?) from the vendor containing ten smaller sets of fixes, all in one application.

Without it you will get very strange behaviours - and occasional non-behaviours - from Information Analyzer.