Reading Multiple Files Using File Pattern

Posted: Sat Dec 26, 2009 12:33 pm
by dspradeep
I have created a simple file called outputA.txt which contains this data:

Code: Select all

name,desc
x,manager
y,clerk
z,security
Then I made a copy of this file, changed the name values to p, q, r, and renamed it outputB.txt:

Code: Select all

name,desc
p,manager
q,clerk
r,security
In the Sequential File stage, when I try to use File Pattern to view the data, I can't. I searched this forum and found a suggested solution: set this environment variable to true.

Code: Select all

$APT_IMPORT_PATTERN_USES_FILESET = True 
I did this, but I still couldn't view the data. This is the error I get when I try to view:

Code: Select all

##W TOIX 000000 23:51:01(000) <main_program> createFilesetFromPattern(): Couldn't find any files on host  with pattern D:/Datastage/DX444/ISFiles/Temp/*.txt. [new-impexp\file_import.C:1647]
##E TOIX 000138 23:51:01(002) <Sequential_File_0> At least one filename or data source must be set in APT_FileImportOperator before use. [new-impexp\file_import.C:2029]
##E TFSR 000019 23:51:01(006) <main_program> Could not check all operators because of previous error(s) [api\step_rep.C:1128]
 ##I TCOS 000022 23:51:01(007) <main_program> Explanation:

Posted: Sat Dec 26, 2009 1:39 pm
by ray.wurlod
You don't need File Pattern to read one file. (The file name could be a job parameter.) However, it will work to use File Pattern, provided that the pattern matches one or more file names. It would appear that your pattern does not (or lacks a directory component to the path). Please advise the precise file pattern value that you are using.
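
For example, if only the one file were needed, the File property could simply reference a job parameter rather than using File Pattern at all (the parameter name here is only illustrative):

Code: Select all

Read Method: Specific File(s)
File: #SourceDir#\outputA.txt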

Posted: Sat Dec 26, 2009 3:14 pm
by chulett
That $APT variable helps when you use the 'give me the filename that was matched' option. Right now, it would seem that your wildcard pattern doesn't match any filenames.

Posted: Sat Dec 26, 2009 11:36 pm
by dspradeep
I agree that whether there is one file or two doesn't matter for File Pattern, so it seems the File Pattern option is not working. Why is it not working? Do any $APT settings need to change, or is it something else?

Please help me

Posted: Sun Dec 27, 2009 12:37 am
by ray.wurlod
Be more specific about what "not working" means. I have no problem with it. Tell us what your property settings are, what output you expect to get, what the file names actually are, and what output you are actually getting.

Posted: Sun Dec 27, 2009 3:56 am
by dspradeep
There are two files available in the location below:
D:\Datastage\DX444\ISFiles\Temp
File names: outputA.txt, outputB.txt
The file structure is the same for both files, and the file contents are given in my first post.
I want to load the data from both files into another sequential file using the File Pattern option, so I have set the properties below in the Sequential File stage, but when I try to view the data I can't.

File Pattern: D:\Datastage\DX444\ISFiles\Temp\output*.txt
Read Method: File Pattern

I have imported the metadata on the Columns tab

and I am trying to view the data using the View Data button:

Code: Select all

##I TFCN 000001 15:23:27(000) <main_program> 
 Ascential DataStage(tm) Enterprise Edition 7.5
 Copyright (c) 2004, 1997-2004 Ascential Software Corporation.
 All Rights Reserved
 
 
 ##I TOSH 000002 15:23:27(001) <main_program> orchgeneral: loaded
 ##I TOSH 000002 15:23:27(002) <main_program> orchsort: loaded
 ##I TOSH 000002 15:23:27(003) <main_program> orchstats: loaded
 ##W TCOS 000049 15:23:27(004) <main_program> Parameter specified but not used in flow: DSProjectMapName [osl\osl.C:651]
 ##I TCOS 000021 15:23:27(005) <main_program> Echo:
 import
 -schema record
   {final_delim=end, delim=','}
 (
   name:string[max=255];
   desc:string[max=255];
 )
 -rejects continue
 -reportProgress yes
 -filepattern 'D:\\Datastage\\DX444\\ISFiles\\Temp\\output*.txt'
 
 [ident('Sequential_File_0'); jobmon_ident('Sequential_File_0')]
 0> [] 'Sequential_File_0:DSLink2.v'
 ;
 
 head -nrecs 10
 [ident('_Head'); jobmon_ident('_Head'); seq]
 0< 'Sequential_File_0:DSLink2.v'
 0> [] 'Sequential_File_0:DSLink2_Head.v'
 ;
 
 peek -all -delim '%%PEEK%%DELIM%%' -field name -field desc
 [ident('_PEEK_IDENT_'); jobmon_ident('_PEEK_IDENT_'); seq]
 0< 'Sequential_File_0:DSLink2_Head.v'
 0> [] 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'
 ;
 
 abort -nrecs 10
 [ident('_ABORT_IDENT_'); jobmon_ident('_ABORT_IDENT_'); seq]
 0< 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'
 ;
 
 
 
 ##I TFSC 000001 15:23:28(004) <main_program> APT configuration file: C:/Ascential/DataStage/Configurations/default.apt
 ##I TFSC 000000 15:23:28(005) <main_program> 
 This step has no datasets.
 
 It has 1 operator:
 op0[1p] {(sequential APT_CombinedOperatorController:
       (Sequential_File_0)
       (_Head)
       (_PEEK_IDENT_)
       (_ABORT_IDENT_)
     ) on nodes (
       node1[op0,p0]
     )}
 It runs 1 process on 1 node.
 ##I TCOS 000022 15:23:28(006) <main_program> Explanation:
 Step has 4 operators.
 ???, invoked with args: -schema record {final_delim=end, delim=","} ( name: string[max=255];  desc: string[max=255]; ) -rejects continue -reportProgress yes -filepattern D:\Datastage\DX444\ISFiles\Temp\output*.txt 
     output port 0 bound to data entity "Sequential_File_0:DSLink2.v"
 
 ???, invoked with args: -nrecs 10 
     input port 0 bound to data entity "Sequential_File_0:DSLink2.v"
     output port 0 bound to data entity "Sequential_File_0:DSLink2_Head.v"
 
 ???, invoked with args: -all -delim %%PEEK%%DELIM%% -field name -field desc 
     input port 0 bound to data entity "Sequential_File_0:DSLink2_Head.v"
     output port 0 bound to data entity "Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"
 
 ???, invoked with args: -nrecs 10 
     input port 0 bound to data entity "Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"
 
 Step has 3 data entities.
   Data "Sequential_File_0:DSLink2.v"(an Orchestrate data set)
     written by operator "Sequential_File_0"
     read by operator "_Head"
 
   Data "Sequential_File_0:DSLink2_Head.v"(an Orchestrate data set)
     written by operator "_Head"
     read by operator "_PEEK_IDENT_"
 
   Data "Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"(an Orchestrate data set)
     written by operator "_PEEK_IDENT_"
     read by operator "_ABORT_IDENT_"
 
 
 ##I TCOS 000023 15:23:28(007) <main_program> Dump:
 { 
   text="import\r\n-schema record {final_delim=end, delim=\",\"} ( name: string[max=255];  desc: string[max=255]; )\r\n-rejects continue\r\n-reportProgress yes\r\n-filepattern 'D:\\\\Datastage\\\\DX444\\\\ISFiles\\\\Temp\\\\output*.txt'\r\n\r\n[ident('Sequential_File_0'); jobmon_ident('Sequential_File_0')]\r\n0> [] 'Sequential_File_0:DSLink2.v'\r\n;\r\n\r\nhead -nrecs 10\r\n[ident('_Head'); jobmon_ident('_Head'); seq]\r\n0< 'Sequential_File_0:DSLink2.v'\r\n0> [] 'Sequential_File_0:DSLink2_Head.v'\r\n;\r\n\r\npeek -all -delim '%%PEEK%%DELIM%%' -field name -field desc\r\n[ident('_PEEK_IDENT_'); jobmon_ident('_PEEK_IDENT_'); seq]\r\n0< 'Sequential_File_0:DSLink2_Head.v'\r\n0> [] 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'\r\n;\r\n\r\nabort -nrecs 10\r\n[ident('_ABORT_IDENT_'); jobmon_ident('_ABORT_IDENT_'); seq]\r\n0< 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'\r\n;", 
   line=1, column=1, name="", qualname="", 
   op={ 
        text="import\r\n-schema record {final_delim=end, delim=\",\"} ( name: string[max=255];  desc: string[max=255]; )\r\n-rejects continue\r\n-reportProgress yes\r\n-filepattern 'D:\\\\Datastage\\\\DX444\\\\ISFiles\\\\Temp\\\\output*.txt'\r\n\r\n[ident('Sequential_File_0'); jobmon_ident('Sequential_File_0')]\r\n0> [] 'Sequential_File_0:DSLink2.v'", 
        line=1, column=1, name=import, qualname=Sequential_File_0, 
        wrapout={},
        wrapperfile=import, kind=non_wrapper_cdi_op, exec_mode=none, 
        args="'record {final_delim=end, delim=\",\"} ( name: string[max=255];  desc: string[max=255]; )'-rejects'continue'-reportProgress'yes'-filepattern'D:\\\\Datastage\\\\DX444\\\\ISFiles\\\\Temp\\\\output*.txt'", 
        output={ text="\r\n0> [] 'Sequential_File_0:DSLink2.v'", line=13, 
                 column=1, name="", qualname="Sequential_File_0[o0]", 
                 data="Sequential_File_0:DSLink2.v"
               }
      },
   op={ 
        text="\r\n\r\nhead -nrecs 10\r\n[ident('_Head'); jobmon_ident('_Head'); seq]\r\n0< 'Sequential_File_0:DSLink2.v'\r\n0> [] 'Sequential_File_0:DSLink2_Head.v'", 
        line=16, column=1, name=head, qualname=_Head, 
        wrapout={},
        wrapperfile=head, kind=non_wrapper_cdi_op, exec_mode=seq, 
        args="'10'", 
        input={ text="\r\n0< 'Sequential_File_0:DSLink2.v'", line=18, 
                column=1, name="", qualname="_Head[i0]", 
                data="Sequential_File_0:DSLink2.v"
              },
        output={ text="\r\n0> [] 'Sequential_File_0:DSLink2_Head.v'", 
                 line=19, column=1, name="", qualname="_Head[o0]", 
                 data="Sequential_File_0:DSLink2_Head.v"
               }
      },
   op={ 
        text="\r\n\r\npeek -all -delim '%%PEEK%%DELIM%%' -field name -field desc\r\n[ident('_PEEK_IDENT_'); jobmon_ident('_PEEK_IDENT_'); seq]\r\n0< 'Sequential_File_0:DSLink2_Head.v'\r\n0> [] 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'", 
        line=22, column=1, name=peek, qualname=_PEEK_IDENT_, 
        wrapout={},
        wrapperfile=peek, kind=non_wrapper_cdi_op, exec_mode=seq, 
        args="'-delim'%%PEEK%%DELIM%%'-field'name'-field'desc'", 
        input={ text="\r\n0< 'Sequential_File_0:DSLink2_Head.v'", line=24, 
                column=1, name="", qualname="_PEEK_IDENT_[i0]", 
                data="Sequential_File_0:DSLink2_Head.v"
              },
        output={ 
                 text="\r\n0> [] 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'", 
                 line=25, column=1, name="", qualname="_PEEK_IDENT_[o0]", 
                 data="Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"
               }
      },
   op={ 
        text="\r\n\r\nabort -nrecs 10\r\n[ident('_ABORT_IDENT_'); jobmon_ident('_ABORT_IDENT_'); seq]\r\n0< 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'", 
        line=28, column=1, name=abort, qualname=_ABORT_IDENT_, 
        wrapout={},
        wrapperfile=abort, kind=non_wrapper_cdi_op, exec_mode=seq, 
        args="'10'", 
        input={ text="\r\n0< 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'", 
                line=30, column=1, name="", qualname="_ABORT_IDENT_[i0]", 
                data="Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"
              }
      },
   data={ text="\r\n0> [] 'Sequential_File_0:DSLink2.v'", line=13, column=1, 
          name="Sequential_File_0:DSLink2.v", 
          qualname="Sequential_File_0:DSLink2.v", 
          partwrapout={},
          collwrapout={},
          dir=flow, kind=ds, writer=Sequential_File_0, reader=_Head, pp=none, 
          trunc=default, ident="Sequential_File_0:DSLink2.v"
        },
   data={ text="\r\n0> [] 'Sequential_File_0:DSLink2_Head.v'", line=19, 
          column=1, name="Sequential_File_0:DSLink2_Head.v", 
          qualname="Sequential_File_0:DSLink2_Head.v", 
          partwrapout={},
          collwrapout={},
          dir=flow, kind=ds, writer=_Head, reader=_PEEK_IDENT_, pp=none, 
          trunc=default, ident="Sequential_File_0:DSLink2_Head.v"
        },
   data={ text="\r\n0> [] 'Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v'", 
          line=25, column=1, 
          name="Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v", 
          qualname="Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v", 
          partwrapout={},
          collwrapout={},
          dir=flow, kind=ds, writer=_PEEK_IDENT_, reader=_ABORT_IDENT_, 
          pp=none, trunc=default, 
          ident="Sequential_File_0:DSLink2_Head_PEEK_IDENT_.v"
        }
 }
 
 ##W TOIX 000000 15:23:29(000) <Sequential_File_0,0> Couldn't find any files on host PRADEEP-0320510 with pattern D:/Datastage/DX444/ISFiles/Temp/output*.txt. [new-impexp\file_import.C:2624]
 ##I TOIX 000000 15:23:29(001) <Sequential_File_0,0> Output 0 produced 0 records.
 ##I USER 000000 15:23:29(002) <_Head,0> Input 0 consumed 0 records.
 ##I USER 000000 15:23:29(003) <_Head,0> Output 0 produced 0 records.
 ##I TOPK 000000 15:23:29(004) <_PEEK_IDENT_,0> Input 0 consumed 0 records.
 ##I TOPK 000000 15:23:29(005) <_PEEK_IDENT_,0> Output 0 produced 0 records.

Posted: Sun Dec 27, 2009 8:04 am
by chulett
Well... out of all that, this is the only bit that really matters:

Code: Select all

 ##W TOIX 000000 15:23:29(000) <Sequential_File_0,0> Couldn't find any files on host PRADEEP-0320510 with pattern D:/Datastage/DX444/ISFiles/Temp/output*.txt. [new-impexp\file_import.C:2624] 
Silly question perhaps, but are these files on your DataStage server rather than your client PC? I can't tell for certain from your post. Can you successfully read one or both (add two 'File=' properties) using the Specific File(s) option? As a test, does it work if you change the "output*.txt" portion to simply "*.txt"?
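
As a quick sanity check (only a suggestion, and assuming the engine really is on that Windows box), you could open a command prompt on the DataStage server itself and confirm the pattern actually matches something there:

Code: Select all

dir D:\Datastage\DX444\ISFiles\Temp\output*.txt
If dir finds nothing, the pattern is being evaluated on a machine that cannot see those files.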

Posted: Sun Dec 27, 2009 10:43 am
by dspradeep
Can you successfully read one or both (add two 'File=' properties) using the Specific File(s) option?

Yes, I can view both files if I use the Read Method Specific File(s). But if I use the Read Method File Pattern, with the file name given as
D:\Datastage\DX444\ISFiles\Temp\*.txt, it still doesn't work.

does it work if you change the "output*.txt" portion to simply "*.txt"?
No

Posted: Sun Dec 27, 2009 12:47 pm
by Sreenivasulu
Do you want to see the data of OutputA.txt or OutputB.txt? Can you tell me?
Going by your question, I think you want to see the data of both files using a single command. It does not work that way.
The concept of 'file pattern' is to load multiple files with the same pattern.
I don't think you can do a 'view data' using a 'pattern'. DataStage would need to apply 'fuzzy logic' to understand the pattern and apply it to one of the files.

Posted: Sun Dec 27, 2009 4:30 pm
by ray.wurlod
Try these two techniques:
  • Use forward slashes in the pathname rather than backslashes.
  • Use a UNIX-style pathname that does not include a drive letter.
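
For example, the pattern could be tried in forms along these lines (this is just the path from your earlier post, rewritten):

Code: Select all

D:/Datastage/DX444/ISFiles/Temp/output*.txt
/Datastage/DX444/ISFiles/Temp/output*.txt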

Posted: Sun Dec 27, 2009 5:20 pm
by chulett
Sreenivasulu wrote: I don't think you can do a 'view data' using a 'pattern'. DataStage would need to apply 'fuzzy logic' to understand the pattern and apply it to one of the files.
Nothing fuzzy about it and of course you can.

Posted: Sun Dec 27, 2009 10:19 pm
by dspradeep
I think you are right, Srini. I tried loading both files into another sequential file using File Pattern and it loaded the data perfectly, so we can't view the data using File Pattern in the Sequential File stage.

I am facing one more problem now. Both files have the field names as a header record, but I need to load the header only once into the target. Is there any option to drop the header record from every file apart from the first?

Code: Select all


name,desc
x,manager
y,clerk
z,security
name,desc
p,manager
q,clerk
r,security



Posted: Mon Dec 28, 2009 3:57 am
by ray.wurlod
No. You have to design that piece, perhaps by looking for the explicit column header values (since your columns are all defined as sufficiently large VarChar to handle the column headings, no row will be rejected).
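
A sketch of that approach (the link name DSLink2 is taken from the job log above; the exact expression is only illustrative): put a Transformer or Filter stage after the Sequential File stage and keep only the rows whose values are not the column headings, for example with a constraint such as

Code: Select all

DSLink2.name <> 'name'
Any heading needed in the target can then be produced separately rather than passed through from the source files.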

Posted: Mon Dec 28, 2009 7:56 am
by chulett
dspradeep wrote: I think you are right, Srini. I tried loading both files into another sequential file using File Pattern and it loaded the data perfectly, so we can't view the data using File Pattern in the Sequential File stage.
As already noted, this is completely incorrect. You can. :?

Posted: Mon Dec 28, 2009 11:39 am
by Sreenivasulu
It was my logical assumption and it might not be correct. Thanks, Craig, for the clarification.