FTP Stage putting spaces in some fields

poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

Hi,

We are trying to FTP a file from the mainframe to the DataStage server.

FTP stage ---> Transformer ---> Sequential File stage.

We have set the following properties:

RemotePath: (left blank)
RemoteFileName: #parmFileName#
DataRepresentation: Ascii
Check data against meta data: No
Line termination: CR/LF
Fixed-width columns: Yes
Space between columns: 0
Column delimiter: (left blank)
Quote character: (left blank)
Escape character: \
Null string: (left blank)
First line column names: No
Telnet Before: (left blank)
Telnet After: (left blank)
FTP data connection mode: active
FTP data port: (left blank)
Link Tracing Level: 4
Buffer Length: 4096

Though I have 136 columns, I am reading the file (doing the FTP) as a single column with a record length of 600.

The FTP is successful without any warnings.

But when I use the text file (transferred by the DataStage FTP stage), some of the records contain no actual data, only spaces.

For example, in some record, say record 1150, everything from column 5 to column 136 is spaces.

All of the incorrect records follow the same pattern: everything from column 5 to column 136 is spaces.

It happened for about 10 to 15 records out of 500,000.

Can anybody offer some ideas on why this is happening?

Thanks in Advance.
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

There can be several reasons. Some are:
1.) Lines contain special characters, such as the Unix kill character, that restrict the display.
2.) The record was obtained from a data file that had an associated index. By default, COBOL writes null values over a record it deletes. This is generally corrected by programs that read all live records out and rebuild the file.
3.) Maybe a network glitch during the download.
4.) Maybe an embedded ^M character appeared in the data file, which DOS / Windows interprets as a line break.
5.) If you landed the data on Unix, you may have to read it with the Unix delimiter.

Obviously this is not an exhaustive list.
poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

I am on Windows Server.

But the same data file, when transferred by FTP outside DataStage (without the FTP stage), does not have these problems.

Is the FTP stage reliable?
Sainath.Srinivasan
Participant
Posts: 3337
Joined: Mon Jan 17, 2005 4:49 am
Location: United Kingdom

Post by Sainath.Srinivasan »

The DataStage ftp stage calls the underlying FTP connection between your server and the remote machine. Hence it must be reliable.

Can you please verify that you are extracting the same file via DataStage as you did outside it? Note that some people are given a default FTP directory that is different from the root.

Also, can you find the difference in the characters passed by the two mechanisms? That will show what is missing and give us a lead to trace.
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

Hello Poorna,

try setting the FTP to "binary" mode; that way there will be no implicit conversion done by FTP. Such conversion problems can happen with binary fields (e.g. COBOL COMP-3) that manage to hit the compressed space sequence, generating what may look like a random amount of whitespace.

I also noticed that you had a fixed column length plus line terminators; if there is a mismatch here you might be offsetting all of your columns. In addition, what happens when you tell the FTP stage to check the data against the meta data? (You should get no errors in your case.)
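
To illustrate the fixed-width versus line-terminator mismatch mentioned above, here is a minimal sketch in Python (not DataStage code). It assumes a hypothetical 10-byte record instead of the 600 bytes used in this thread, and the function name is made up for the illustration. If the reader steps through the data without allowing for the 2-byte CR/LF, every record after the first is read shifted.

```python
RECORD_LEN = 10          # hypothetical; the thread uses a record length of 600
TERMINATOR = b"\r\n"     # CR/LF, as configured in the original post


def slice_records(data, stride, record_len=RECORD_LEN):
    """Step through data `stride` bytes at a time, keeping `record_len` bytes each."""
    records = []
    for start in range(0, len(data), stride):
        record = data[start:start + record_len]
        if record:
            records.append(record)
    return records


data = b"AAAAAAAAAA\r\nBBBBBBBBBB\r\nCCCCCCCCCC\r\n"

# Stride includes the terminator: every record lines up.
good = slice_records(data, RECORD_LEN + len(TERMINATOR))

# Stride ignores the terminator: the second record starts with CR/LF
# and each later record drifts further out of position.
bad = slice_records(data, RECORD_LEN)

print(good)
print(bad)
```

The same drift applies in the other direction: if the layout expects a terminator that is not present in the data, the reader skips real data bytes.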
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Sainath.Srinivasan wrote:The DataStage ftp stage calls the underlying FTP connection between your server and the remote machine. Hence it must be reliable.
No, not really. There's one enormous difference between command line ftp and the FTP stage - the stage is metadata driven. It tries to process 'records' whereas ftp just streams bytes. It's why several people who hang out here won't touch it with a 10 foot pole or consider it all that reliable. :wink:

I've also found it to have several unique 'issues' that don't exist with command line ftp. I've used it successfully to send files, for example, but personally wouldn't consider using it as a 'source' in a job, even though it certainly can be.

Not all that helpful for the OP, I know, but wanted to make that point.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany

Post by ArndW »

chulett wrote:
Sainath.Srinivasan wrote:The DataStage ftp stage calls the underlying FTP connection between your server and the remote machine. Hence it must be reliable.
I got a chuckle from Sainath's comment as well, but I think it just came out differently from what he'd intended. I have made a hobby in the past years of baiting the 419 scammers (you know, money from Nigerian bank accounts, oil deals, gold mines and the like), and recently I asked one of them in a mail "can I really, really, really trust you?" The scammer's response was "I am a lawyer and barrister, so I cannot lie and you must trust me". Except that, unlike Sainath's line, this guy actually thought I'd believe him.

Back to FTP: I have noticed a number of inconsistencies as well between what the DS component does and how the command-line FTP works, for the same reason chulett mentioned: metadata. I once drove myself crazy spending days trying to figure out how to get a .gzip compressed file into DS using FTP, and finally gave up and used a shell script.
poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

Thanks for all of your feedback.

I am using the same file both for the FTP through DataStage and for the FTP outside DataStage.

Just now I tried:
CheckAgainstmetadata - Yes
LineTerminator - CR/LF


Then I got the following message:
FTPJobName..stgXFTP: Termination Character(s) not at expected location (row 1).


Thanks
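
The message above, "Termination Character(s) not at expected location (row 1)", suggests that the CR/LF is not found where a fixed-width layout expects it. The following Python sketch shows one way to run that kind of check on a file fetched outside DataStage. It is an assumption about what the message means, not the stage's actual implementation, and the function name is hypothetical.

```python
def first_bad_row(data, record_len, terminator=b"\r\n"):
    """Return the 1-based number of the first row whose terminator is not
    where a fixed-width layout expects it, or None if every row lines up."""
    stride = record_len + len(terminator)
    row = 0
    for start in range(0, len(data), stride):
        row += 1
        end = start + record_len
        if data[end:end + len(terminator)] != terminator:
            return row
    return None


# Hypothetical 4-byte records; the thread's job uses a record length of 600.
sample = b"AAAA\r\nBBBB\r\n"
print(first_bad_row(sample, 4))   # layout matches the data
print(first_bad_row(sample, 5))   # declared length is one byte too long
```

A failure on row 1, as in the message, would be consistent with the declared record length or terminator not matching the data at all, rather than with a few damaged records.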
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Folks, remember, the FTP stage is a READER: it reads the data file and sends the data as rows across the network connection. Command line FTP doesn't CARE about the contents of the file; it transfers it in BLOCKS of data.

You should NOT use the FTP stage as a file mover. This is like using a hammer to put screws into wood. It's the wrong tool for the job. If you need to move files, use command line FTP. If you need to read a remote file without moving it closer, then use the FTP stage.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

Can you please give us some idea of how to do a command line FTP?

We are on Windows Server.

Thanks in advance.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Write a .bat file that invokes ftp. If you have a scripting language or interpreter like MKS Toolkit, you can use that. Or you can write a DS BASIC batch job and use DS to do the FTP; all you'll do is run FTP using the DSExecute API.

Simply open a DOS prompt box and type ftp; you'll see that it is available under Windoze.
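
As an alternative to the .bat file suggested above, the transfer can also be scripted. The following is a rough sketch using Python's standard `ftplib` module, offered as one possible approach rather than the method anyone in this thread used. The host name, credentials, dataset name and local file name are all hypothetical placeholders.

```python
from ftplib import FTP


def fetch_binary(ftp, remote_name, local_path, blocksize=8192):
    """Download remote_name in binary (image) mode and return the bytes written.

    `ftp` is a connected, logged-in ftplib.FTP object.
    """
    written = 0
    with open(local_path, "wb") as out:
        def on_block(block):
            # Write each block exactly as received; no conversion is applied.
            nonlocal written
            out.write(block)
            written += len(block)

        ftp.retrbinary("RETR " + remote_name, on_block, blocksize)
    return written


def main():
    # Hypothetical host, credentials and file names: replace before use.
    with FTP("mainframe.example.com") as ftp:
        ftp.login("user", "password")
        size = fetch_binary(ftp, "'PROD.DATA.FILE'", "datafile.bin")
        print("bytes transferred:", size)

# Call main() once the placeholders above have been filled in.
```

Because `retrbinary` switches the transfer to image mode, the local file should be a byte-for-byte copy of what the server sends, with no line-ending or character-set translation.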
poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

We just found that it is due to LowValue (a hexadecimal value) in the mainframe data.

When the stage encounters a LowValue, it ignores the rest of the data and does not write anything more to that record (it puts spaces instead).

How can we overcome this?

Is there a way we can handle LowValue during the FTP transfer?

Thanks in advance.
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Yes, do a command line FTP in binary and you get a 100% bit-perfect match of the file. Anything else is open to interpretation by the processes conducting the transfer.
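
Once the file has been transferred in binary as described above, the low values can be handled locally before the data is parsed. The Python sketch below is one possible approach under several assumptions that are not confirmed anywhere in this thread: the file contains only character data (no packed fields such as COMP-3), it is EBCDIC in code page 037, low values are X'00' bytes, the records are fixed length with no terminators, and replacing low values with spaces is acceptable for the target. The function name is hypothetical.

```python
RECORD_LEN = 600   # record length stated in the original post


def clean_records(raw, record_len=RECORD_LEN):
    """Yield (record_number, text, had_low_values) for each fixed-length record.

    Assumes EBCDIC code page 037 and character-only data (both assumptions).
    """
    starts = range(0, len(raw), record_len)
    for number, start in enumerate(starts, start=1):
        record = raw[start:start + record_len]
        had_low_values = b"\x00" in record
        # Replace low values (X'00') with the EBCDIC space (X'40').
        patched = record.replace(b"\x00", b"\x40")
        yield number, patched.decode("cp037"), had_low_values


# Small demonstration with hypothetical 8-byte records.
raw = "AB".encode("cp037") + b"\x00\x00" + "CDEF".encode("cp037")
raw += "GHIJKLMN".encode("cp037")
for number, text, flagged in clean_records(raw, record_len=8):
    print(number, repr(text), flagged)
```

Flagging the affected records, rather than silently patching them, makes it possible to compare the count with the 10 to 15 bad records reported earlier.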
poorna_76
Charter Member
Posts: 190
Joined: Thu Jul 08, 2004 10:42 am

Post by poorna_76 »

Thanks, Kenneth, for your valuable opinions.


What about FTP from the DataStage server to the mainframe?

We have some jobs which FTP text files from the DataStage server to the mainframe.

We haven't had issues with that process so far.

But still, is that Reliable or not?

Thanks in Advance
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

poorna_76 wrote: But still, is that Reliable or not?
I know English is not a native language for some members of this forum, so I'm not trying to be picky.

Reliable can mean different things. A poorly written program can be reliable: it always does the right thing, but takes too long or is difficult to read. A well written program doing something stupid can also be reliable; it just may not be the best way to do something. A space shuttle with 2 million parts and a 1 billion dollar expense per launch is NOT reliable, as evidenced by losing 2 out of 6 ships in fewer than 200 launches, in spite of monumental human engineering efforts.

So, the underlying FTP technology is very reliable. That is not the issue with the FTP stage; the problem is the parsing of data into rows of columns with specific widths and allowable characters. So, if you wanted to move a 200 million row file, you'd rather use command line FTP because it doesn't care about the insides of the file; it can binary transfer it in blocks, allowing huge throughput.

I've had this argument a number of times on the forum, and I don't care to repeat previous posts. The most important thing to remember is that if you are reading a remote file via the FTP stage and fail halfway through, you must START AT THE BEGINNING OF THE FILE AGAIN. You have no easy way to "skip ahead" in the file; you will incur all the network traffic again to reach the row you failed on and then continue processing. This is a fundamentally BAD DESIGN, and I don't care how well you programmed it.

If you have a large remote file, move it local using command line FTP. Then slice it into manageable chunks and process it. If you fail, at least you don't incur the network traffic again. When you're done, blow away the file if you want.

As far as transferring files YOU'VE CREATED goes, you can manage the quality of the content and should then be okay using the stage, provided you understand the issues. But fetching a remote file puts you at the mercy of whoever/whatever created the file, hopefully correctly every time.
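
The advice above about moving a large file local and slicing it into manageable chunks can be sketched as follows. This is a minimal Python illustration, assuming fixed-length records with no terminators; the function name, chunk size and restart parameter are hypothetical. With a local file, a restart can seek straight to the record that failed instead of re-reading everything over the network.

```python
def read_records(path, record_len, start_record=0, chunk_records=10000):
    """Yield lists of fixed-length records from a local file.

    start_record lets a restarted run skip ahead without re-reading
    the records that were already processed.
    """
    with open(path, "rb") as f:
        # Jump directly to the first unprocessed record.
        f.seek(start_record * record_len)
        while True:
            block = f.read(record_len * chunk_records)
            if not block:
                break
            yield [block[i:i + record_len]
                   for i in range(0, len(block), record_len)]
```

For example, if a run failed at record 250,000 of a 600-byte-record file, calling `read_records("datafile.bin", 600, start_record=250000)` would resume from that point (the file name here is a placeholder).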