Need help with reading delimited text file
Posted: Wed Oct 30, 2013 4:53 pm
by mavrick21
Hello,
I have source text files (~ 50 of them) that are delimited by ||@@##
I tried this approach and it didn't work.
1) Replaced ||@@## with Hex value 7 using
sed 's/||@@##/\x7/g' source.txt > source.out
2) Put &H7 in Other Delimiter field in Import Metadata for sequential file.
3) Clicked on Preview but all fields showed up as just one field.
Please advise.
Thanks
-Mav
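With around 50 source files, step 1 could be scripted along these lines. This is only a sketch: the helper name `convert_delims`, the temporary directory, and the `.txt` to `.out` naming are assumptions for illustration, and the BEL byte is built with `printf` so the command does not rely on sed understanding `\x7`.

```shell
# Hypothetical helper: rewrite one file, replacing every ||@@## with BEL.
# Writes <name>.out next to the input, mirroring source.txt -> source.out.
convert_delims() {
    bel=$(printf '\007')    # BEL (ASCII code 7), built without sed-specific escapes
    sed "s/||@@##/${bel}/g" "$1" > "${1%.txt}.out"
}

# Demo on a throwaway sample; point the glob at the real directory instead.
workdir=$(mktemp -d)
printf 'Col1||@@##Col2||@@##Col3\nVal1||@@##Val2||@@##Val3\n' > "$workdir/src.txt"

for f in "$workdir"/*.txt; do
    convert_delims "$f"
done

od -c "$workdir/src.out"    # each ||@@## should now appear as \a
```

This only prepares the files; it does not change how the Import Metadata dialog interprets the delimiter.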
Posted: Wed Oct 30, 2013 5:27 pm
by chulett
Try "007" instead.
Posted: Wed Oct 30, 2013 5:52 pm
by ray.wurlod
Or &H07
Posted: Thu Oct 31, 2013 1:32 pm
by mavrick21
Tried both and they didn't work. I also tried a few other hex (non-printable) values. In Import Meta Data (Sequential) I also tried toggling the NLS map values: None, UTF-8, ASCII and ISO-8859-1. Maybe I'm doing something wrong.
Here's a sample file (src.txt) for which I'm trying to import the metadata.
Code:
$ cat > src.txt
Col1||@@##Col2||@@##Col3
Val1||@@##Val2||@@##Val3
$ od -cx src.txt
0000000 C o l 1 | | @ @ # # C o l 2 | |
6f43 316c 7c7c 4040 2323 6f43 326c 7c7c
0000020 @ @ # # C o l 3 \n V a l 1 | | @
4040 2323 6f43 336c 560a 6c61 7c31 407c
0000040 @ # # V a l 2 | | @ @ # # V a l
2340 5623 6c61 7c32 407c 2340 5623 6c61
0000060 3 \n
0a33
0000062
$ sed 's/||@@##/\x7/g' src.txt > tgt1.txt
$ sed 's/||@@##/\x07/g' src.txt > tgt2.txt
$ sed 's/||@@##/\o7/g' src.txt > tgt3.txt
$ cat tgt1.txt
Col1Col2Col3
Val1Val2Val3
$ od -cx tgt1.txt
0000000 C o l 1 \a C o l 2 \a C o l 3 \n V
6f43 316c 4307 6c6f 0732 6f43 336c 560a
0000020 a l 1 \a V a l 2 \a V a l 3 \n
6c61 0731 6156 326c 5607 6c61 0a33
0000036
$ cat tgt2.txt
Col1Col2Col3
Val1Val2Val3
$ od -cx tgt2.txt
0000000 C o l 1 \a C o l 2 \a C o l 3 \n V
6f43 316c 4307 6c6f 0732 6f43 336c 560a
0000020 a l 1 \a V a l 2 \a V a l 3 \n
6c61 0731 6156 326c 5607 6c61 0a33
0000036
$ cat tgt3.txt
Col1Col2Col3
Val1Val2Val3
$ od -cx tgt3.txt
0000000 C o l 1 \a C o l 2 \a C o l 3 \n V
6f43 316c 4307 6c6f 0732 6f43 336c 560a
0000020 a l 1 \a V a l 2 \a V a l 3 \n
6c61 0731 6156 326c 5607 6c61 0a33
0000036
Is there any other approach I can try other than mine?
Thanks
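One way to confirm that the converted files themselves are sound, independent of the import dialog, is to count BEL-separated fields with awk. A minimal sketch, assuming a hypothetical helper named `count_fields` and a temporary sample file with the same layout as tgt1.txt:

```shell
# Hypothetical helper: print the number of BEL-separated fields on each line.
count_fields() {
    awk -F "$(printf '\007')" '{ print NF }' "$1"
}

# Sample with the same layout as tgt1.txt: three fields per line, BEL-delimited.
sample=$(mktemp)
printf 'Col1\007Col2\007Col3\nVal1\007Val2\007Val3\n' > "$sample"

count_fields "$sample"    # prints 3 for each of the two lines
```

If every line reports the expected count, the sed step did its job and the remaining problem lies in how the delimiter is specified or interpreted at import time.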
Posted: Thu Oct 31, 2013 1:46 pm
by ray.wurlod
It's interesting that your target files from the sed commands have \a as their delimiter, not \007.
Posted: Thu Oct 31, 2013 1:52 pm
by mavrick21
Using 'a' instead of 'c' in the od command:
Code:
$ od -ax tgt3.txt
0000000 C o l 1 bel C o l 2 bel C o l 3 nl V
6f43 316c 4307 6c6f 0732 6f43 336c 560a
0000020 a l 1 bel V a l 2 bel V a l 3 nl
6c61 0731 6156 326c 5607 6c61 0a33
Posted: Thu Oct 31, 2013 3:34 pm
by ray.wurlod
Ah, of course. BEL = "alert". Missed that nuance. So, back to the original question: when importing the metadata from the (target) sequential file, specify the field delimiter character as 007. It must have three digits when in decimal format.
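The equivalence is easy to check from the shell: `\a` and `\007` produce the same single byte, whose value is 7 in decimal, 007 in octal and 07 in hex. A small sketch, where `byte_as` is a hypothetical helper name:

```shell
# Hypothetical helper: print the byte produced by a printf escape in the
# requested od format (u1 = decimal, o1 = octal, x1 = hex), without padding.
byte_as() {
    printf "$1" | od -An -t "$2" | tr -d ' \t\n'
}

byte_as '\a'   u1; echo    # 7   (decimal)
byte_as '\007' u1; echo    # 7   (same byte, written as an octal escape)
byte_as '\a'   o1; echo    # 007 (octal)
byte_as '\a'   x1; echo    # 07  (hex)
```

Dumping one byte at a time also avoids the byte swapping seen in the `od -x` output, which prints 16-bit words.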
Posted: Thu Oct 31, 2013 3:40 pm
by mavrick21
When I type in &H07 (or &H7 or &H007) and click on preview, the value automatically changes to 007.
When I type in 007 and click on preview it stays 007.
Still doesn't work.
Posted: Thu Oct 31, 2013 5:34 pm
by ray.wurlod
I just tried it here, and 007 does work when the delimiter is BEL. Is this during Import > Table Definition > Sequential File?
Posted: Fri Nov 01, 2013 9:55 am
by mavrick21
Still doesn't work for me.
http://imgur.com/bDhY7TQ
I click on preview and everything shows up as just one field. I thought maybe Preview is buggy so I clicked on the next tab Define and still no success.
I'm working on DS 8.5 Server edition installed on RHEL ver 6.4 (64-bit).
Here are the NLS settings on the RHEL box:
Code:
$ echo $NLS_LANG
American_America.WE8ISO8859P1
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
Any other suggestions?