A different Requirement
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 29
- Joined: Tue Aug 14, 2012 1:06 pm
A different Requirement
Hi All,
I have a CSV file with only 1 row containing 10 columns .
all the columns have data type as Integer.
My requirement is to pass only the columns in the target CSV file having value not equal to zero.
I have a CSV file with only 1 row containing 10 columns .
all the columns have data type as Integer.
My requirement is to pass only the columns in the target CSV file having value not equal to zero.
I'm not quite sure what your requirements really are; but text files need to be read line-by-line, so the reader sequential stage will read the whole line with all of the columns. After that, one can selectively remove columns from the stream that have specific values. But how about column naming and should each row (assuming the file gets more than one row) have a different number of columns?
<a href=http://www.worldcommunitygrid.org/team/ ... TZ9H4CGVP1 target="WCGWin">
</a>
</a>
-
- Premium Member
- Posts: 29
- Joined: Tue Aug 14, 2012 1:06 pm
-
- Premium Member
- Posts: 29
- Joined: Tue Aug 14, 2012 1:06 pm
-
- Premium Member
- Posts: 376
- Joined: Sat Jan 07, 2012 12:25 pm
- Location: Piscataway
To build on Craig's answer, this is what you could do
Code: Select all
+----+ +----+ +------------+
| seq|--------------->|Xfm|--------------------------->|Col Import |------>Target
+----+ +----+ +------------+
Read as Use field function
1 column and check if value is 0.
If no, append to delimited
string else do nothing
Jerome
Data Integration Consultant at AWS
Connect With Me On LinkedIn
Life is really simple, but we insist on making it complicated.
Data Integration Consultant at AWS
Connect With Me On LinkedIn
Life is really simple, but we insist on making it complicated.
That's close to what I was thinking.
It doesn't need to be read as one column so much as written out that way. And I don't think the Column Import stage is needed. As the final step I would trim off the 'extra' trailing delimiter you'd get from concatenating the columns and delimiters together:
It doesn't need to be read as one column so much as written out that way. And I don't think the Column Import stage is needed. As the final step I would trim off the 'extra' trailing delimiter you'd get from concatenating the columns and delimiters together:
Code: Select all
A,B,C,D,E
0,5,4,0,1
A:
B: 5,
C: 5,4,
D: 5,4,
E: 5,4,1,
Last step: 5,4,1
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
You can try doing it in Unix (assuming that your environment is unix). an awk script can brilliantly come to your rescue. But if you need column names, that will be complex as the selection of columns is dynamic here.
You can alternatively use Craig's solution which has been discussed in one of my posts (link below):
viewtopic.php?p=443641#443641
You can alternatively use Craig's solution which has been discussed in one of my posts (link below):
viewtopic.php?p=443641#443641
I'm thinking you would handle the column names the same way... dynamically paired with the matching non-zero values. It would need to be written out as a row of data rather than automatically by the stage, so perhaps a horizontal pivot? Or a funnel to bring the two rows together.
-craig
"You can never have too many knives" -- Logan Nine Fingers
"You can never have too many knives" -- Logan Nine Fingers
-
- Participant
- Posts: 134
- Joined: Tue Jun 15, 2010 2:10 am
- Location: Bangalore
You can read the one row as a number of rows ( with only one column of type integer ) if your Record Delimiter property is set to Comma, in the transformer , pass only rows where integer data is not zero ( Using a transfomer constraint Or Filter stage Where clause) and write this to another file with record delimiter as a Comma .
For both file stages ( source and target) set the field delimiter and quote character to 'none'
To Access record Delimiter property , go to Format tab of the sequential file, and choose the Record Level property folder node, change the record delimiter from Unix Newline to Comma (,) .
For both file stages ( source and target) set the field delimiter and quote character to 'none'
To Access record Delimiter property , go to Format tab of the sequential file, and choose the Record Level property folder node, change the record delimiter from Unix Newline to Comma (,) .