
ETL development standards, best practices, guidelines, etc.

Posted: Sat Aug 20, 2016 8:39 am
by anamika
Hello,
I have been asked to take on the task of evolving ETL architecture, design and development standards, specifically using DataStage and the InfoSphere suite of tools.
I have started gathering and reviewing documents and presentations and searching the web.
I look to this group for suggestions, comments and ideas regarding the above.
Thank you for reading this post and for any contributions.

/A

Posted: Tue Aug 23, 2016 9:45 am
by qt_ky
Search for the IBM Redbook "IBM InfoSphere DataStage Parallel Framework Standard Practices". It has several chapters and appendices on these topics.

Posted: Wed Aug 24, 2016 8:32 am
by leandrohmvieira
Is there a similar book for Server Job development?

Posted: Wed Aug 24, 2016 11:49 am
by chulett
Not that I recall on 'standards' and such. There has always been a Server Job Developer's Guide that shipped with the product, or the online documentation that you can find here.

Posted: Wed Aug 24, 2016 12:22 pm
by anamika
Thanks, everybody, for pointing out those links.
Yes, I have been reading up on lots of DS documentation, IBM Redbooks, general web searches and the like. Please do post anything else you think is relevant.

Received some private messages as well.

Thanks

Posted: Sun Aug 28, 2016 12:32 pm
by PaulVL
Here's another non-documented practice you should instill in your developers right from the start.

Create Diagnostic jobs for your database connectivity.


Connector Stage -> Peek


select * from table_x WHERE 1=2


You will save so much time with just that stupid little job, which uses the same parameters as your real ETL job. All it does is validate connectivity. You would never run it in a daily or nightly flow. It would only be used if you change your connectivity credentials, or if your environment got hosed and you simply want a ping test to your database.

Executing your suite of diagnostic jobs the day before your big GO LIVE roll-out... gold.
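
To see why the WHERE 1=2 query works as a "ping", here is a minimal sketch outside DataStage. It uses Python's built-in sqlite3 module and an in-memory database purely as a stand-in for the real target database; the check_connectivity helper and the table contents are made up for illustration. The point is that the query returns no rows, yet the database still has to accept the connection, parse the SQL and resolve the table, so bad credentials or a missing table would fail right away.

```python
import sqlite3


def check_connectivity(conn, table):
    """Run the zero-row probe query and return (column names, rows)."""
    # WHERE 1=2 is never true, so no data is transferred, but the database
    # must still parse the statement and resolve the table and its columns.
    # The table name is interpolated directly, so pass trusted names only.
    cur = conn.execute(f"select * from {table} WHERE 1=2")
    rows = cur.fetchall()
    columns = [col[0] for col in cur.description]
    return columns, rows


# Stand-in for the real database: an in-memory SQLite instance (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table table_x (id integer, name text)")
conn.execute("insert into table_x values (1, 'a')")

columns, rows = check_connectivity(conn, "table_x")
print(columns, rows)
```

Pointing the same helper at a table that does not exist raises sqlite3.OperationalError, which is the kind of early failure the diagnostic job is meant to surface.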

Posted: Fri Sep 02, 2016 3:46 am
by ray.wurlod
... or run your real jobs in Validate mode.
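
If you want to script that, something along these lines might work. This is only a sketch: it assumes the dsjob command-line client is installed and on the PATH, that you are already authenticated to the engine, and that your version accepts "-mode VALIDATE" on "-run" (check the dsjob documentation for your release). The project name, job name and parameter shown are hypothetical.

```python
import subprocess


def validate_command(project, job, params=None):
    """Build the dsjob argument list for a Validate-mode run of one job."""
    # Assumed dsjob options: -run, -mode VALIDATE, -jobstatus, -param name=value.
    argv = ["dsjob", "-run", "-mode", "VALIDATE", "-jobstatus"]
    for name, value in (params or {}).items():
        argv += ["-param", f"{name}={value}"]
    argv += [project, job]
    return argv


def run_validation(project, job, params=None):
    """Run the validation and return dsjob's raw exit code."""
    # Only call this on a machine where dsjob is actually available.
    # How to interpret the exit code is left to the dsjob documentation.
    return subprocess.run(validate_command(project, job, params)).returncode


# Hypothetical names, shown without executing anything.
cmd = validate_command("MyProject", "LoadCustomers", {"DB_USER": "etl"})
print(" ".join(cmd))
```

Looping run_validation over your list of real jobs gives a pre-go-live check similar to the diagnostic suite, without maintaining separate jobs.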