
nameless folder appears, project corrupt?

Posted: Wed Jul 22, 2009 2:49 am
by telenet_bi
hi,

We have a 7.5 project which got corrupted. We cannot log on with Director or Manager (this works for other projects on the same machine). We can, however, still log on with Designer. When we do this we see a nameless folder directly under the jobs folder, which has a copy of the complete folder structure we used to have (and also of itself, so it looks like an infinite list).

When we run a job from the command line it still seems to run OK (we can't check warnings and logs, but the records in the DB look OK).

We already tried to delete a job from this new folder, and it deleted the job in both locations (the correct one and the one under the nameless folder). We also tried reindexing; it gave no errors, but didn't change anything either.

Has anyone seen this before?

Posted: Wed Jul 22, 2009 3:00 am
by ArndW
You are going to have to either delete the project and then restore a backup, or roll up your sleeves and be prepared to do a number of actions from the TCL command line level. By far the most preferable solution is to start from a restored backup and with a "clean" project. Do you have a backup or will you need to get dirty in the entrails of DataStage?

Posted: Wed Jul 22, 2009 3:19 am
by telenet_bi
We don't have a recent dsx backup, we do have a filesystem backup but that's always a risk, since this isn't done when the server is down.

We are currently creating a backup; from the client command line it does seem to work.

I would still like to try and fix it before reverting to the backup. Which object do you think would be corrupt?

we've tried:
-reindex all: no error, but also no change.
->UVFIXFILE DS_JOBS

Beginning TRACE of DS_JOBS.
TRACE of DS_JOBS completed.

Scanning free buffer chain.
Scan complete.

Scanning overflow buffers.
Scan complete.

201 group(s) processed.
257 group buffer(s) processed.
3971 record(s) processed.
Number of data bytes = 282000.

->COUNT DS_JOBOBJECTS

111169 records counted.

- DS.CHECKER gave no errors

Posted: Wed Jul 22, 2009 3:47 am
by ArndW
Since you are still at Version 7 the UNIX level backup would probably get you going again, unless the system was very busy with developers creating new jobs while the backup was running. Just stop DataStage (to ensure that no users can log in), rename the "bad" directory, restore it from backup and restart DataStage.

But if you wish to start repairing try the following (bolded text is input):

>SELECTFL

4785 record(s) selected to SELECT list #0.
>>SAVE.LIST AW

4785 record(s) SAVEd to SELECT list "AW".

Now you will have a file called ./\&SAVEDLISTS\&/AW, which will contain one DataStage logical hashed file name per line. Edit this file with your favorite editor. Add a new first line with just two letters, "PA", and globally replace all remaining lines so that they read "RESIZE {filename} * * *" then save the file and go back to TCL
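
The hand edit described above could also be scripted. What follows is only a minimal Python sketch, assuming the saved list holds one file name per line; the function name `build_resize_paragraph` is made up for illustration, and only the "PA" first line and the "RESIZE {filename} * * *" line format come from the post.

```python
def build_resize_paragraph(saved_list_text):
    """Turn a saved list (one file name per line) into paragraph text.

    The first output line is "PA"; every following line reads
    "RESIZE <filename> * * *", as described in the post.
    Blank input lines are skipped (an assumption, not from the post).
    """
    lines = ["PA"]
    for name in saved_list_text.splitlines():
        if name.strip():
            lines.append("RESIZE {} * * *".format(name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-entry saved list, just to show the shape of the result.
    sample = "DS_JOBS\nDS_JOBOBJECTS\n"
    print(build_resize_paragraph(sample), end="")
```

Reading the saved list file and writing the result back is left out on purpose; keeping a copy of the original list before overwriting it seems prudent.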

>EDIT.LIST AW
4785 lines long.

----: SAVE VOC AW
"AW" filed in file "VOC".
Bottom at line 4786.
----: QUIT
>COMO ON X
COMO X established 11:35:44 22 JUL 2009
>AW
(lots of output, you need to enter "n" once to suppress screen waits)
>COMO OFF
COMO completed. 11:44:07 22 JUL 2009
>DELETE VOC AW


1 records DELETEd.
>DELETE.LIST AW
Saved list "AW" DELETEd.
>


Now look at the UNIX sequential file ./\&COMO\&/X and see which files, if any, show blink or other errors that might indicate broken files.

Posted: Wed Jul 22, 2009 5:51 am
by telenet_bi
We tried the easy solution, restoring the backup, but it seems we only have a backup that already has this problem. So I did the steps you suggested.
The file produced has a lot of entries like these:
- Date/time stamp in file header has been modified!
- RESIZE: Invalid file name, sizing parameter, or option on command line.

Nothing with blink. What do these entries mean? (Or is this a result of the backup restore, and do we first need to reindex?)

Posted: Wed Jul 22, 2009 5:56 am
by ArndW
Both those messages are OK, the first one just means that nothing much was done to the file, the second means that the file is not a resizable hashed file (usually this refers to directories).

This means that your DataStage files are internally consistent, which is an important first step.
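
With several thousand files in the COMO output, this check could be scripted instead of read by eye. A rough Python sketch follows; the two harmless messages are copied from this thread, but the exact wording of a real blink error is not shown here, so the case-insensitive search for "blink" is an assumption, and `summarize_como` is a made-up name.

```python
from collections import Counter

# Messages the thread describes as harmless (copied from the posts).
BENIGN = (
    "Date/time stamp in file header has been modified!",
    "RESIZE: Invalid file name, sizing parameter, or option on command line.",
)


def summarize_como(lines):
    """Count the harmless messages and collect lines mentioning "blink".

    Returns (counts, suspects): counts maps each harmless message to the
    number of lines containing it; suspects lists the stripped lines that
    contain "blink" in any letter case.
    """
    counts = Counter()
    suspects = []
    for line in lines:
        text = line.strip()
        for message in BENIGN:
            if message in text:
                counts[message] += 1
        if "blink" in text.lower():
            suspects.append(text)
    return counts, suspects
```

An empty `suspects` list would match what was reported here: only the two harmless messages, nothing with blink.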

Next, ensure that no users are logged into the project and from TCL do a "DS.REINDEX ALL" to rebuild all secondary keys in the hopes that this is the only problem you have.

Posted: Wed Jul 22, 2009 6:49 am
by telenet_bi
We tried the DS.REINDEX ALL; this resulted in an empty project. All files are still there in the UNIX directory, but the list of jobs seen through a client is empty.

Posted: Wed Jul 22, 2009 7:05 am
by telenet_bi
Retried the reindex (after releasing locks) and now we have the job list back. However, we still have the same problem with the nameless folder.

Posted: Wed Jul 22, 2009 7:23 am
by Sainath.Srinivasan
Can you export all jobs from folders other than the empty folder?

Posted: Wed Jul 22, 2009 7:28 am
by telenet_bi
We can't open the Manager, but we tried exporting some jobs from a command line script (the script we had available needs a job list) and this seems to work.

We have now started this for the full project, and it will take some time (almost 4000 jobs in the project).
Once we have this we'll try to clear the project and reimport. If anyone has any faster ideas they are very welcome.

Posted: Wed Jul 22, 2009 7:40 am
by ArndW
If the file lists are empty after the DS.REINDEX ALL that means that you still have sessions or clients connected and thus the indices could not be rebuilt. Try it again. After it completes, enter "LIST DS_JOBS WITH JOBTYPE EQ 3 REQUIRE.INDEX" - if no jobs are listed then the indices are still not in order and you need to find and stop the rogue sessions.

Posted: Wed Jul 22, 2009 7:50 am
by telenet_bi
We did indeed run the reindex again and now see the list of jobs as it was (so with the same nameless folder). We still cannot connect with Director or Manager.
The command 'LIST DS_JOBS WITH JOBTYPE EQ 3 REQUIRE.INDEX' gives us a very long list, so I think this is OK.

Posted: Wed Jul 22, 2009 8:04 am
by ArndW
could you please post the output from the command " LIST DS_JOBS BY CATEGORY BREAK.ON CATEGORY FMT 64L DET.SUP TOTAL EVAL '1' FMT 4R"
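
As a rough illustration of what this query reports (records sorted and grouped by CATEGORY, detail lines suppressed, and a total of 1 per record, i.e. a job count per category), here is a Python analogue. It is not DataStage code; the function name and the sample job names are hypothetical.

```python
from collections import Counter


def count_by_category(jobs):
    """jobs: iterable of (job_name, category) pairs.

    Returns a list of (category, job_count) tuples sorted by category,
    which mirrors one output row per category break.
    """
    counts = Counter(category for _name, category in jobs)
    return sorted(counts.items())


if __name__ == "__main__":
    # Hypothetical sample: two jobs carry an empty category name.
    sample = [
        ("JobA", "0_Workflows"),
        ("JobB", ""),
        ("JobC", "0_Workflows"),
        ("JobD", ""),
    ]
    for category, total in count_by_category(sample):
        print(repr(category), total)
```

A category that is empty, or that consists only of characters the terminal cannot display, would show up as a row with a count but no visible label.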

Posted: Wed Jul 22, 2009 8:08 am
by telenet_bi
>LIST DS_JOBS BY CATEGORY BREAK.ON CATEGORY FMT 64L DET.SUP TOTAL EVAL '1' FMT 4R
1
LIST DS_JOBS BY CATEGORY BREAK.ON CATEGORY FMT 64L DET.SUP TOTAL EVAL "1" FMT 4R
04:07:33pm 22 Jul 2009 PAGE 1
Category........................................................ 1...

2
0_Workflows 3
0_Workflows\AVAYA 3
0_Workflows\Aggregates 4
0_Workflows\Arbor 47
0_Workflows\Arbor\E2E 5

...


temp - Iperform 3
waarborg 3
====
3967

3967 records listed.
"" not found.
"" not found.
"" not found.
"" not found.

Posted: Wed Jul 22, 2009 8:14 am
by ArndW
Ok, you can edit your previous post and remove a bunch of the details to make the thread more legible.

You have non-displayable characters in your DS_JOBS file. Let us try to localize where they are and if they can be removed.
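
To illustrate the general idea of exposing non-displayable characters, here is a small Python sketch that masks every unprintable character with a period so that it becomes visible. It is an analogue only: the function names are made up, and Python's notion of "printable" (`str.isprintable`) is not the same as the engine's, so results on real DS_JOBS keys may differ.

```python
def mask_unprintable(text, mask="."):
    """Replace each character Python considers unprintable with `mask`."""
    return "".join(ch if ch.isprintable() else mask for ch in text)


def has_unprintable(text):
    """True if the text contains at least one unprintable character."""
    return any(not ch.isprintable() for ch in text)


if __name__ == "__main__":
    # Hypothetical category value with an embedded control character.
    suspect = "0_Workflows\x01\\AVAYA"
    print(mask_unprintable(suspect))
    print(has_unprintable(suspect))
```

Comparing the masked value with the original shows exactly where the stray characters sit.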

What is the result of "LIST DS_JOBS ID.SUP EVAL 'OCONV(@ID,"MCP")' WITH CATEGORY UNLIKE \..."