The new world of Project Server 2007 and the architectural changes are catching a few of our customers out - and I thought I'd share a few tips and tricks for keeping the queue flowing - and some tips for getting things moving again if they appear to have stopped.
First I will point to a great TechNet article on the Queue (*** 12/18/2014 update - try this link now http://technet.microsoft.com/en-us/library/cc197395(v=office.12).aspx ***) and as you will all have read this then my explanations will make more sense :).
Under Server Settings in Project Web Access the Manage Queue option allows you to see what is happening in the project and timesheet queues - if you don't have admin access then the Personal Settings will give you a glimpse of your queue jobs. The latter option may not, however, give you the complete picture or let you see what might be ahead of you. It is like being stuck on the highway and not being able to see around the corner to where the flashing lights are...
So let's start with some definitions:
Waiting to be processed - means exactly what it says. Once I get to the front of the queue then I am ready to go. But there may be other active jobs ahead that will stop my job starting even if I am first in line. The queue is clever enough that it will hold jobs back if their processing would interfere with other running jobs. An example might be a publish job that will need to wait for a cube build to finish.
Processing - means that I made it to the front of the queue, was allocated a thread and am working away! One thing I have noticed is that the % complete indicator doesn't always make you think that "processing" is happening - but generally it is. Looking in the ULS logs, event logs or at general server activity (particularly the Microsoft.Office.Project.Server.Queuing.exe process) should help if you have continued doubts that processing is moving along.
Skipped for optimization - is the queue's way of telling you that it is not going to do the same thing twice. Some queue jobs have a payload (such as saving a project) and others are merely instructions (such as publish a project). If several of the same instruction are in the queue, then only one needs to be actioned. An example might be working on a project and publishing a few times during a period of time. If the queue was busy all of these jobs might be sitting waiting for a while - and then rather than doing each in turn it just needs to do one. It is just an instruction to publish the content of the saved project. This would not happen with a queue job that had a payload as each of these contains real data that needs to be applied - rather than just an instruction to do something with data somewhere else.
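To make the payload versus instruction distinction concrete, here is a minimal Python sketch of the idea. It is not how Project Server implements the queue: the Job class, the mark_skipped function and the choice to keep the most recent instruction are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    job_id: int
    project: str
    job_type: str                    # e.g. "save" or "publish"
    payload: Optional[str] = None    # None means a pure instruction
    state: str = "Waiting to be processed"


def mark_skipped(jobs: List[Job]) -> List[Job]:
    """Mark duplicate instruction-only jobs as skipped.

    Assumption for this sketch: the most recent instruction of each
    kind is the one that runs; earlier duplicates are skipped.
    Jobs with a payload are never skipped.
    """
    last_seen = {}
    for job in jobs:
        if job.payload is None:
            last_seen[(job.project, job.job_type)] = job.job_id
    for job in jobs:
        if job.payload is None and last_seen[(job.project, job.job_type)] != job.job_id:
            job.state = "Skipped for optimization"
    return jobs
```

With two saves and three publishes of the same project waiting, both saves stay in the queue because each carries real data, while only one of the publishes is left to run.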
Getting Queued - appears to be one of the more confusing messages. I mentioned above that some jobs, such as save project from Project Professional, have a payload. This payload goes into the queue as a group of related messages, which then get processed once they reach the front of the queue. Getting queued means that these messages are going into the queue. It is possible that the Getting Queued message appears for some time because a very large project is coming in across a very slow link. One other potential problem that can break things is if this flow in of messages does not complete. Perhaps the Project Manager saving the project shuts down Project before it completes - or perhaps goes out of wireless range midway through the process. Either way the Getting Queued could sit there for some time. To fix this up find the person who has this project in mid-save and get them to reconnect and complete the job. As a last resort you can cancel the Getting Queued - but YOU WILL LOSE DATA! Any changes the Project Manager made will not get saved. To protect you from inadvertently canceling one of these jobs we add a check box under Advanced options labeled "Cancel jobs getting enqueued" which will need to be checked before these jobs can be canceled.
Failed and Not Blocking correlation - is a failure that is isolated and not stopping any other jobs from processing. The term correlation is used to group related queue jobs together. There should be an associated error message and entries in the log to help explain the problem.
Failed and Blocking correlation - means that something bad happened that is also blocking other things in the related group. One example would be a save that fails, so the publish that follows it cannot continue.
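The difference between the two failure states can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the server's real logic: the dictionary layout and the runnable_jobs function are hypothetical.

```python
def runnable_jobs(jobs):
    """Return the ids of waiting jobs that are free to run.

    A job is held back only when another job in the same correlation
    (group of related jobs) is in the blocking failure state.
    """
    blocked = {
        job["correlation"]
        for job in jobs
        if job["state"] == "Failed and Blocking correlation"
    }
    return [
        job["id"]
        for job in jobs
        if job["state"] == "Waiting to be processed"
        and job["correlation"] not in blocked
    ]
```

A failed save in one correlation holds back the publish behind it, while a non-blocking failure in another correlation leaves the jobs behind it free to run.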
Success - is the one message we like to see! It can also be useful to sometimes show the Success messages (by default they are not shown in the Manage Queue display) as it is a way of seeing if the queue is working at all. Adding the completion state of Success through the options on the manage queue page is how this is done.
Canceled - means what it says. It could have been canceled by a user, but it is also possible for jobs to be canceled by the server. One example would be a failure early on in a save from Project Professional. A job would have been added to the queue for the save - but reconnection may lead to cancellation of this job and the addition of another save job - it really depends how far the save got before the problem. I simulate bad things like this by pulling my network cable out just after hitting save - just to see what happens!
I will follow up with another posting on the queue with some further tips on troubleshooting - but my parting gift is a guide to what the dialogs at the bottom of Project Professional 2007 mean during a save.
Excellent content here Brian. This site is often my first port of call.
I have a client who has created a Test Project Server environment by taking a backup from Production and restoring it.
Unfortunately their Job queue never gets processed in this Test environment. I can cancel jobs and add jobs (e.g. Saving a project) but all jobs end up as "Waiting to be processed"
I have been reading the MS article you have linked to above and been thinking about the following statement.
"Every job polling thread looks for jobs originating from a specific instance"
Does this mean the Queue Thread will not process any jobs because it is looking for jobs from the "Production" instance of PWA and not the "Test" instance?
The document at http://technet2.microsoft.com/Office/en-us/library/348d1f05-5cc6-41bb-be2c-ea28bbf1c3421033.mspx?mfr=true may help you understand what might be wrong. Possibly the queue service didn't start correctly. You should see 1 queue service process, plus another for each SSP that has PWA sites configured. So usually a minimum of 2. If you just see one then try re-starting the service. It can fail to start in some configurations if SQL Server is not accepting connections at the point the queue service starts - I often see this behavior in virtual or single server systems.
It should find any instance of pwa for processing - even in a system restored from another server.
We also have a problem with a stuck queue. After a brief look through the log we saw the following message:
[RDS] The RDS message will go to sleep because a RDB Refresh is in progress.
As we understand it, this says the Reporting Database is in the middle of a refresh, but that is not true. There are no jobs in the queue of type "Reporting Database Refresh"; all the stuck jobs have types beginning with "Reporting...", for example Reporting (Project Publish), Reporting (Resource Sync) and Reporting (Lookup Table Sync). Our PS2007 is installed in stand-alone mode (everything on 1 computer). Where could the problem be?
P.S. By the way Reporting DB is almost empty (no projects, no tasks etc).
I had a problem with Project just before this that caused me to kill the MS Project application and restart. Shortly after that, I re-started MS project, re-opened the project and then saved and published.
Then a mail came with the subject "Your queue job ReportingProjectPublish failed. Please contact your administrator for assistance". When I checked the queue, the job for the project showed the state Failed But Not Blocking Correlation. What causes this issue? What can I do to prevent it from happening? And what should I do with the queue job - retry or cancel?
I can't say what caused the crash of Project, but this could certainly upset the queue if something is in the middle of processing. I would try "retry" first, and if it still fails then just "cancel" this job and then re-publish the project and see if all goes through the next time.
Sorry for the delayed response. The reporting database refresh can take some time - and certainly if for some reason it failed (such as rebooting the server) then it may never finish but still think it is running. I would suggest opening a support incident to dig deeper into this issue. For options see http://support.microsoft.com.
My environment is a 'single server' installation. Everything was going smoothly initially (saving, publishing, cube generation, etc.), but now all jobs go into the 'waiting to be processed' state and never get beyond it.
It doesn't give an error or warning message anywhere. I have checked all the log files. Can anyone shed some light on this issue?
First stop is to check your queue service is running - and it isn't enough to just look in services (but that is a good place to start). If you open Task Manager on your application server you should see 2 processes for queue and 2 for event service (or more if you have more than 1 SSP). If you just see one try restarting both services - they may have timed out contacting SQL Server when last started. Also if SQL has been down and up they may need re-starting just to re-gain the connection (this last piece is fixed in SP1).
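If you would rather script this check than eyeball Task Manager, something along these lines could work. It is a rough Python sketch, not a supported tool: the function names are made up for illustration, and it assumes you pass in the text produced by running tasklist /FO CSV /NH on the application server.

```python
import csv
import io

# Queue service process name as mentioned earlier in this post.
QUEUE_IMAGE = "Microsoft.Office.Project.Server.Queuing.exe"


def count_processes(tasklist_csv: str, image_name: str) -> int:
    """Count rows in 'tasklist /FO CSV /NH' output whose image name matches."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return sum(
        1 for row in reader
        if row and row[0].lower() == image_name.lower()
    )


def expected_count(ssp_count: int) -> int:
    """One parent process plus one for each SSP that has PWA sites."""
    return 1 + ssp_count


def queue_looks_healthy(tasklist_csv: str, ssp_count: int) -> bool:
    """True when at least the expected number of queue processes is present."""
    return count_processes(tasklist_csv, QUEUE_IMAGE) >= expected_count(ssp_count)
```

If this returns False for a server with one SSP, you are probably in the "only one process" situation described above and a restart of the services is worth trying. The same count could be run for the event service by passing its process name to count_processes.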
Let me know how this goes.
Not sure if this is a red herring for my issue but I get 'Skipped For Optimization' in the Queue whenever I do the below:
My issue is that approved tasks are not updating the project plan. If I create an Activity and assign a task to a team member, the user changes the percentage complete for the task and the task gets approved, but we are not seeing the update on the Activity. The 'Preview' shows the update to the Activity Plan, but the actual Activity does not get updated. The only thing I see logged is 'Skipped For Optimization' in the queue.
Any ideas anyone?
Sorry to waste your time Brian, the Activities we were trying to update were checked out, so not receiving the task updates. You can cancel my last blog comment.
Keep up the good work, your site is invaluable!
I've kept your comments on here so that the answer you found might help someone else. Generally the skipped for optimization means that there is another job in the queue that will do the same thing - so it doesn't need to process both. So for every skipped for optimization I would expect to see a success at around the same time.
Thanks for quick response and I am sorry for this delay from my side !
I observed, under 'Process' tab of 'Task Manager' only 1 process for queue and 1 process for event service are running on the application server. I have tried restarting both the services but with no difference.
While I try 'New Task' option to start any of those services from the 'Task Manager '(Applications), it throws a message saying "Cannot start services from command line or a debugger. A Windows Service must be installed and then started with the ServerExplorer, Win services Admin tool or the Net Start command"
How do I get 2 processes running for each of the queue and event services? Is there a Windows service missing on the server? Please help.
I would look in the ULS logs. It should be giving some indication there of why the second instance isn't getting started. You cannot start this manually. When you start the service the service itself will check the database to see which SSPs have Project instances provisioned and will start this 2nd instance (and 3rd, 4th if necessary). If it has a problem I would expect this to get logged. As another test you could try creating another SSP and another PWA site.
I have a job with "Failed and not blocking correlation" but it seems that in fact it does block any new job...
How can I cancel or delete this queued job?
If it says it isn't blocking then this is usually very reliable. Take a look for other, perhaps older jobs. You should always use Retry before thinking about cancelling - and also you do not say what type of job it is - so I wouldn't like to give guidance without more details.