Your mileage may vary, but for us, Windows Update does not appear to be a defining factor. We have RU on Windows XP and Windows 7 machines in manufacturing that are not updated and have no general internet access, and these still suffer from this issue.
Rebooting after a Windows update may appear to trigger the issue, but in our case rebooting without any updates being applied also triggers it.
The only workaround we have been able to make work is the service refresh mentioned earlier in this thread. I have not noticed the problem being worse on the last couple of RU versions, but I cannot see an improvement either.
I initially thought I had this solved with my 2 tasks [1 to start the service upon startup and a second running every hour]. However, this was not totally effective in keeping all hosts online.
I have had to go a step further and add a task that performs a stop service then start service at 4am every day. Without this task, I was finding that some [not all] hosts would go grey and never come back online. The service itself was running, but the host was just offline. It does not occur every day and seems very random. The best theory I have is that if the host loses connectivity to the RU server for an [as yet unknown] extended amount of time, it gives up trying and never reconnects. Maybe there is a connection retry limit in the hosts somewhere.
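For anyone wanting to reproduce the daily stop/start, here is a rough Python sketch of that step. It is not my actual script, just an illustration. The service name `RManService` is an assumption, so check the real name under the service's Properties in services.msc, and the `run` parameter only exists so the function can be tried without touching a real service.

```python
import subprocess

# Assumption: the registered name of the host service. Verify it in
# services.msc (Properties > "Service name") before relying on this.
SERVICE_NAME = "RManService"


def restart_service(name=SERVICE_NAME, run=subprocess.run):
    """Stop then start a Windows service; True if the start succeeded."""
    # "net stop" normally waits until the service has stopped. A non-zero
    # exit usually just means it was not running, so it is ignored here.
    run(["net", "stop", name], capture_output=True, text=True)
    started = run(["net", "start", name], capture_output=True, text=True)
    return started.returncode == 0
```

Calling `restart_service()` from an elevated prompt or a task running as SYSTEM is the equivalent of restarting the service manually.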
The 4am daily restart so far appears to be dealing with this issue for me, and as I typically do not need overnight sessions to stay active, it does not have any negative impact here. However, my gripe is that if the remote host restarts while I have left a session running, the viewer window at my end shows the host as offline but does not close. When I then have to close the viewer windows, each one takes about 5-8 seconds to close, which is frustrating.
I have only implemented these recurring tasks [I call it the RUM; Remote Utilities Monitor] on a small number of our hosts, but overall, the reliability has improved significantly on these, whilst the non-RUM'd machines are often problematic.
It would be great if RU could have its own watchdog service built in, so we would not have to implement our own.
In my case, the issue is not the firewall, sleep settings or connection interruptions. It is an RU issue. If it were not, it would be difficult to explain how it happens on dozens of machines, all in different geographical locations, on different internet provider connections, with different hardware, with different antivirus, connected through different routers, with varying sleep/no sleep settings.
I have resorted to 3 scheduled tasks: one that checks on startup that the service is running, a second that checks again hourly, and a third that restarts the service each day at 4am. This has made the issue less of a headache, but it does still exist.
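As a rough illustration of that schedule, the sketch below builds `schtasks` command lines for the three tasks. The task names, the script path and the service name are all placeholders I have made up for the example, and the exact `schtasks` options (the `/ST` time format in particular) differ between Windows versions, so check `schtasks /Create /?` on the target machine first.

```python
# Hypothetical values for illustration only; adjust to your own setup.
MONITOR_SCRIPT = r"C:\RUM\RUMonitor.cmd"  # path to the check-and-start script
SERVICE_NAME = "RManService"              # assumption: verify in services.msc


def build_tasks(script=MONITOR_SCRIPT, service=SERVICE_NAME):
    """Return schtasks argument lists for the startup, hourly and 4am tasks."""
    base = ["schtasks", "/Create", "/RU", "SYSTEM"]
    # cmd /c runs the rest of the line, so "&" chains stop and start.
    restart = f"cmd /c net stop {service} & net start {service}"
    return [
        base + ["/TN", "RUM Startup", "/SC", "ONSTART", "/TR", script],
        base + ["/TN", "RUM Hourly", "/SC", "HOURLY", "/TR", script],
        base + ["/TN", "RUM Daily Restart", "/SC", "DAILY",
                "/ST", "04:00", "/TR", restart],
    ]
```

Each list can be passed to `subprocess.run(...)` from an elevated session, or printed and pasted into an installer script.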
There is another issue that accompanies this for us. I am unsure if it is part of the same problem, and it is less common than the startup problem.
We notice it on hosts that run for long periods of time without reboots. Sometimes the host will go offline in the viewer, and the icon on the host machine will turn grey. Attached are screenshots from 1 such host currently. In this case, the RU service is running. If I manually restart the service, the host will come back online, and then I can connect. But it will not recover on its own. We do not have logging enabled on the host machines, so I do not have any information to accompany this.
The timeout error is interesting. If the host is trying to start and some dependency service is not yet running, I guess that is a possible source for the issue. In my case, though, I have previously tried setting the service recovery options to Restart the Service for 1st, 2nd and subsequent failures, and this was not effective in resolving the problem. Only a manual start request ever seemed to work.
I don't know that I will be able to provide this information. I do not always realise that the host is offline right away. We have almost 500 hosts and we do not log into them all every day. I will keep an eye out and see if I can catch a host going offline after a reboot.
Something I have found with the host service is that even if I switch the service recovery settings to Restart the Service on first, second and subsequent failures, this has not been effective in getting the service online after a reboot. It seems that if the service does not start on boot, only starting it manually will bring it back online.
I am surprised you cannot replicate this issue. I would say it happens on most of our client hosts quite regularly.
...I just re-read your post and realised you are not using the script. I am unsure if there will be any effect of running the start command when the service is already running. Probably not, but I guess you will know for sure after it runs for a few days.
The antivirus is probably picking up the script as an unknown and untrusted program. I would say that it quarantined the RUMonitor.cmd file, which is why it did not work for you initially.
Your way will still work fine. I needed a way to quickly deploy it to machines, which is why I created the installer. Creating an exclusion in the antivirus will probably solve this for you if you wish to do that.
You should see no adverse effects in running this, as the script executes no commands when the RU service is already started. It will only issue a start command if the status returned is anything other than running.
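To show that logic in a readable form, here is a minimal Python sketch of the same check. It is not the contents of RUMonitor.cmd, just an equivalent illustration. It assumes the service is named `RManService` (verify this in services.msc) and that `sc query` prints an English `STATE : 4  RUNNING` style line. The `run` parameter is only there so the function can be tried without a real service.

```python
import re
import subprocess

SERVICE_NAME = "RManService"  # assumption: verify the real service name


def is_running(sc_output):
    """True if `sc query` output contains a STATE line reporting RUNNING."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return bool(match) and match.group(1) == "RUNNING"


def ensure_running(name=SERVICE_NAME, run=subprocess.run):
    """Issue a start only when the service is not running; report the action."""
    status = run(["sc", "query", name], capture_output=True, text=True)
    if is_running(status.stdout):
        return "already running"  # nothing is executed in this case
    run(["sc", "start", name], capture_output=True, text=True)
    return "start issued"
```

Because the start command is only sent when the state is something other than RUNNING, running this every hour against a healthy host does nothing.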