kgrube Posted November 3, 2016 (edited)

There's a current known issue where agents will continue to check in if they are online, but never execute any commands. This can cause the script queue to quickly fill up because the script engine will never clear out the scripts on these machines. Left long enough, no scripts will be able to run until the script queue is cleared out. Machines exhibiting this behavior can be identified because all commands on the Commands tab will be listed as executing.

First, clear the script queue to allow new scripts to run. Either restart the database agent to clear all scripts, or delete from runningscripts manually.

There are two routes to go down now:
- Do a one-off fix through ScreenConnect
- Create autofixes

One-off through ScreenConnect:
1. (Optional) Delete from pendingcommands to clear queued commands.
2. Either run this on all your machines (restarting every agent) or create a new session group limited to machines identified as 'stuck'.
3. Log into the ScreenConnect web interface
4. Select machines
5. Right click -> Run command
6. Paste in this one-liner:

    net stop ltsvcmon & taskkill /im ltsvcmon.exe /f & taskkill /im ltsvc.exe /f & taskkill /im lttray.exe /f & net start ltsvcmon & net start ltservice

This will stop the service monitor, kill off LTSvcMon.exe, LTSvc.exe and LTTray.exe, then try to restart everything.

Internal monitor/autofix overview:
- Create internal monitor
- Create Autofix, call from internal monitor

Create an internal RAWSQL monitor. Note: change the limit on the last line to whatever allows this query to run on your system.

    SELECT DISTINCT
        computers.computerid AS `TestValue`,
        computers.ComputerID AS `IdentityField`,
        computers.computerid AS `ComputerID`,
        acd.NoAlerts,
        computers.domain,
        acd.UpTimeStart,
        acd.UpTimeEnd
    FROM commands
    JOIN computers ON computers.ComputerID = commands.ComputerID
    LEFT JOIN AgentComputerData acd ON computers.computerid = acd.computerid
    WHERE commands.Status = 2
        AND commands.DateUpdated < DATE_ADD(NOW(), INTERVAL -15 MINUTE)
        AND commands.DateUpdated > DATE_ADD(NOW(), INTERVAL -60 MINUTE)
        AND Computers.ComputerID IN (
            SELECT ComputerID FROM commands
            WHERE status = 2
            GROUP BY commands.ComputerID
            HAVING Count(*) > 2)
        AND Computers.LastContact > DATE_ADD(NOW(), INTERVAL -15 MINUTE)
        AND computers.os NOT IN ('Darwin', 'Linux')
    LIMIT 0, 50

- Set Table to Check and Field to Check to `RAWSQL`.
- Set Check Condition to GreaterThan
- Set Result to 0
- Set Identity Field to `computers.computerid`
- Paste the code block above into Additional Condition

Create an autofix script:
1. Import the attached script files
2. Put psexec.exe in L:\Transfer\Apps\toolkit (or edit the script to point to where it already exists on your L: drive)
3. Edit lines 5/6 for your info categories for the types of tickets this will make
4. Edit line 22 with the script id of this script in your system.
5. Make sure the last line in the script points to the CTS-Autoclose-Update-Generic function script (I don't know how these script exports work)
6. Add the script to the internal monitor

Overview of how the autofix works:
1. Check that the machine is checking in
2. See if the machine's already been fixed by injecting a FasTalk command into the `commands` table (script execution will pause if you try to run an actual command and the script will never complete)
3. Delete all non-complete commands for this machine from `commands`
4. Delete all running scripts (except this one) for this machine from `runningscripts`
5. Delete all pending scripts for this machine from `pendingscripts`
6. Delete all remote monitors for this machine from `agents`
7. Find another machine at this machine's location that is not having the problem
8. Run PSExec commands from this other machine to restart the agent on the original machine
9. Inject another FasTalk command into the `commands` table
10. Create/update/close ticket based on the value of the @status@ variable.

As always, use this at your own risk. If you have any feedback for me, please let me know. I'm new at this 'exporting scripts' thing, so it's probably borked.

Script contents: lts.zip

Edited November 10, 2016 by Guest
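
For anyone trying to follow what the monitor query actually flags, here is a rough Python restatement of its WHERE clause as a per-machine check. It is only a reading aid, not part of the attached script: the function name `is_stuck` and the tuple layout are made up for illustration, and only the thresholds (status 2, more than two executing commands, the 15 and 60 minute windows, the OS exclusion) come from the query above.

```python
from datetime import datetime, timedelta

EXECUTING = 2  # commands.Status value the monitor query filters on


def is_stuck(commands, last_contact, os_name, now):
    """Restate the monitor's WHERE clause for a single computer.

    commands: list of (status, date_updated) tuples (hypothetical layout).
    """
    executing = [updated for status, updated in commands if status == EXECUTING]
    # HAVING Count(*) > 2: more than two commands sitting in the executing state
    if len(executing) <= 2:
        return False
    # At least one executing command last updated between 60 and 15 minutes ago
    in_window = any(
        now - timedelta(minutes=60) < updated < now - timedelta(minutes=15)
        for updated in executing
    )
    # LastContact within 15 minutes: the agent is checking in but not working
    checked_in = last_contact > now - timedelta(minutes=15)
    return in_window and checked_in and os_name not in ("Darwin", "Linux")


if __name__ == "__main__":
    now = datetime(2016, 11, 3, 12, 0)
    cmds = [(2, now - timedelta(minutes=m)) for m in (5, 20, 90)]
    print(is_stuck(cmds, now - timedelta(minutes=1), "Microsoft Windows 10", now))
```

The point of the LastContact condition is that a machine which has simply gone offline is not treated as stuck; only agents that still check in while their commands sit at executing are picked up.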
Darrell_Null Posted November 3, 2016

The problem we ran into with this is that the commands to fix the first machine get stuck on the second machine, which is then waiting for a third machine to be dispatched to fix it. That deletes the commands on the second machine, leaving the first machine and possibly the second machine broken. It is possibly a never-ending cycle until you run out of machines at that site to try to fix the other ones.
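
One possible guard against this cascade is to refuse to use any machine the monitor already flags as stuck as the helper, and to give up when no clean helper is left at the site. The sketch below is only an illustration of that idea. The names `pick_helper`, `machines` and `stuck_ids` and the dictionary layout are hypothetical and are not taken from the attached script, and it does nothing for a helper that becomes stuck after it has been chosen.

```python
def pick_helper(target_id, machines, stuck_ids):
    """Return the ID of an online, non-stuck machine at the target's location, or None.

    machines: {computer_id: {"location": ..., "online": bool}} (hypothetical layout)
    stuck_ids: set of computer IDs currently flagged by the monitor
    """
    location = machines[target_id]["location"]
    for machine_id, info in sorted(machines.items()):
        # Never use the broken machine itself or any other flagged machine
        if machine_id == target_id or machine_id in stuck_ids:
            continue
        if info["location"] == location and info["online"]:
            return machine_id
    # No safe helper: better to stop here than to queue commands on a stuck agent
    return None


if __name__ == "__main__":
    machines = {
        10: {"location": 1, "online": True},
        11: {"location": 1, "online": True},
        12: {"location": 1, "online": True},
    }
    print(pick_helper(10, machines, stuck_ids={10, 11}))
```

Returning None instead of falling back to a flagged machine means the ticket stays open for manual attention rather than queuing more commands that will never run.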
jwontorcik Posted November 4, 2016

We have been seeing a lot of stuck commands on our servers. For some reason monitors are getting stuck at removing and installing. So we may have to give this a try.
dsinton44 Posted November 4, 2016

Good work.
tlinman67 Posted November 4, 2016

Been working with LT support on this issue for over a week. It is a known issue across multiple partners. Their workaround was the one-off through ScreenConnect as mentioned above. When you have 3700 agents, it's a pain.
jwontorcik Posted November 4, 2016

I've been working with LT support for about a week too. Just found out it's a known issue today. I thought it was just us having the issue. We have 5200 agents and we get about a dozen a day with stuck commands. It's getting old quick.
kgrube (Author) Posted November 4, 2016

In reply to jwontorcik ("We have 5200 agents and we get about a dozen a day with stuck commands."):

We have a similar number of agents and are seeing around 500 after a day or two of doing nothing, which is why I wrote this script. This script has a pretty decent success rate, but anything that blocks PSExec will cause it to fail.
bigdessert Posted November 7, 2016

First, thanks kgrube for the excellent write-up. I made some modifications and also created a ScreenConnect plugin to assist with this and other possible tasks.

Create an autofix script
1. Upload the ScreenConnect plugin to the App_Extensions folder on your ScreenConnect server.
2. Enable the plugin in ScreenConnect administration.
3. Edit the plugin file service.aspx and change the value of the key, as this is the only security (like a password, in very loose terms)
4. Import the attached script files
5. Edit lines 5/6 for your info categories for the types of tickets this will make
6. Edit line 26 with the script id of this script in your system.
7. Edit line 39 with the id of your LabTech server (usually 1 but can be a different computerid)
8. Edit line 42 to match the key in your ScreenConnect key variable in the service.aspx file.
9. Make sure the last line in the script points to the CTS-Autoclose-Update-Generic function script (I don't know how these script exports work)
10. Add the script to the internal monitor

Autofix changes: Run the ScreenConnect command on the machine from ScreenConnect instead of using PSExec and a machine on the customer network.

The ScreenConnect plugin: the plugin currently contains two functions. You can target these with HTTP GET requests.

Function ExecuteCommand(): This will take a variable data (URL-encoded) and send the command to the ScreenConnect guest that matches sessionID.
To use:
    https://yourscreenconnecturl:port/App_Extensions/8e78224d-79db-4dbb-b62a-833276b46c6e/Service.ashx/ExecuteCommand?key=aljgdlkajglkjalksjgdl&sessionID=88b7b298-3664-4670-b1ff-bbb61843cc07&data=ipconfig

Function IsOnline(): This will take a sessionID and let you know if the guest is connected to ScreenConnect or not. Returns 1 if connected, 0 if not.
To use:
    https://yourscreenconnecturl:port/App_Extensions/8e78224d-79db-4dbb-b62a-833276b46c6e/Service.ashx/isOnline?key=aljgdlkajglkjalksjgdl&sessionID=88b7b298-3664-4670-b1ff-bbb61843cc07

Labtech Script.zip
Labtech Helper - ScreenConnect Plugin.zip
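
Since the data parameter has to be URL-encoded, anything longer than `ipconfig` (the restart one-liner, for example, with its spaces, `&` and `/`) needs escaping before it goes into the GET request. Below is a minimal sketch of building the two URLs with Python's standard library. The helper name `plugin_url` and the host `sc.example.com:8040` are placeholders of mine; the path layout and the parameter names key, sessionID and data are copied from the example URLs in the post. How the plugin decodes the value on the server side is not something I have verified.

```python
from urllib.parse import quote, urlencode


def plugin_url(base_url, extension_id, function, key, session_id, data=None):
    """Build a GET URL for the plugin (hypothetical helper, not part of the plugin)."""
    params = {"key": key, "sessionID": session_id}
    if data is not None:
        params["data"] = data
    # quote_via=quote encodes spaces as %20 and '&' as %26, so the command
    # cannot be mistaken for extra query-string parameters
    query = urlencode(params, quote_via=quote)
    return f"{base_url}/App_Extensions/{extension_id}/Service.ashx/{function}?{query}"


if __name__ == "__main__":
    url = plugin_url(
        "https://sc.example.com:8040",           # placeholder host and port
        "8e78224d-79db-4dbb-b62a-833276b46c6e",  # extension folder from the post
        "ExecuteCommand",
        "your-key-here",                          # must match the key in the plugin file
        "88b7b298-3664-4670-b1ff-bbb61843cc07",
        "net stop ltsvcmon & net start ltsvcmon",
    )
    print(url)
```

Because the key travels in the query string of a GET request, it is worth treating it like a password that can end up in web server logs, and changing it from the default before exposing the plugin.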
kgrube (Author) Posted November 7, 2016

Thanks datacomm, this is sweet. Much higher reliability than PSExec alone.
dsinton44 Posted November 8, 2016

I've had the same issue working with LT support. It looks like it may be an issue with Webroot causing CMD commands to hang. The script utilizing ScreenConnect looks very interesting.
Darrell_Null Posted November 8, 2016

We are having trouble finding the service.aspx file in the plug-in. Can you provide us the path to where it is located? We looked in the zip file and do not see it in there.
bigdessert Posted November 8, 2016

Sorry, it was service.ashx. It should be in the folder named with the GUID string.
Darrell_Null Posted November 8, 2016

Ok. Found it. Thanks for the quick response.
vkent39 Posted November 8, 2016

I keep getting this error on our test systems. Any ideas?

    ERROR: The process "ltsvcmon.exe" not found.

Also, what has everyone found to be a good number of commands a machine can handle before it locks up? We are seeing this happen on almost all servers and probably 25% of all workstations.
helpdesk@envisionITP Posted November 9, 2016

I reported this issue to Labtech on the 27th of October and to Webroot on the 1st of November. Webroot is actually suspending other processes as well. We think it has been causing issues with terminal servers and connecting to the server. Labtech tells me they are in contact with Webroot. Webroot tells me that only two clients have reported the issue. If you are having this issue, please report it to Webroot. If we do not all tell Webroot to work on the issue, it is going to continue. Manually processing stuck commands is not what we are paying Labtech or Webroot for.

I started to create a script to look for the stuck processes on systems, because some systems have stuck processes but are not backlogging anything currently.
jwontorcik Posted November 10, 2016

I have reported it to LT support and not Webroot. I was told LT development is working on a fix. So LT is telling you Webroot is causing these issues? Are you running 10.5 or 11? We are on 10.5 patch 8.
helpdesk@envisionITP Posted November 10, 2016

Labtech is reporting on my ticket that they are working with Webroot, as it appears to only affect Labtech partners with Webroot. I have clients that are using their own AV (Symantec, Vipre) and we do not have the issue with them. Up until late today, Webroot was telling me that only one other Webroot client had reported the issue to them. (We used Vipre before Webroot, and Webroot detected things Vipre never saw.) Our next step is to start looking at Bitdefender if Webroot cannot get this issue resolved soon.

From our findings, processes from Labtech, and some that have nothing to do with Labtech, are getting suspended. Until you turn off Webroot or uninstall it you cannot delete these suspended processes; the attempt comes back as Access denied. If you do kill the parent process, it orphans the suspended process and those start to build up on the system. We are concerned that this is the reason we had 2 different Terminal Servers stop allowing connections until rebooted. We have not been able to isolate that issue, but once we started to clean up the suspended processes the issue started to diminish.

Late this afternoon I got an email from the escalation tech at Webroot saying that they are working on the issue with Labtech. The workaround is to uninstall Webroot, reboot, set up a policy to turn off automatic updates, and install a previous version with the no-update switch.

We are currently on 11 in a cloud instance with Labtech. The issue we have currently is finding the systems that have suspended processes before they queue up a bunch of commands. Some of these commands are scheduled reboots that are getting missed.

I can appreciate the complexity of the issue and the time it takes to get a solution in place, but the lack of communication from both Webroot and Labtech is frustrating. I know that as an MSP, if we communicated with our clients in the same way, we would be losing clients. Unfortunately I have found that this is true with most RMM providers. We came from Kaseya and they were much worse than Labtech. Labtech does an excellent job on several layers, but these little nagging issues that require manual steps to get the product to work right are counterproductive, and Labtech management just does not care, at least in my experience with them and in trying to get my concerns heard by high-level management.
Theuns Posted November 10, 2016

Hi

I want to add our experience with Webroot over the last 2 months across ± 900 seats.

I can confirm that Webroot breaks the Terminal Server logon process on many of our TS servers. We went through many upgrades, policy changes etc. and it is still happening. You may have received the email this morning about setting the ‘Self Protection’ to low. This leads to another shortcoming in the portal: you cannot apply changes globally across sites. If you create a new policy you have to go to each site, search for servers and then apply the new policy. I feel really sorry for anyone with lots of sites who wants to implement something new.

We had Webroot blue screen many workstations using wireless keyboards.

We had to re-register Revit after installing WR, as it breaks the registration, so architects and related users cannot work after a WR install. Revit even has a KB about this issue.

If you use Xero Accounting's bank statement import, it will not work with FNB in South Africa. I closed the request at WR after weeks of no feedback.

We suspect that WR is also breaking some versions of Sage (Pastel Accounting). We have numerous reports from clients pointing in this direction.

The worst thing is that we had 2 ransomware infections this week where WR was installed, running, up to date etc. and it still did not prevent them from happening. WR could not ‘roll back the changes’ as is promised everywhere, and the files were lost or had to be recovered from backup.

There are other less prominent problems with using this product.
1. Reporting is very limited
2. We had quite a few issues with websites being blocked and an inability to control this
3. The unclear policies regarding what data is being collected and stored in US data centers
4. Etc.

I wish we had the above information 2 months ago. Perhaps our experience will help someone else decide whether to implement WR or not.
helpdesk@envisionITP Posted November 10, 2016

Individually, as MSPs and large IT departments, we tend to get brushed off as if it is an isolated issue. We continue to find a "work around" for situations that are pushed upon us because there are not enough voices calling out to demand a fix. I would like to figure out a way we can help each other provide a unified voice to these providers to get action quicker. I have a list of items that have been reported to Labtech in particular that still have not been fixed. For instance, has anyone noticed that the search for online systems in the advanced search returns systems that have not been online for more than a month?

This forum is a good place to share information, but I doubt it has any impact on Labtech or Webroot. Even the Labtech Forum seems to have no impact on Labtech.

I would still like to stress that the Labtech product is, in my opinion, better than the others that we looked at, but their support and SLA are right in line with the rest: poor and not adhered to.
GorillaBiscuit Posted November 10, 2016

Can I change the 'value of the key' to anything I want?
helpdesk@envisionITP Posted November 10, 2016

Just found a server where Webroot suspended an Azure function for the Office 365 DirSync. Watch for that.
helpdesk@envisionITP Posted November 10, 2016

Follow-up on the suspended Azure command: I went to send Webroot the logs with their wsalogs.exe program, and Webroot suspended that process. I have also reported this to Webroot several times.
brentn Posted November 10, 2016

Will have to give these work-around scripts a try - thanks to those who posted them. We're also seeing this issue and LT support confirmed it. I suspect a fix might be a while now that most of LT is at IT Nation.

The reports in here of Webroot blocking a number of critical functions are concerning to me, since we've been moving customers from SEP to Webroot. We had no issues with SEP, but Webroot costs less and our own internal review of it prior to adopting it as an offering showed it worked well.
helpdesk@envisionITP Posted November 10, 2016

We were using the Vipre cloud product before switching to Webroot. We are much happier with Webroot than we were with Vipre. I still stand behind the switch to Webroot. With that said, we cannot just accept the fact that it breaks our environments and just do the workaround. We need to let Webroot and Labtech know that it is broken so they can see the full impact of the issue. The more we report it, the more data they can collect.

Right now I have 9000+ commands that are stuck (look at "Dataviews>Commands>Executing commands" in the Labtech Control Center), and for the last two weeks I have spent my days releasing commands and rebooting systems. You can list them in MySQL with "SELECT * FROM commands WHERE STATUS=2", and deleting those rows removes them from the list to process, but it does not fix the suspended commands. You can restart the system, but the problem will return. You can stop Webroot, kill the processes, then restart Webroot. You can restart the Labtech services, but when you do this it leaves orphaned processes in suspended mode.

The "work arounds" are great to get the issue remediated temporarily but, as we all know, the more we have to do to get it to work correctly, the less automated the product becomes.
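
On the query quoted in this post: a SELECT only lists the executing rows, and removing them takes a DELETE with the same WHERE clause. The sketch below shows the difference against an in-memory SQLite table standing in for the real MySQL `commands` table. Only the `ComputerID` and `Status` columns and the status value 2 come from this thread; the `CmdID` column, the sample rows and the other status value are made up for the demonstration. Against a live LabTech database, take a backup first and expect the same caveat as above: this clears the queue but does not fix the suspended processes.

```python
import sqlite3

# In-memory stand-in for the LabTech `commands` table (schema reduced and assumed)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commands (CmdID INTEGER PRIMARY KEY, ComputerID INTEGER, Status INTEGER)"
)
conn.executemany(
    "INSERT INTO commands (ComputerID, Status) VALUES (?, ?)",
    [(1, 2), (1, 2), (1, 3), (2, 2), (2, 3)],  # Status 2 = executing, per the thread
)

# SELECT only reports the stuck rows; nothing is removed
stuck_before = conn.execute("SELECT COUNT(*) FROM commands WHERE Status = 2").fetchone()[0]

# Per-computer counts help spot which agents are affected before deleting anything
per_computer = conn.execute(
    "SELECT ComputerID, COUNT(*) FROM commands WHERE Status = 2 "
    "GROUP BY ComputerID ORDER BY ComputerID"
).fetchall()

# DELETE with the same WHERE clause is what actually clears them
deleted = conn.execute("DELETE FROM commands WHERE Status = 2").rowcount
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM commands").fetchone()[0]
print(stuck_before, per_computer, deleted, remaining)
```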
helpdesk@envisionITP Posted November 10, 2016

Webroot support response this morning: "however unless the issue is readily reproducible this is unnecessary as we are already attempting to reproduce this issue within multiple environments on our side in addition to the environments Labtech Support is using with their attempts."

I have offered my servers with a reproducible issue. If anyone else has a server where the issue can be produced on demand, please contact Webroot.