2020 Glitch Guide and the Tools That Can Help
Here’s a look into the network glitches experienced last year, and some free tools that can help monitor and solve for future mishaps.
January 10, 2020
As we ease out of the holiday season and into a new year (and a new decade!), January is an opportune moment to reflect on the past and set goals for the future. While we certainly aspire to achieve our New Year’s resolutions, things don’t always go our way. The same is true of the solutions tech professionals develop and use: sometimes they don’t perform as perfectly as intended.
SolarWinds asked its THWACK community of over 150,000 technology professionals to share the times this past year when their hybrid or cloud applications and systems acted more naughty than nice.
Here’s a look into the glitches that almost turned some tech pros into grinches this past year, and the gift of free tools that can help monitor and solve for future mishaps, brought to you by the SolarWinds community.
1) Hiccups in Cloud Migrations and Integrations
“Moving the files and e-mail for 17,000 employees from internal DC's to the cloud proved to be a less-than-comfortable experience. Due in great part to proofs of concept demonstrated with small numbers and sizes of files, scaling to an Enterprise sized solution was somewhat of a failure. Think running out of NAT/PAT pools on firewalls during the data transfers, running out of physical firewall resources, huge slowness in Outlook and Word performance compared to when we hosted it internally, etc. Two years later it's still an issue, and we're looking to see if changing firewall platforms (brands and models) will improve performance to something closer to what we had pre-cloud. All that slowness is an unadvertised / unexpected result of gaining all the ‘benefits’ of the cloud.”
-- User: rschroeder, Network Analyst
“For us, it was a cloud integration service we use to stuff data into our CRM from various data feeds. I suppose it's really hybrid because there is an on-prem piece. At any rate, it's been glitchy enough to be annoying without actually failing completely. At the moment (knock on wood) it's fine but since we never did isolate the actual issue, we're not completely sure we've got it fixed. Spent way too much time, effort and frustration on it this year.”
-- User: df112, Director of Database and Systems Engineering
To solve these glitches, give the gift of: AppOptics Dev Edition, NAT Lookup
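The NAT/PAT exhaustion rschroeder describes is easy to miss in a small proof of concept, because the shortfall only appears at scale. Here is a rough back-of-the-envelope sketch in Python. The 17,000 employee figure comes from the quote; the pilot size, the connections-per-user figure, and the usable port count per public address are assumptions for illustration only. Real firewalls reserve ports differently and may reuse a port across destinations, so treat the result as a conservative estimate, not a sizing rule.

```python
import math

# Assumption: ports 1024-65535 are available for translation on each public IP.
USABLE_PORTS_PER_IP = 65535 - 1024 + 1  # 64512


def pat_addresses_needed(users, conns_per_user, usable_ports=USABLE_PORTS_PER_IP):
    """Estimate how many public IPs a PAT pool needs for a given load.

    Assumes every concurrent connection consumes one translated port,
    which ignores any port reuse the firewall may be able to do.
    """
    total_connections = users * conns_per_user
    return math.ceil(total_connections / usable_ports)


# Hypothetical pilot: 50 users, 20 concurrent connections each during a transfer.
pilot_ips = pat_addresses_needed(users=50, conns_per_user=20)

# Enterprise scale from the quote: 17,000 employees, same assumed per-user load.
enterprise_ips = pat_addresses_needed(users=17_000, conns_per_user=20)

print(f"Pilot needs about {pilot_ips} public IP(s)")
print(f"Enterprise needs about {enterprise_ips} public IP(s)")
```

Under these assumptions, a pilot fits comfortably on a single address while the full migration does not, which is the kind of gap a small proof of concept can hide.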
2) Network Outages
“The biggest issue we have experienced to-date was Cyber Monday. The employees that were at work spent most of the morning surfing the web shopping. This impacted our Azure databases and slowed down the network. A quick modification of the firewall resolved the issue.”
-- User: aardav!1
“The biggest ‘glitch’ we had this year was spanning-tree misconfiguration. The previous engineer did not configure spanning tree properly. So when we added in new switches, it caused a spanning-tree convergence and took the network down briefly.”
-- User: wkwittke
To solve these glitches, give the gift of: AppOptics Dev Edition, Flow Tool Bundle, Traceroute NG
3) Mishaps with Implementing New Tools and Services
“The biggest glitch this year was in the switching of our Payroll system. The project wasn't run by IT, it was run by HR and Finance, and we strongly suggested they run in parallel for a couple pay periods to validate everything was accurate, but the vendor told them it would be fine. Four paychecks later and they still don't have all the bugs worked out, but hey the project was a success because it was implemented on time. Geez. Worst part, most of the organization thinks it was IT's fault, even though we were stonewalled out of the project.”
-- User: schneiderstev
“My biggest glitch(es) were over the deployment of [an SD-WAN solution] over 20 odd sites in the U.S. Carrier issues, configuration issues, general lack of understanding by our ‘Partner’ on deploying the service...It was a challenge I'm glad I'm done with.”
-- User: jscobbie, Manager of Infrastructure Engineering
“Our biggest glitch was likely moving to a new ITSM tool. Not so much that there were applications issues, but the project was a short timeline, so not all can be thought of or tested. Of course with something like this there is an adjustment period for all users. There will be additional work and development that will be done ongoing to relieve pain points. From an application point, there is slow connection to the application prior to the SAML authentication.”
-- User: neoceasar
“Moving to [a content creation, storage and collaboration service]. What a challenging task, early morning data transfers, where some of it wouldn’t upload as expected and would take hours to complete.”
-- User: desr, Network Administrator
To solve these glitches, give the gift of: SFTP/SCP Server, Traceroute NG
4) Stability and Performance Issues
“If I'm being honest, we've had one hell of a time keeping stability and performance up to par on our environment. Truth be told, it's more of a story of outgrowing hardware more than anything....but that was my biggest gripe over this past year. Thankfully Christmas came early for me when we were able to upgrade our systems across the board (hardware/software) and things have improved so much better!”
-- User: smttysmth02gt
“The greatest challenge to land on my desk this year was very unstable video conferencing performance, which turned out to be caused by the calls being routed to the other side of the world, even if you’re making a call within the same site. A relatively simple migration to a cloud service resolved the issue and finally my phone has gone quiet. (According to the exec, [video conferencing] is the most critical service in the company!)”
-- User: james_catlin
To solve these glitches, give the gift of: AppOptics Dev Edition, Call Detail Record Tracker
5) Latency Pains
“Users have [a leading suite of productive tools] installed locally on their computers, and connect to a fully on premise environment. [A team collaboration software tool] is in the cloud and a "latency" issue stopped [the email application] from opening if they had previously connected to specific resource types.”
-- User: jm_sysadmin, Senior Systems Engineer
“[A storage container] bug was the biggest pain! With certain patch levels, your VM's start getting increased datastore latency. SolarWinds was showing it for ages. But no one believed me, so they looked at everything apart from what I was telling them. Network, Storage, Disk, VM's in [the server], testing it all but it came back clean every time. It’s only when me and one other forced them to look directly at the VM's to Datastore path and produced the latency while on the Host, they listened. Took over 2 months for slowness.”
-- User: grantallenby
To solve these glitches, give the gift of: Flow Tool Bundle, VM Monitor, Server Health Monitor, Real Time AppFlow Analyzer
6) Firmware Nightmares
“The biggest glitch/headache was forgetting that our hub was accepting beta firmware. We did in the very beginning need beta for a feature but could have gotten off once that beta became stable. We forgot the check box and got a beta release that broke a few things. After trying and trying we just had to roll back. Nothing that broke was critical just annoying.”
-- User: EBeach, Senior Network Administrator
“A certain patch along with supposedly some specific firmwares caused some of our hosts to become zombies. VMs ran but the host was basically offline. No snapshots, no console... Had to hard bounce the host when that happened. Not a fun few months.”
-- User: ststeven77
To solve these glitches, give the gift of: AppOptics Dev Edition
7) Rogue Notifications and Missing Communications
“Earlier this year, our cloud dashboard decided that it needed a vacation and started randomly sending out notifications, which were anything from ‘Your license is expiring’ to ‘Such and such switch has gone offline’ (when it really hadn't). Bad part about [the cloud service] is that we have no control over it. So, I sent a note to [the provider] and they looked into it. Turns out that there was a patch that had not been applied to our cloud and once that was done, everything was aces. Haven't had the Gremlins rear their ugly heads since. (Knock on wood, don't feed them after midnight, etc.)”
-- User: asheppard970, LAN Support Analyst
“Company-wide emails disappearing out of everyone's inboxes...Turns out if a user presses ‘Report Phish’ using our email filter, it quarantines the email for that recipient and removes it from their mailbox...But on company-wide emails, everyone is the recipient, so the email filter was quarantining from everyone's mailboxes if even one user hit ‘Report Phish.’”
-- User: zniets
To solve these glitches, give the gift of: Loggly Lite
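The disappearing-email glitch zniets describes comes down to the scope of the quarantine action. The toy Python model below contrasts the behavior in the quote (one report removes the message for every recipient) with a narrower, hypothetical alternative (remove it only for the user who reported it). The function names, mailbox structure, and user names are invented for illustration and do not reflect any particular email filter's API.

```python
def report_phish_all_recipients(mailboxes, message_id, recipients):
    """Behavior described in the quote: quarantine for every recipient."""
    for user in recipients:
        mailboxes[user].discard(message_id)


def report_phish_reporter_only(mailboxes, message_id, reporter):
    """Hypothetical narrower scope: quarantine only for the reporting user."""
    mailboxes[reporter].discard(message_id)


def make_mailboxes(users, message_id):
    """Build a toy mailbox map where every user holds the same message."""
    return {user: {message_id} for user in users}


staff = ["ana", "ben", "cy"]  # made-up users receiving a company-wide email

# One user reports the message and it is pulled from every recipient's mailbox.
wide = make_mailboxes(staff, "all-hands")
report_phish_all_recipients(wide, "all-hands", recipients=staff)
remaining_wide = [u for u in staff if "all-hands" in wide[u]]

# Same report, but scoped to the reporter alone.
narrow = make_mailboxes(staff, "all-hands")
report_phish_reporter_only(narrow, "all-hands", reporter="ben")
remaining_narrow = [u for u in staff if "all-hands" in narrow[u]]

print("Still have the email (recipient-wide scope):", remaining_wide)
print("Still have the email (reporter-only scope):", remaining_narrow)
```

With recipient-wide scope, nobody keeps the company-wide email after a single report; with reporter-only scope, only the reporting user loses it.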