Four Trouble Ticket Survival Tips
Sometimes the phrase ‘working the ticket queue’ is code for ‘doing meaningless work’. If you find yourself playing whack-a-mole with your ticket queue, then this is the post for you. You should strive to do meaningful work and this post discusses some ways to get more value out of the trouble ticketing process.
Trouble Tickets Reborn
Be honest, have you ever wanted to close a trouble ticket with a remark like ‘this is the same crap over and over again, kill me now!’ I’m being dramatic, but mindless repetition is a motivation killer when managing trouble tickets. Let’s discuss the main causes of repetitious tickets.
1. Auto Generated Tickets
When your network management system (NMS) is badly tuned, it can generate a lot of tickets. A noisy NMS like this burns you twice: you have the initial effort of closing the tickets and, if it happens persistently, your sensitivity to auto-cut tickets is reduced. This is an easy win in the ‘war on tickets’ and you’ll get a quick return from tuning your auto-cut tickets.
Event correlation involves taking a flood of tickets from an event and producing a single ticket to describe the event rather than the symptoms. A router power failure for example, might result in tickets for an unreachable router, failed WAN link and all the unreachable devices behind that router.
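To make the idea concrete, here is a minimal Python sketch of topology-based correlation, modelled on the router example above. The `UPSTREAM` map, the device names and the ticket fields (`id`, `device`) are all hypothetical, and the topology is assumed to be a simple tree with no loops. A real NMS will have its own data model, so treat this as an illustration rather than a recipe.

```python
from collections import defaultdict

# Hypothetical topology: each device maps to the device it depends on.
# Assumed to be a tree (no cycles); None marks the top of the tree.
UPSTREAM = {
    "wan-link-1": "router-1",
    "switch-1": "router-1",
    "server-1": "switch-1",
    "router-1": None,
}


def root_device(device, alarmed):
    """Return the highest upstream device that also has a ticket open.

    If no upstream device is alarmed, the device is its own root.
    """
    root = device
    current = UPSTREAM.get(device)
    while current is not None:
        if current in alarmed:
            root = current
        current = UPSTREAM.get(current)
    return root


def correlate(tickets):
    """Group ticket ids under the topmost alarmed device they depend on."""
    alarmed = {t["device"] for t in tickets}
    groups = defaultdict(list)
    for t in tickets:
        groups[root_device(t["device"], alarmed)].append(t["id"])
    return dict(groups)
```

Given tickets for `router-1`, `wan-link-1` and `server-1`, `correlate` files all three under `router-1`, while a ticket for a device that is not in the map stays in a group of its own.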
Event correlation is a laudable goal but is also a complex problem to solve. I recommend that you apply the 80/20 rule and address event correlation after fixing everything else. If you have web access to your TT system, you can create GreaseMonkey scripts to help you bulk close those occasional ticket floods.
2. No Analysis
The aggregated history of closed trouble tickets is a mine of information, and the answer to your ticket woes is buried within. If you are expected to toil away at a ticket queue, it is right and fair to ask for ticket analysis. A report on the top-N root causes and ticket types will provide enormous value.
This isn’t easy. You still need to pull someone off the ticket queue to generate the report, and create a process for identifying the primary sources of pager pain. You also need to invest time in addressing the top root-causes. I know from experience that engineers have huge respect for managers who build intelligent ‘operations’ processes like these. Why? Because it shows engineers that their Ops efforts are valued and respected.
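The top-N report itself can start very small. The sketch below assumes you can export closed tickets as a list of records with a `cause_code` field. That field name and the sample values in the usage note are hypothetical, so adapt them to whatever your ticketing system exports.

```python
from collections import Counter


def top_causes(tickets, n=3):
    """Return the n most common cause codes as (cause, count, percent) tuples.

    `tickets` is an iterable of dicts with a 'cause_code' key
    (a hypothetical export format).
    """
    counts = Counter(t["cause_code"] for t in tickets)
    total = sum(counts.values())
    return [
        (cause, count, round(100 * count / total, 1))
        for cause, count in counts.most_common(n)
    ]
```

For example, with three tickets closed as `power-failure`, two as `config-error` and one as `no-fault-found`, `top_causes(tickets, n=2)` returns `power-failure` at 50.0% followed by `config-error` at 33.3%. Even a crude count like this shows where to spend your tuning effort first.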
3. Poor Cause Codes
Great… let’s analyse this goldmine of information. Oh… every ticket was closed with ‘no-fault-found’. You may actually have a lot of mysterious events that require a deeper dive. However, it’s more likely you have engineers who don’t think cause codes matter much, or you have a really crummy set of options to choose from.
A ‘cause code’ is the primary root cause of the issue described in the trouble ticket. I often close tickets with a root cause that isn’t quite right because I’ve got no better options to choose from. Of course you can have too many options, but I’d love to see a root cause of ‘no-good-option’ where you add a suggested root cause in the ticket comments.
Failing that, you could just email your team and ask if they mind you adding a new root cause.
4. Triage Failure
On other occasions you may like to add a tag of ‘triage fail’ when you think the ticket could have been resolved by a front-line team before it reached you. If you can’t prevent tickets being cut, the next best thing is to get Tier-1 to close them for you. However, they need your help…
Does your front line team have a cheat sheet to diagnose and triage problems? Have you ever seen it? If they don’t have access to a helpful FAQ or cheat sheet, you’ll continue to get poor quality tickets. Can you help them generate such a resource?
Sherpa Summary
Take action to reduce your ticket pain:
- Tune your NMS to make Auto-cuts more relevant
- Make your closure-cause-codes relevant and meaningful
- Report on and analyse your ticket history
- Help your Tier-1 support do better triage
- Summary: Generate Data => Analyse Results => Address Top Root Cause => Tune Process/CauseCodes [REPEAT]
Do you have any Trouble Ticket wrangling tips to share?