Predictive Monitoring – Increasing the Value Gain while Reducing Visibility Gap
Author(s): Phillipe Bovy, Posted on October 25th, 2016
Monitoring Tools Should Focus on the End-User
Until very recently, IT was mainly focused on providing the most exhaustive service catalogue for the business and its end-users, and on making sure that purchased services were delivered according to the agreed SLAs.
But business users expect more than that: instead of meeting SLAs on how quickly poor service quality is resolved after a user complains, IT should be measured on its ability to anticipate issues proactively, and thus to significantly improve the end-user experience.
End-user experience remains unmonitored
Nevertheless, industry-leading analysts have established that 74% of end-user problems are not detected by current IT infrastructure monitoring tools. In most cases, the first and only indication that users are experiencing performance and availability problems comes when they contact the Service Desk. Further statistics indicate that only 1 in 10 users actually does call in, leaving the majority of impacted users “suffering in silence.” These silent sufferers lose productivity, which reduces their expected output.
Existing monitoring tools are data center focused and provide minimal visibility into the true experience of the end-user, creating a gap between what IT can see and what end-users are really experiencing.
Fortunately, some tools on the market help IT managers keep an eye on what’s really happening in the IT environment without relying on tickets raised through the Service Desk. This is what we call End-User Experience Monitoring (EUEM), a fairly recent type of software offered by vendors such as Unisys technology partners Aternity and NEXThink. These solutions capture and report events (such as hung processes, ‘Blue Screens’, application crashes, forced reboots and slow boot times). Consolidating those events at enterprise level can provide a good view of the health and performance of the IT estate in general, and in particular of silent issues that could become true incidents at any moment. For example:
- If log reports show that some users are forced to restart their workstations after installing a specific piece of software, this would indicate to the workplace engineering team that some required drivers could be missing from the standard desktop image;
- By the same token, if logs report a recurring ‘Blue Screen’ after the same operation is performed on specific hardware (e.g., launching multiple applications in parallel on a specific machine type), this would point to a potential registry conflict or HW/memory issue, also to be addressed by the engineering team.
Taken alone, any one of these events means little. Tracked, consolidated and correlated as a whole, however, they provide real value to IT, which can quickly identify problems and act to minimize end-user disruption, provided a robust Service Improvement process is in place.
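As a rough illustration of this kind of consolidation, the sketch below groups endpoint events by event type and hardware model, then keeps only the patterns seen on several distinct devices. The record fields, device names and the `min_devices` threshold are all hypothetical; they do not reflect the data model of any particular EUEM product.

```python
from collections import defaultdict

# Hypothetical event records; field names are illustrative, not a vendor schema.
events = [
    {"device": "pc-001", "model": "Model-A", "event": "blue_screen"},
    {"device": "pc-002", "model": "Model-A", "event": "blue_screen"},
    {"device": "pc-003", "model": "Model-A", "event": "blue_screen"},
    {"device": "pc-004", "model": "Model-B", "event": "blue_screen"},
    {"device": "pc-005", "model": "Model-B", "event": "app_crash"},
    {"device": "pc-001", "model": "Model-A", "event": "blue_screen"},  # repeat on same device
]


def recurring_patterns(events, min_devices=3):
    """Group events by (event type, hardware model) and return the groups
    that affect at least `min_devices` distinct devices."""
    devices_by_pattern = defaultdict(set)
    for e in events:
        devices_by_pattern[(e["event"], e["model"])].add(e["device"])
    return {
        pattern: sorted(devices)
        for pattern, devices in devices_by_pattern.items()
        if len(devices) >= min_devices
    }


patterns = recurring_patterns(events)
for (event, model), devices in patterns.items():
    print(f"{event} on {model}: {len(devices)} devices affected")
```

Counting distinct devices rather than raw events keeps a single noisy machine from looking like an estate-wide problem.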
Another immediate benefit of EUEM comes when its real-time analytics component is made available to the Service Desk. When an end-user reports an incident, a Service Desk agent can access the data and check whether the problem is isolated or whether similar patterns have been reported for other users with the same IT environment. (Note: an accurate CMDB with up-to-date CIs detailing HW and SW components will make the correlation even stronger.)
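The check a Service Desk agent performs here can be sketched as a simple lookup. The example assumes a hypothetical CMDB extract mapping each device to its configuration, plus a list of recent endpoint events; the names `cmdb`, `recent_events` and `similar_cases` are invented for illustration only.

```python
# Hypothetical CMDB extract: device -> configuration items (illustrative fields).
cmdb = {
    "pc-101": {"model": "Model-A", "os_image": "img-7"},
    "pc-102": {"model": "Model-A", "os_image": "img-7"},
    "pc-103": {"model": "Model-B", "os_image": "img-7"},
    "pc-104": {"model": "Model-A", "os_image": "img-6"},
}

# Hypothetical recent events collected from endpoints: (device, event type).
recent_events = [
    ("pc-101", "app_crash"),
    ("pc-102", "app_crash"),
    ("pc-103", "app_crash"),
    ("pc-104", "slow_boot"),
]


def similar_cases(device, event, cmdb, recent_events):
    """Return other devices with an identical configuration that logged
    the same event type as the reporting device."""
    config = cmdb[device]
    return sorted({
        d for d, ev in recent_events
        if d != device and ev == event and cmdb.get(d) == config
    })


matches = similar_cases("pc-101", "app_crash", cmdb, recent_events)
print("isolated" if not matches else f"also seen on: {matches}")
```

Matching on the full configuration is deliberately strict; a real deployment would more likely compare selected CIs, which is where the accuracy of the CMDB matters.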
If other similar cases do exist, Service Desk agents will be better able to qualify and document the potential issue in the ITSM tool, while routing the incident to the appropriate resolver group. This enriches and speeds up the incident and problem management process, giving faster turnaround times and a lower MTTR.
Seamless experience across devices
Nevertheless, predictive monitoring should not be limited to physical end-points or stop at the OS level, leaving the upper layers of the stack ‘in the dark’. IT should also understand what the end-user sees from an application perspective. Analyzing latency, response times and time-outs gives a good view of the application delivery chain from the data center up to the end-user. In particular, understanding how users experience corporate applications across different archetypes (fat client, virtual desktops, mobile devices, BYOD, etc.) is instrumental in ensuring a seamless user experience across these delivery channels. This goes one step beyond traditional application performance management, which is mainly transaction-centric and focused on data center components (servers, load balancers, storage …). End-user experience cannot be measured from the delivery vantage point of the data center. It can only be measured from the end user’s perspective.
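To make the comparison across delivery channels concrete, here is a minimal sketch that summarizes response times per channel. The sample values, the channel names and the 800 ms "slow" threshold are assumptions chosen for illustration, not measurements or vendor defaults.

```python
from statistics import median

# Hypothetical response-time samples in milliseconds, as measured on the
# end-user device rather than in the data center.
samples = [
    ("fat_client", 180), ("fat_client", 220), ("fat_client", 200),
    ("virtual_desktop", 450), ("virtual_desktop", 900), ("virtual_desktop", 520),
    ("mobile", 300), ("mobile", 1200), ("mobile", 350),
]


def summarize(samples, slow_ms=800):
    """Return, per delivery channel, the median response time and the share
    of samples slower than `slow_ms` (an assumed threshold)."""
    by_channel = {}
    for channel, ms in samples:
        by_channel.setdefault(channel, []).append(ms)
    return {
        channel: {
            "median_ms": median(values),
            "slow_share": sum(v > slow_ms for v in values) / len(values),
        }
        for channel, values in by_channel.items()
    }


summary = summarize(samples)
for channel, stats in summary.items():
    print(f"{channel}: median {stats['median_ms']} ms, {stats['slow_share']:.0%} slow")
```

Reporting a slow-sample share next to the median helps surface channels where most interactions are fine but a noticeable minority of users wait much longer.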
Add to the bottom-line
This is more than just a marketing fad: whoever manages to increase end users’ productivity by reducing latency, response times, time-outs or shutdowns, even by just one percent, will add significantly to the organization’s bottom line. This is simply not possible with traditional system monitoring tools alone: you need a user-centric monitoring tool as well.