IT Brief New Zealand
Technology news for New Zealand's largest enterprises

The importance of service level management to customer experience

By Contributor
Fri 27 May 2022

Article by New Relic APJ chief architect Peter Marelas.

Organisations face challenges in the rising cost of goods and services driven by a potent combination of COVID-19 and the great resignation. This has adversely impacted the supply of tech talent and created pressure on employees working on lean teams.

Staffing shortages have hit site reliability engineers (SREs) particularly hard, since they are under extreme pressure to ensure that digital assets perform at optimum levels 24/7. SREs are tasked with providing the best possible customer experiences with limited resources, while business leaders demand responsive, error-free services as they compete for market share.

Unfortunately, manually tracking performance and incident data is difficult and time-consuming and, in turn, frustrating for both IT and the business. By adopting automation through a programmatic approach, however, teams can make extraneous human intervention a thing of the past.

Under the SLM hood

SREs are key to understanding exactly how customers experience a product or service, and to tracking system performance and reliability through customers' eyes. Service level indicators (SLIs) and service level objectives (SLOs) are central to every SRE practice.

SRE teams will often set strict SLOs on the customer-facing components of their applications that support the service level agreement (SLA) the business has agreed with customers. From here, the team can apply error budgets to understand how much tolerance they have while resolving issues before they fall out of compliance with the SLOs and, therefore, the SLAs.
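As a rough illustration of the arithmetic behind an error budget, the Python sketch below assumes an SLO expressed as the fraction of requests that must succeed over a measurement window. The function name, parameters and request counts are hypothetical and chosen only for this example; they are not taken from any particular tool.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Return the error budget implied by an SLO and how much of it remains.

    slo_target is a fraction such as 0.99 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The budget is the number of requests allowed to fail without breaching the SLO.
    allowed_failures = (1.0 - slo_target) * total_requests
    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "budget_consumed": failed_requests / allowed_failures,
    }


# Assumed figures: a 99% SLO, one million requests in the window, 2,500 failures.
budget = error_budget(slo_target=0.99, total_requests=1_000_000, failed_requests=2_500)
print(f"Budget consumed: {budget['budget_consumed']:.0%}")
```

With these assumed figures, roughly a quarter of the budget is spent, which leaves the team room to ship changes before the SLO is at risk.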

Service levels allow teams to express expectations through observability, which creates an objective, data-driven view of service delivery across the entire organisation. At a glance, business leaders can use service levels to oversee compliance across multiple teams and business units, giving them a view of team and business performance as it relates to the customer experience.

Programmatically tracked SLIs and SLOs are foundational to SRE practices because they relieve engineers of the burden of manually tracking performance and incident data.

Defining relevant indicators and objectives

SLIs need to be relevant to a delivered service and should be simple and easy to understand. When an SLI underperforms an SLO target over the measurement period, it signals a business impact such as excessive unavailability or a sub-optimal user experience.

SLIs often focus on user experience measures. Typical indicators include latency/response time, error rate/quality, availability and uptime. Indicators that are less relevant to service delivery include CPU/disk/memory consumption, cache hit rate and garbage collection time. These indicators do not directly correlate with user experience unless resource saturation is present. 

The key to a useful SLI is to pick an indicator that is clearly and unambiguously related to service delivery, simple to measure and, most importantly, actionable.

Programmatic SLIs have three key characteristics: they're current, reflecting the state of a system in real time; they're automated, measured and reported consistently by instrumentation rather than by users; and they're useful, selected based on what a system's users care about.
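To make the idea of an instrumented SLI concrete, here is a minimal Python sketch that computes the proportion of "good" requests from recorded request data. The Request fields, the 500 ms latency threshold and the rule that server errors count as bad are all assumptions made for illustration; a real SLI would use whatever conditions matter to the service's users.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    """One instrumented request (hypothetical record shape)."""
    duration_ms: float
    status_code: int


def is_good(request: Request, latency_threshold_ms: float = 500.0) -> bool:
    # Assumed definition of "good": no server error and fast enough for the user.
    return request.status_code < 500 and request.duration_ms <= latency_threshold_ms


def sli(requests: Sequence[Request], latency_threshold_ms: float = 500.0) -> float:
    """Fraction of requests in the window that met the SLI conditions."""
    if not requests:
        raise ValueError("no requests in the measurement window")
    good = sum(is_good(r, latency_threshold_ms) for r in requests)
    return good / len(requests)


sample = [
    Request(duration_ms=120, status_code=200),
    Request(duration_ms=480, status_code=200),
    Request(duration_ms=900, status_code=200),  # too slow
    Request(duration_ms=150, status_code=503),  # server error
]
current_sli = sli(sample)
print(current_sli)  # 0.5: two of the four sample requests are good
```

Because the value is derived from instrumentation rather than user reports, it can be recomputed continuously as new requests arrive.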

With programmatic SLIs in place, engineering teams can easily automate tasks such as tracking the performance of service boundaries and end-to-end user journeys, and measuring whether reliability across teams falls within defined tolerances. They can also reduce manual toil because DevOps teams have a clear signal indicating when something is occurring that impacts users and, therefore, the business.

An important part of creating programmatic SLIs is identifying the capability of each system or service:

  • A system is a collection of services and resources that exposes one or more capabilities to external customers (either end-users or other internal teams).
  • A service is a runtime process (or a horizontally-scaled tier of processes) that makes up a subset of the system.
  • A capability is a particular aspect of functionality exposed by a service to its users, phrased in plain-language terms.
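One way to picture the three definitions above is as a small data model. The Python sketch below is only an assumed representation; the class layout and the example names ("online store", "cart-service") are hypothetical and not drawn from any specific product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Capability:
    # A plain-language statement of what users can do.
    description: str


@dataclass
class Service:
    # A runtime process (or scaled tier of processes) within a system.
    name: str
    capabilities: List[Capability] = field(default_factory=list)


@dataclass
class System:
    # A collection of services exposing capabilities to external customers.
    name: str
    services: List[Service] = field(default_factory=list)

    def capabilities(self) -> List[str]:
        """List every capability the system exposes, across all its services."""
        return [c.description for s in self.services for c in s.capabilities]


store = System(
    name="online store",
    services=[
        Service("cart-service", [Capability("Customers can add an item to their cart")]),
        Service("checkout-service", [Capability("Customers can pay for their order")]),
    ],
)
print(store.capabilities())
```

Listing capabilities in plain language like this gives each one an obvious place to attach an SLI.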

SLOs express the target objective that the SLIs must meet over a defined period of time.

SLOs should be easy for even non-technical stakeholders to understand. For example, for each SLI, create a baseline SLO using a statistic such as a percentile (e.g. 99%) that reflects the share of the population the SLI must satisfy over a rolling one-week window.

In non-technical terms, this could be described as satisfying 99% of all user requests within the conditions defined by the SLI over the period. Importantly, averages should be avoided when using statistics to characterise distributions. They fail to capture the extreme conditions present in skewed distributions, which are common, and can therefore hide poor service delivery that affects a significant number of users.
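The point about averages can be seen with a small, made-up data set. In the Python sketch below, the latencies, the 500 ms threshold and the nearest-rank percentile helper are all assumptions for illustration; monitoring tools may calculate percentiles differently.

```python
from statistics import mean


def percentile_nearest_rank(values, percent: int):
    """Nearest-rank percentile for an integer percent between 1 and 100."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical skewed distribution: 95 fast requests and 5 very slow ones.
latencies_ms = [100] * 95 + [5000] * 5

average = mean(latencies_ms)
p99 = percentile_nearest_rank(latencies_ms, 99)
within_threshold = sum(v <= 500 for v in latencies_ms) / len(latencies_ms)

print(average)           # 345: looks healthy against a 500 ms threshold
print(p99)               # 5000: reveals the slow tail
print(within_threshold)  # 0.95: short of a 99% objective
```

Here the average sits comfortably under the threshold even though one user in twenty waits five seconds, which is exactly the situation a percentile-based SLO is meant to expose.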

SLOs reflect the entire population consuming a service over a period of time. If there are different cohorts with different SLAs attached to service delivery, separate SLOs should be defined that track and measure the cohorts independently.
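As a sketch of tracking cohorts independently, the Python example below groups request outcomes by cohort and checks each cohort against its own target. The cohort names, targets and counts are hypothetical, and the function assumes every cohort that appears in the data has a target defined.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def compliance_by_cohort(
    events: Iterable[Tuple[str, bool]],
    targets: Dict[str, float],
) -> Dict[str, dict]:
    """events are (cohort, request_was_good) pairs; targets maps cohort to SLO target."""
    good = defaultdict(int)
    total = defaultdict(int)
    for cohort, was_good in events:
        total[cohort] += 1
        good[cohort] += int(was_good)

    report = {}
    for cohort, count in total.items():
        value = good[cohort] / count
        # Raises KeyError if a cohort has no target, by design in this sketch.
        report[cohort] = {"sli": value, "meets_slo": value >= targets[cohort]}
    return report


# Assumed cohorts: "premium" customers have a stricter SLA than "standard" ones.
events = (
    [("premium", True)] * 998 + [("premium", False)] * 2
    + [("standard", True)] * 992 + [("standard", False)] * 8
)
report = compliance_by_cohort(events, targets={"premium": 0.999, "standard": 0.99})
print(report["premium"]["meets_slo"], report["standard"]["meets_slo"])  # False True
```

Measured as a single population, these requests would show 99.5% good and pass a 99% objective, hiding the fact that the premium cohort is below its stricter target.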

SLOs are designed to balance behaviour amongst members of DevOps teams and ensure the customer remains front and centre in any activity that could risk non-compliance with SLAs. To achieve this in practice, teams' daily activities must be guided by the current state of SLOs. When an SLO is trending in the wrong direction, teams should revert to activities and behaviours that bring the SLO back in line. Once SLOs recover, regular activities can resume.
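One common way to detect that an SLO is trending in the wrong direction is a burn rate: the observed failure rate divided by the failure rate the SLO allows. The Python sketch below uses that idea as an assumption, since the article does not prescribe a specific method. The threshold of 1.0 and the two activity labels are hypothetical.

```python
def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed failure rate relative to the failure rate the SLO allows.

    A value above 1.0 means the error budget is being spent faster than
    the SLO permits over the window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction strictly between 0 and 1")
    if total == 0:
        return 0.0  # no traffic, so nothing has been spent
    return (failed / total) / (1.0 - slo_target)


def recommended_focus(rate: float) -> str:
    # Hypothetical policy: switch to reliability work while the budget burns too fast.
    return "reliability work" if rate > 1.0 else "regular activities"


# Assumed recent samples against a 99% SLO.
print(recommended_focus(burn_rate(failed=30, total=1000, slo_target=0.99)))
print(recommended_focus(burn_rate(failed=5, total=1000, slo_target=0.99)))
```

In the first sample about 3% of requests fail against a 1% allowance, so the sketch recommends reliability work; in the second the failure rate is within the allowance and regular activities continue.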

At cloud-based payments player Zico, a Service Level Management feature that automates tasks has been key in enabling its engineers to visualise and report on the company's service level indicators and objectives and to calculate error budgets. It breaks down the work of defining an SLI and setting targets into an easily understandable and repeatable process for the engineering teams.

Establishing SLIs and SLOs will result in a simpler and more responsive observability practice, tighter alignment with the business, and a faster path to improvement. To lighten the load on SREs, providing the right tools that can automatically configure and deliver meaningful SLIs and SLOs will be key.
