Tom Fairbairn at Solace explains why website outages commonly happen and how they can be avoided
Website outages are among the more common technical glitches we see today, whether full-on website crashes or traffic overloads that result in temporary losses of service.
This common IT challenge frequently cripples major organisations, yet it can be mitigated. Website outages occur for various reasons, but common causes include network issues, security attacks, software bugs and unmanageable traffic spikes.
Recent outages making the headlines include those at Ticketmaster and UCAS. The ticket sales launch for Taylor Swift’s Eras Tour crashed when a large volume of people tried to access the website at the same time. Likewise, on A-Level results day, the UCAS system failed to withstand thousands of students accessing the website at once, all hoping to receive their results and confirm their places at university.
One might expect these websites to be ready to handle such large bursts of traffic. After all, the immense interest is no secret, as in the case of Taylor Swift!
Although these outages are temporary, tending to last a few hours at most, their aftermath can still be destructive. In a fast-paced environment where instant results are the norm, any event or problem that deviates from this expectation can hinder business success and customer satisfaction.
For businesses, outages often result in direct financial losses as they hinder sales and prevent revenue generation. Recent statistics from Gartner reveal that the average cost of an IT outage is estimated at £4,000 per minute, and 98% of businesses claim that a single hour of an outage costs over £80,000.
In the worst-case scenario, losses can run into the millions. In 2019, Facebook was offline for 14 hours, costing the social media giant $90 million in lost revenue.
Outages can also tarnish a business’s reputation, undermine customer confidence and breed dissatisfaction. As individuals become more mindful of their online security and the dependability of the platforms they interact with, an unforeseen outage on a previously trusted website might change their perception of it, leaving them reluctant to use the site or make purchases through it.
Furthermore, outages can impact a business’s reputation in the long term, potentially affecting search engine rankings. Diminished user trust and usage can contribute to a website’s decline in ranking.
Additionally, outages pose significant security challenges: they can put customer data at risk, leading to potential legal liabilities.
There is also the risk that an outage exposes vulnerabilities that attackers can exploit. Security controls may be temporarily lost, leaving the system at the mercy of cyber-criminals while it is down.
Mind the ‘Transactional Gap’
There are many ways to improve the scalability of systems, such as microservices and asynchronous or event-driven architectures. It pays to carefully evaluate the costs, benefits and applicability of any such approach: Amazon Prime Video found recently that the financial cost of their microservice architecture outweighed its scalability and agility benefits.
Event-based systems have in the past tended to focus on analytical use cases. In the case of e-commerce sites, this might be calculating conversion rates as and when needed. Analytical tasks like this tend to involve high-throughput, low-value data: losing some of it may affect accuracy, but no individual record is critical.
This has led to a “transactional gap,” where the cost of handling transactional data, which has lower throughput but requires much stronger guarantees, is not well considered, and so the transactional side of the system does not scale well. It’s no good being able to determine that 10% of ten million potential ticket purchasers want to buy a ticket if only 2% can actually do so before the booking system fails.
Analytical technologies are not well suited to operational use cases such as ensuring, at scale, that a given seat is reserved and then released if the booking is not completed. Gartner has recognised this, saying: “Treating event-driven architecture as a single-technology project can lead to scalability and performance issues or disappointing business outcomes.”
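To make the operational requirement concrete, here is a minimal sketch of the “reserve, then release if not completed” behaviour described above. It is an illustration under stated assumptions, not a description of any real booking system: the `SeatInventory` class, its method names and the ten-minute hold period are all hypothetical, and the state lives in memory, whereas a production system would need a durable, transactional store.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Hold:
    user: str
    expires_at: float  # time (in seconds) at which the hold lapses


@dataclass
class SeatInventory:
    """Hypothetical in-memory seat store, for illustration only."""
    hold_seconds: float = 600.0  # assumed hold period, not from the article
    holds: Dict[str, Hold] = field(default_factory=dict)
    sold: Dict[str, str] = field(default_factory=dict)

    def _expire(self, seat: str, now: float) -> None:
        # Release the seat if its hold has lapsed without a completed booking.
        hold = self.holds.get(seat)
        if hold is not None and hold.expires_at <= now:
            del self.holds[seat]

    def reserve(self, seat: str, user: str, now: float) -> bool:
        # Succeeds only if the seat is neither sold nor currently held.
        self._expire(seat, now)
        if seat in self.sold or seat in self.holds:
            return False
        self.holds[seat] = Hold(user, now + self.hold_seconds)
        return True

    def confirm(self, seat: str, user: str, now: float) -> bool:
        # Completes the booking, but only for the user holding the seat.
        self._expire(seat, now)
        hold = self.holds.get(seat)
        if hold is None or hold.user != user:
            return False
        del self.holds[seat]
        self.sold[seat] = user
        return True
```

The current time is passed in explicitly so the behaviour is easy to reason about: if one buyer reserves a seat at time 0, a second buyer's attempt at 30 seconds is refused, but the same attempt after the hold period has passed succeeds because the abandoned hold is released. Every one of these steps must be applied exactly once and in order, which is the kind of guarantee analytical pipelines are not designed to give.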
A real-time fix to outages
When a business experiences a large flow of data and traffic during peak periods, it can leverage event streaming as a form of data processing. Event streaming involves the real-time capture, processing, and distribution of events or data records underpinned by an Event Mesh.
In this context, an “event” typically represents a change or a noteworthy occurrence in a system or application, such as reserving a seat.
The implementation of an event-driven architecture (EDA) allows organisations to scale up and handle a large number of concurrent events. EDA allows multiple processing jobs to start while data is flowing, providing on-demand scaling. There will always be bottleneck business processes, however, such as a central database of seat availability, with tough requirements: transactions must be processed in order, and none may be lost.
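One common way to reconcile on-demand scaling with strict ordering is to route events by a key, so that all events for the same seat are handled in order by one consumer while different seats are processed in parallel. The sketch below illustrates that general idea only; it is not the API of any particular event broker, and `SeatEvent`, `partition_for`, `route` and `drain` are hypothetical names introduced for this example.

```python
import zlib
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass(frozen=True)
class SeatEvent:
    seat: str      # ordering key: events for one seat must stay in order
    action: str    # e.g. "reserve" or "release"
    sequence: int  # position in the publisher's stream


def partition_for(key: str, partitions: int) -> int:
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % partitions


def route(events: List[SeatEvent], partitions: int) -> Dict[int, Deque[SeatEvent]]:
    # Every event for a given seat lands on the same queue, in arrival order.
    queues: Dict[int, Deque[SeatEvent]] = defaultdict(deque)
    for event in events:
        queues[partition_for(event.seat, partitions)].append(event)
    return queues


def drain(queue: Deque[SeatEvent]) -> Dict[str, str]:
    # One consumer per queue applies events strictly in order and
    # returns the last action seen for each seat.
    state: Dict[str, str] = {}
    while queue:
        event = queue.popleft()
        state[event.seat] = event.action
    return state
```

Adding partitions adds capacity for unrelated seats, but the events for any single seat still pass through one ordered queue. This sketch says nothing about durability: guaranteeing that no transaction is lost would additionally require persistent queues and acknowledgements.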
Despite the rapid advancement of technology, website outages remain a persistent issue with fixable root causes. The repercussions of such outages can be far-reaching.
To mitigate their impact, businesses should explore event-driven technologies, which can handle large volumes of data in real time and help detect and respond to issues faster. Once event-driven architectures have been deployed, businesses must make full use of these scalable solutions to improve the operational side of the business, not just the analytical side.
Doing so will make the best use of the data at hand and help ensure that outages no longer undermine sales. It means that as businesses innovate, they are no longer limited by demand-side scalability challenges.
Tom Fairbairn is a Distinguished Engineer at Solace
© 2025, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543