China’s Web Outage: A Cautionary Tale for Online Retailers
Because the outage affected only internet users in China, some retailers asked, "What has this got to do with me?" The answer is: a lot. For retailers with a significant presence in China, such as Gap and H&M (H&M is growing faster in China than it has in any other market in its 66-year history), the outage meant lost revenue for an entire business day. Or consider the health and wellness brand Origins, for which 55 percent of Chinese sales come from cities where Origins doesn't yet have a retail location. E-commerce is absolutely essential to Origins’ China strategy, and one full day of downtime is a huge loss. The collateral damage of this outage was also substantial, as many global companies took hits to brand image, lost costly advertising investments and experienced supply chain reverberations.
But even for online retailers who don't have a significant China presence, the outage provides a valuable lesson on the dangers of heavy reliance on external third-party services. This article will examine what exactly caused the China outage; why all online retailers need to take notice; and what they can do to minimize their vulnerability to similar web incidents in the future.
At around 3 p.m. local time on Jan. 21, two-thirds of all domain requests in China were routed to a single IP address in Wyoming, which promptly collapsed under load. This was believed to be a domain name system (DNS) attack, the biggest of its type in history. Not all domains were affected; mainly it was those ending in .com and .net, while those ending in .com.cn were partially affected.
Unfortunately, most of the Chinese websites that weren't directly impacted ended up going down as well. Here's why: many of the affected domains hosted third-party services relied upon by thousands of Chinese websites. Analytics engines are one example. Never mind that the analytics engines weren't working, meaning that companies lost out on a whole day's worth of data that could have been used to increase conversions. That was just the tip of the iceberg.
Like dominoes, these "poisoned" third-party services brought down the websites they were feeding into, even those websites that were otherwise not directly affected by the attack. Another third-party service that went dark was PayPal. This meant that any website integrating PayPal on its back-end couldn't process transactions for a full eight hours, which was a moot point anyway because these websites were likely inaccessible.
Why Do Retailers Need to Care?
In recent years, there's been a dramatic upswing in the number of third-party services used by web properties overall. In fact, Compuware research shows that many organizations only control one-third of the time required to load a web page, as the rest is consumed by third-party services and content that aren't within an organization's direct control. With online retail growth continuing to significantly outpace that of brick-and-mortar, providing a competitive online shopping experience is essential.
As a result, online retailers as an industry are particularly reliant on third-party services, incorporating an extremely wide variety of them into their sites — from analytics to ratings and reviews to product tours to social media plug-ins. The possibilities for forward-thinking online retailers are endless, but there's also a lot to lose, since third-party services can have a dramatic impact on overall site performance — and thus conversions. Furthermore, third-party performance issues can be even more detrimental in the mobile web realm, a key area that online retailers are trying to exploit.
If third-party services can make a mission-critical, revenue-generating website so vulnerable to performance issues, is it worth it to use them? Like it or not, for most online retailers third-party services are a way of life and are here to stay. It's far easier to sign a contract with an advertising firm to help optimize the display of ads on a site than to try to design such a system internally. Capabilities such as analytics, social media, web fonts, and ratings and reviews are often drawn from services that website owners don't directly control but rely on to work efficiently and reliably at all times.
When these external services have an issue, it's the website owner that takes the hit to revenues and brand reputation, not the third party. After all, most end users don't know (or care) about how web pages are built; they just direct their blame for a bad experience back on the brand.
What Can Retailers Do?
In this era of increased dependence on third-party services, is there anything online retailers can do to experience the benefits while protecting and insulating their web performance? Fortunately, with certain approaches, the negative impact of major web service outages can be mitigated. Here are some tips:
1. Get ahead of website performance issues. Given all the performance-impacting elements standing between your data center and end users, such as the cloud, content delivery networks (CDNs), ISPs, devices and browsers, the end-user perspective is the only reliable vantage point from which to gauge performance. New-generation application performance management (APM) tools can deliver this view, and it's important to work with technology providers that offer performance views across key geographies and user segments. This can help online retailers identify performance degradations in key geographies (China or elsewhere) and take decisive action.
2. Closely evaluate and monitor third-party services. Before a third-party service is enlisted, carefully test its performance. Compare website performance before and after the service is added in order to gauge its overall performance impact. If a performance degradation is identified, work with the third-party provider to resolve the problem before the service is implemented.
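As a rough illustration of the before-and-after comparison, the sketch below summarizes two sets of page-load measurements. It is a minimal example under stated assumptions: the function name, the use of the median as the summary statistic, and the sample timings are all hypothetical and are not drawn from any particular APM product.

```python
from statistics import median


def performance_impact(before_ms, after_ms):
    """Compare page-load samples taken without and with a third-party service.

    Both arguments are lists of load times in milliseconds. The median is
    used (an assumption of this sketch) so that a single slow outlier does
    not dominate the comparison.
    """
    if not before_ms or not after_ms:
        raise ValueError("both sample lists must be non-empty")
    baseline = median(before_ms)
    with_service = median(after_ms)
    delta = with_service - baseline
    return {
        "before_ms": baseline,
        "after_ms": with_service,
        "delta_ms": delta,
        "pct_change": delta / baseline * 100,
    }


# Hypothetical sample data: load times measured from the end-user side.
without_widget = [1000, 1200, 1100]
with_widget = [1300, 1500, 1400]
report = performance_impact(without_widget, with_widget)
print(f"Added load time: {report['delta_ms']} ms ({report['pct_change']:.1f}%)")
```

In practice the samples would come from real end-user or synthetic measurements in each key geography, and the acceptable increase is a business decision rather than a fixed number.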
Monitoring third-party services in production is also important, not only to validate service-level agreements but also to identify third-party performance issues as they occur and take appropriate action. As the China example illustrates, the "ripple effect" of third-party performance issues is often unavoidable. However, that doesn't mean the impact can't be contained or minimized.
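One simple way to picture the service-level check is to compute availability from a series of monitoring probes and compare it with the agreed target. The sketch below assumes a hypothetical probe format (a latency in milliseconds, or None when the request failed outright) and hypothetical defaults for the SLA target and timeout; real agreements define availability in their own terms.

```python
def sla_report(probe_latencies_ms, sla_availability_pct=99.9, timeout_ms=3000):
    """Summarize probe results for one third-party service.

    Each entry is a response time in milliseconds, or None if the probe
    failed. A probe slower than timeout_ms is counted as unavailable,
    which is an assumption of this sketch.
    """
    total = len(probe_latencies_ms)
    if total == 0:
        raise ValueError("no probe results to evaluate")
    successful = sum(
        1
        for latency in probe_latencies_ms
        if latency is not None and latency <= timeout_ms
    )
    availability = 100 * successful / total
    return {
        "probes": total,
        "successful": successful,
        "availability_pct": availability,
        "sla_met": availability >= sla_availability_pct,
    }


# Hypothetical probe results: one hard failure and one timeout.
probes = [120, 180, None, 95, 4000, 110, 130, 140, 150, 160]
result = sla_report(probes)
print(result["availability_pct"], result["sla_met"])
```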
When a serious performance problem is detected, you should have contingency plans in place so that offending third-party services can quickly be removed. While they can be extremely valuable when performing well, many third-party services (e.g., analytics) aren't worth having if it means frustrating customers and preventing online orders from coming in.
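A contingency plan of this kind can be as simple as a switch that turns a third-party service off after repeated failures. The sketch below is one possible shape for such a switch, not a prescribed implementation: the class name, the latency threshold and the failure limit are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThirdPartySwitch:
    """Disables a third-party service after repeated slow or failed calls."""

    name: str
    max_latency_ms: float = 2000.0  # assumed threshold for "too slow"
    failure_limit: int = 3          # assumed number of bad calls in a row
    consecutive_failures: int = 0
    enabled: bool = True

    def record(self, latency_ms: Optional[float]) -> None:
        """Record one call; latency_ms is None when the call failed."""
        if latency_ms is None or latency_ms > self.max_latency_ms:
            self.consecutive_failures += 1
        else:
            self.consecutive_failures = 0
        if self.consecutive_failures >= self.failure_limit:
            self.enabled = False

    def reset(self) -> None:
        """Re-enable the service manually once the provider has recovered."""
        self.consecutive_failures = 0
        self.enabled = True


# Hypothetical usage: the page template checks the switch before
# including the analytics tag.
analytics = ThirdPartySwitch("analytics")
for observed in (150.0, None, 5000.0, None):
    analytics.record(observed)
print(analytics.name, "enabled:", analytics.enabled)
```

Requiring a manual reset is a deliberate choice in this sketch: it keeps a flapping service from being switched back on before someone has confirmed that the provider has actually recovered.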
3. Keep the end-user experience top of mind in all third-party service decisions. In general, websites should keep third-party services to a minimum. Before adding a third-party service, online retailers always need to ask themselves whether the added feature or functionality is worth the potential increase in overall vulnerability and lost conversions. In this vein, there needs to be constant communication between performance monitoring teams and the teams that request and depend on these third-party services, usually marketing teams that are focused on driving traffic. This is key to making smart decisions that will protect and promote revenues above all else.
Additionally, when a third-party service is implemented, there are certain design steps organizations can take to proactively reduce risk exposure. For example, by understanding the load order of elements on a site and making sure third-party services and applications are on the bottom, organizations can protect and enhance perceived customer load time, even when a third-party service does suddenly go awry.
As a final note, to ensure better performance for feature-rich websites and applications, many online retailers rely on CDNs strategically located in key geographies. Ironically, CDNs represent another third-party service and another potential point of failure, especially since they're likely serving multiple customers experiencing "flash" traffic events. Here again, measuring performance from the true end-user perspective on the other side of a CDN is critical to protecting and maximizing these investments.
4. Leverage industry resources. Free services like Compuware's Outage Analyzer identify third-party service outages and the corresponding regional impacts. For example, around the 2013 holiday season, Outage Analyzer identified one third-party outage that impacted hundreds of domains. Services like this may not prevent major outages from happening, but they can help you at least see when a widespread performance issue isn't your own, giving you a head start in putting contingency plans into place and communicating proactively with customers.
Jane Lauder, global president and general manager of the Estée Lauder Origins, Ojon and Darphin brands, recently said, "When we look at China, we see growing opportunity, and we see a growing population and more consumers moving into cities. As a company focused on luxury beauty, we can't win unless we win in China." Overall, Asian consumers are exhibiting rising incomes and a heightened demand for style. The region is exploding as an e-commerce market, and not just for electronics and smartphones, but increasingly for apparel.
According to McKinsey & Company, the size of China's e-commerce market is expected to more than triple over the next three years, with sales reaching $420 billion by 2015. That's 20 percent more than what the U.S. e-commerce market is forecast to bring in that year. And by most measures, the market for e-commerce in China is still considered young!
Overall, the international media's reaction to the China outage was muted, which only served to downplay the significance of the event to those beyond China's physical borders. But make no mistake, the hit to revenues and brand reputation for leading retailers around the world was huge. To a certain extent, major web events like the one in China are unavoidable. The fact is, it's just one more example of the capricious, unpredictable nature of the internet and the impact it can have on modern websites. Online retailers' hands don't need to be totally tied, however. With the right approaches, online retailers can better anticipate, contain and minimize the impact of these incidents in key geographic regions, wherever they may be occurring.
Heiko Specht is the technology expert at the Compuware APM Center of Excellence.