With the rapid rise of the Internet industry, the requirements for data storage and processing keep growing. Major companies have built data centers to support their day-to-day business and thereby win a larger market share. Even so, taking 2018 as an example, cloud service providers at home and abroad suffered quite a few outages. The causes differed, but all had serious consequences.
On February 15, 2018, a database failure on Google's application development platform caused considerable trouble for the platform's customers. Users of Google App Engine, Google's PaaS service, experienced errors and delays of up to an hour. Gamers were hit hardest, because many popular online games use Google services. Services such as Pokemon Go and Snapchat were also affected.
On the morning of July 24, 2018, Tencent Cloud went down after an operator's optical cable in Guangzhou was physically severed. According to reports, the outage had a wide impact: the entire Guangzhou region of Tencent Cloud was disconnected, including the Tencent Cloud homepage, the console, DNSPod, and more.
On October 16, 2018, many users found that web pages would not open, logins failed, and video playback was interrupted. The problem appeared on both web and mobile, and the outage lasted more than two hours.
For social networking giant Facebook, November 2018 was a bad month: two outages affected a large number of users of its enterprise collaboration products. Facebook services, including the Workplace collaboration tool, went down on November 12. Thousands of complaints came in before service was restored, and "FacebookDOWN" quickly became a hot topic on Twitter.
Just over a week later, on November 20, Facebook went down again, its third major outage since August. Three-quarters of the users who reported problems said they experienced a full outage or difficulty logging in from 8 am to early afternoon.
Outages are not uncommon, and they can stem from either software or hardware: disk space runs out, server capacity is insufficient, data is lost, and so on. They may also originate from external or environmental factors, such as aging equipment or broken cables.
How can we prevent such accidents before they happen?
1. Back up important data in time;
2. Prepare contingency plans in advance;
3. Replace aging equipment in time;
4. Perform regular daily maintenance.
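As a rough illustration of the backup and daily maintenance points above, here is a minimal Python sketch of a check that could run once a day. It is only an example under stated assumptions: the function names, the 85% disk threshold, the 24-hour backup age limit, and the path `/var/backups/latest.tar.gz` are all hypothetical and should be adapted to your own environment.

```python
import os
import shutil
import time


def disk_usage_percent(path):
    """Return the percentage of disk space in use on the volume holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def backup_is_stale(backup_path, max_age_hours, now=None):
    """Return True if the backup file is missing or older than max_age_hours."""
    if not os.path.exists(backup_path):
        return True
    now = time.time() if now is None else now
    age_hours = (now - os.path.getmtime(backup_path)) / 3600
    return age_hours > max_age_hours


def daily_check(data_path, backup_path,
                disk_limit_percent=85.0, max_backup_age_hours=24.0):
    """Collect warnings; an empty list means nothing to report.

    The default thresholds are assumptions for illustration only.
    """
    warnings = []
    used = disk_usage_percent(data_path)
    if used > disk_limit_percent:
        warnings.append(f"disk {used:.1f}% full on {data_path}")
    if backup_is_stale(backup_path, max_backup_age_hours):
        warnings.append(
            f"backup missing or older than {max_backup_age_hours:g}h: {backup_path}"
        )
    return warnings


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own data volume and backup file.
    for message in daily_check("/", "/var/backups/latest.tar.gz"):
        print("WARNING:", message)
```

Scheduling a script like this every day and sending its warnings to whoever is on duty turns two of the items above into routine work instead of something remembered only after an incident.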
These measures are simple; the key is to carry them out consistently in daily work. I hope none of us has to suffer from such troubles.