Reliability of National Internet Segments
Internet connectivity at the network layer results from the interaction between autonomous systems (ASes), and it is more stable the more alternative routes exist between them, which is the basic principle of fault tolerance. This research shows how an outage of a single but significant AS affects the global connectivity of its region.
The global connectivity of any AS depends on its routes to the set of Tier-1 ISPs. Tier-1s are transnational and transcontinental providers that offer connectivity on a world scale. If there are no active routes between an AS and the Tier-1s, that AS has no global connectivity.
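The definition above can be illustrated with a minimal sketch. It assumes a heavily simplified relation model in which each AS only has a set of upstream providers; peering links and routing policy are ignored. The function name, the `UPSTREAMS` mapping, and the toy topology are hypothetical, and the AS numbers come from the range reserved for documentation.

```python
from collections import deque


def has_global_connectivity(asn, upstreams, tier1):
    """Return True if `asn` can reach at least one Tier-1 AS via upstream links.

    `upstreams` maps an AS number to the set of its providers (assumed model).
    """
    seen = {asn}
    queue = deque([asn])
    while queue:
        current = queue.popleft()
        if current in tier1:
            return True
        for provider in upstreams.get(current, ()):
            if provider not in seen:
                seen.add(provider)
                queue.append(provider)
    return False


# Hypothetical toy topology (documentation AS numbers, not real networks).
TIER1 = {64496}
UPSTREAMS = {
    64500: {64496},  # transit provider that buys from the Tier-1
    64501: {64500},  # single-homed customer of the transit provider
    64502: set(),    # AS with no upstreams at all
}

print(has_global_connectivity(64501, UPSTREAMS, TIER1))
print(has_global_connectivity(64502, UPSTREAMS, TIER1))
```

In this toy topology the single-homed customer reaches the Tier-1 through its transit provider, while the AS without upstreams does not.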
Let’s suppose that a given AS is experiencing significant network degradation. We want to answer the following question: “What percentage of the other ASes in this region would lose connectivity with the Tier-1 operators and, therefore, global availability?”
Why model such a situation? Strictly speaking, when the modern Internet was emerging, it was assumed that every AS would have at least two upstreams (higher-level Internet service providers), which would guarantee fault tolerance if one of them failed. In reality, however, do failures of big transit operators ever occur? The answer is yes, and rather often. So if you have not suffered yet, it is time to recall Murphy’s Law: “Anything that can go wrong, does.”
To model this scenario, we took the following steps:
For every autonomous system in the world, we built all alternative routes to the Tier-1 operators using the AS relation model of the Qrator.Radar project.
Using the Maxmind geodatabase, we matched every AS's address space to countries.
For every AS, we normalized the geodata to exclude cases where the AS has only a degenerate presence in a region. A good example of a region with this issue is Hong Kong: hundreds of members of HKIX, the largest Asian Internet exchange, have zero presence in the Hong Kong Internet segment itself.
Next, we evaluated the effect of a possible failure of a given AS on other ASes and, therefore, on countries.
For each country, we found the autonomous system whose failure affects the largest percentage of the other ASes in that region.
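The last two steps can be sketched roughly as follows. This is not the Qrator.Radar implementation. It is a toy model that assumes connectivity is determined by provider links alone, and that a country's AS list is already known from the geodata step. All function names, the `UPSTREAMS` mapping, and the topology are hypothetical.

```python
from collections import defaultdict, deque


def connected_ases(upstreams, tier1, failed=None):
    """Return the set of ASes that reach a Tier-1 when `failed` is removed."""
    # Invert the provider links so we can walk from Tier-1s down to customers.
    customers = defaultdict(set)
    for asn, providers in upstreams.items():
        for provider in providers:
            customers[provider].add(asn)

    reached = {t for t in tier1 if t != failed}
    queue = deque(reached)
    while queue:
        current = queue.popleft()
        for customer in customers.get(current, ()):
            if customer != failed and customer not in reached:
                reached.add(customer)
                queue.append(customer)
    return reached


def worst_failure(country_ases, upstreams, tier1):
    """Return (asn, percent): the AS whose failure disconnects the largest
    share of the *other* ASes in the country, and that share in percent."""
    baseline = connected_ases(upstreams, tier1)
    candidates = set(upstreams) | set(tier1)
    for providers in upstreams.values():
        candidates |= set(providers)

    worst_asn, worst_percent = None, -1.0
    for failed in sorted(candidates):  # sorted for deterministic tie-breaking
        others = set(country_ases) - {failed}
        if not others:
            continue
        after = connected_ases(upstreams, tier1, failed)
        lost = (others & baseline) - after
        percent = 100 * len(lost) / len(others)
        if percent > worst_percent:
            worst_asn, worst_percent = failed, percent
    return worst_asn, worst_percent


# Hypothetical toy country (documentation AS numbers, not real networks).
TIER1 = {64496, 64497}
UPSTREAMS = {
    64500: {64496, 64497},  # transit A, dual-homed to both Tier-1s
    64501: {64496},         # transit B, single-homed
    64502: {64500},         # customers that depend on transit A only
    64503: {64500},
    64504: {64500, 64501},  # multihomed customer
    64505: {64501},         # customer that depends on transit B only
}
COUNTRY = {64500, 64501, 64502, 64503, 64504, 64505}

print(worst_failure(COUNTRY, UPSTREAMS, TIER1))
```

In this toy country, a failure of transit A cuts off two of the five other ASes, so it is reported as the most critical AS even though it is itself well connected. The multihomed customer survives every single failure, which is the fault tolerance principle described at the start.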
Below are the top 20 most failure-tolerant regions in 2017, alongside the updated 2016 results.
|% of failed networks||2016||Dynamics 2016 — 2017||2017||% of failed networks|
|2.57478||Germany (DE)||1st place||Germany (DE)||2.29696|
|3.14068||Canada (CA)||2 positions down||Hong Kong (HK)||2.65659|
|3.46469||Switzerland (CH)||3rd place||Switzerland (CH)||3.57245|
|4.03446||Great Britain (GB)||2 positions down||Canada (CA)||3.67367|
|4.19754||Hong Kong (HK)||3 positions up||France (FR)||3.68254|
|4.34753||Ukraine (UA)||2 positions down||Great Britain (GB)||3.76297|
|4.39691||U.S.A (US)||2 positions down||Belgium (BE)||3.93768|
|4.83975||Belgium (BE)||1 position up||Ukraine (UA)||3.95098|
|5.68121||Spain (ES)||8 positions down||U.S.A (US)||3.97103|
|5.78643||Poland (PL)||6 positions down||Bangladesh (BD)||5.29293|
|5.99955||France (FR)||6 positions up||Romania (RO)||5.35451|
|6.00547||Russia (RU)||1 position down||Brazil (BR)||5.39138|
|6.39252||Australia (AU)||Out of top-20||Russia (RU)||5.73432|
|6.88687||Ireland (IE)||14th place||Ireland (IE)||5.87254|
|7.0508||Romania (RO)||4 positions up||Czech Republic (CZ)||5.88389|
|7.43945||Austria (AT)||3 positions down||Poland (PL)||5.99655|
|7.84456||Italy (IT)||Out of top-20||Bulgaria (BG)||6.20975|
|7.97141||Bangladesh (BD)||8 positions up||Spain (ES)||6.58064|
|8.14681||Bulgaria (BG)||2 positions up||Austria (AT)||7.14221|
|8.15989||Philippines (PH)||Out of top-20||Luxembourg (LU)||7.28208|
As you can see, there are some changes from year to year. However, with differences of tenths of a percent, there are few significant variations among the top 20 most stable regions. In the most fault-tolerant countries, where the shutdown of a single major AS affects less than 10% of the region’s autonomous systems (there are 29 such countries), the IP transit market is diversified and offers many alternative routes.
We also want to highlight the significant influence of AS 174, belonging to Cogent, on several regions: France, Great Britain, the United States, and Ireland, among the top-20 countries alone. This means that issues in AS 174 could cause problems in several regions at once. However, an outage of Cogent would not result in total unavailability, because these are diversified and highly developed national segments.
Does a country’s biggest ISP always influence regional reliability more than everyone else? Our calculations show that this is not always true. For example, in Germany the biggest ISP is Deutsche Telekom, but in terms of connection reliability in the region, an outage of AS 8881, belonging to Versatel, would affect the largest number of German ASes. We believe that this trend of growing importance of Tier-2 ISPs will prevail in the near future.
Also, while speaking about trends, we cannot fail to mention that the average “instability” in 2017 is 41%, which is 1.6% less than in 2016.
The most significant improvements in 2017 were achieved in the emerging African region. The failover capability of countries such as the Gambia and Liberia has grown significantly, by almost 40%. However, the movement of the world's regions toward fault tolerance cannot be called “unidirectional.” One example is Jamaica, where dependency on the stability of a single ISP grew from 34% in 2016 to 91% in 2017. The external connectivity of this country relies almost completely on the reliability of AS 23520 (Columbus Networks).
The results of our survey make it evident that an ISP market built upon competition is, in the end, much more stable and failure-tolerant with respect to issues inside or outside a specific region. Conversely, where such competition is lacking, a single AS failure could lead to network unavailability for a significant portion of users in a country or even a larger region.