Lessons Learned from Gathering MANRS Observatory Data
As part of the MANRS project, we collect significant amounts of data from around the Internet. We use this data for various purposes, including generating MANRS Readiness Scores to measure how well networks conform to the MANRS actions (learn more about how we do this). One thing we’ve learned is that working with this data is rarely straightforward!
Some of the sources we use provide data for a period of time and include data on all ASNs relevant to that period. But other sources (for example, some RIPEstat endpoints) provide data for an individual ASN. In these cases, we decide which ASNs to collect data for.
To define a list of ASNs to use to gather per-ASN metrics and to decide which ASNs to report on in the MANRS Observatory, we previously used a list we collated of ASNs that were visible on the Internet. For these purposes, we defined “visible” as having been seen by at least five peers during the period in question.
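As a rough illustration of the “visible” rule, the sketch below counts distinct peers per ASN and keeps those that reach the threshold of five. The input shape (pairs of peer identifier and ASN) and the function name are assumptions made for this example, not a description of the actual Observatory pipeline.

```python
from collections import defaultdict

MIN_PEERS = 5  # threshold stated in the text


def visible_asns(observations, min_peers=MIN_PEERS):
    """Return the set of ASNs seen by at least `min_peers` distinct peers.

    `observations` is an iterable of (peer_id, asn) pairs. This input
    shape is hypothetical and chosen only for illustration.
    """
    peers_by_asn = defaultdict(set)
    for peer_id, asn in observations:
        # A set ensures each peer is counted once per ASN.
        peers_by_asn[asn].add(peer_id)
    return {asn for asn, peers in peers_by_asn.items() if len(peers) >= min_peers}
```

Counting distinct peers (rather than raw sightings) matters here: an ASN reported many times by the same peer should still count as seen by one peer.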
Expanding the List
Last December, we decided to expand our data gathering and build a more detailed list of ASNs, one that also records the name of each holder and the ASN’s RIR status.
We started pulling and parsing NRO stats data to generate a list of ASNs and statuses and linking it to RIPEstat data to annotate each ASN with the holder’s name. This also allowed us to cross-reference this list with the list of visible ASNs to improve how the MANRS Observatory was reporting on this data by filtering out ASNs listed as “reserved” (or worse, listed as “available”).
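The cross-referencing step could look something like the following sketch, which drops visible ASNs whose registry status is “reserved” or “available”. The names `reportable_asns` and `status_by_asn` are hypothetical, and keeping ASNs that have no known status is a choice made for this example rather than something the text specifies.

```python
EXCLUDED_STATUSES = {"reserved", "available"}


def reportable_asns(visible, status_by_asn):
    """Filter visible ASNs by registry status.

    `visible` is a set of ASNs; `status_by_asn` maps ASN -> status string.
    ASNs missing from `status_by_asn` are kept (an assumption for this sketch).
    """
    return {
        asn
        for asn in visible
        if status_by_asn.get(asn) not in EXCLUDED_STATUSES
    }
```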
Care is needed when parsing the NRO data to extract the list of ASNs, as some lines contain a range of ASNs and not just a single ASN. This makes little difference for “reserved” and “available” ASNs as we are generally not reporting on these ASNs in the MANRS Observatory. However, some ranges were listed as “assigned” (typically ranges assigned to an NIR). For these ranges, we were only gathering full data for the first ASN in the range.
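To make the pitfall concrete, here is a minimal sketch of expanding ranges while parsing. It assumes the pipe-delimited layout of the NRO extended delegated stats file (registry, country code, type, start, count, date, status, opaque ID), where the count field on an “asn” record gives the number of consecutive ASNs starting at the start value. The function name is hypothetical, and this is not the Observatory’s actual parser.

```python
def expand_asn_records(lines):
    """Yield (asn, status) for every ASN covered by each 'asn' record.

    Assumes pipe-delimited fields:
    registry|cc|type|start|count|date|status[|opaque-id]
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("|")
        # Version and summary lines have fewer fields or a non-'asn' type.
        if len(fields) < 7 or fields[2] != "asn" or not fields[3].isdigit():
            continue
        start, count, status = int(fields[3]), int(fields[4]), fields[6]
        # The bug described above amounts to yielding only `start` here.
        for asn in range(start, start + count):
            yield asn, status
```

For example, with an invented sample record (not real registry data) that covers three ASNs:

```python
sample = ["apnic|JP|asn|64496|3|20100101|assigned|example-id"]
print(list(expand_asn_records(sample)))
# [(64496, 'assigned'), (64497, 'assigned'), (64498, 'assigned')]
```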
By our calculations, 11,268 of the 19,088 ASNs covered by these ranges were affected by the issue. These affected ASNs were still included in the MANRS Observatory as they qualified as being “visible” on the Internet, and most of the data for these ASNs was complete and up-to-date.
What is the Impact of this Change?
This issue directly affects data that underpins the metrics used to generate the MANRS Readiness Scores for Action 3 (Coordination) and Action 4 (Routing Information – IRR). When we process this data, we always use the most recent data we have collected, so the effect would have been that these scores were frozen in time rather than missing completely. Action 4 (Routing Information – RPKI) was also affected, but only temporarily, as we recently switched to using our own ROAST data for this, which is unaffected by this issue.
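The “frozen in time” effect follows from selecting the most recent record per ASN: if collection silently stops, the last value simply persists. The sketch below shows that selection along with one possible way to flag stale values. The record shape, the function name, and the 31-day threshold are all assumptions for illustration and do not reflect how the Observatory actually handles this.

```python
from datetime import date, timedelta


def latest_with_staleness(records, today, max_age=timedelta(days=31)):
    """Pick the newest record per ASN and flag it if it is too old.

    `records` is an iterable of (asn, collected_on, value) tuples.
    Returns {asn: (value, collected_on, is_stale)}.
    """
    latest = {}
    for asn, collected_on, value in records:
        # Keep only the most recently collected value for each ASN.
        if asn not in latest or collected_on > latest[asn][0]:
            latest[asn] = (collected_on, value)
    return {
        asn: (value, collected_on, today - collected_on > max_age)
        for asn, (collected_on, value) in latest.items()
    }
```

A flag like `is_stale` would let a report distinguish a score that is current from one that merely has not been refreshed.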
A Learning Experience
We are always working to improve the MANRS Observatory by adding new features and data or improving the processing of the existing data. Unfortunately, in this case, what we thought was a small change to expand our datasets and improve how we display them caused an issue with a small number of data points for a small number of ASNs.
We always welcome feedback from the community on how we can improve the MANRS Observatory. If you have ideas for new features or notice any discrepancies with the data, please email us.