What do the data capture?
The data contain daily records of Internet availability and quality for a country or subnational (state/province) unit. Internet availability is based on the number of unique, active IP addresses in the country (or subnational unit), obtained with the sampling methodology described in the data description document. Internet quality is based on the ping response time (in milliseconds), which measures the round-trip time for a small data packet sent from our scanning platform to each individual IP address. Ping response time is often referred to as Latency and should not be confused with Bandwidth (see below).
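As an illustration of how the two measures relate to raw probe results, here is a minimal sketch in Python. The record layout `(date, region, ip, rtt_ms)`, the function name `daily_summary`, and the use of a simple mean for latency are all assumptions made for this example; the actual sampling and aggregation are defined in the data description document.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe records: (date, region, ip, rtt_ms).
# rtt_ms is None when the address did not answer the ping.
probes = [
    ("2020-04-01", "US", "192.0.2.1", 20.0),
    ("2020-04-01", "US", "192.0.2.1", 30.0),
    ("2020-04-01", "US", "192.0.2.2", 40.0),
    ("2020-04-01", "US", "192.0.2.3", None),
    ("2020-04-02", "US", "192.0.2.1", 50.0),
]

def daily_summary(records):
    """Count unique responding IPs and average the round-trip time per (date, region)."""
    active_ips = defaultdict(set)
    rtts = defaultdict(list)
    for date, region, ip, rtt_ms in records:
        if rtt_ms is None:
            continue  # unanswered probes count toward neither measure
        key = (date, region)
        active_ips[key].add(ip)
        rtts[key].append(rtt_ms)
    return {
        key: {"active_ips": len(active_ips[key]), "mean_rtt_ms": mean(rtts[key])}
        for key in active_ips
    }

summary = daily_summary(probes)
```

In this sketch, availability counts each responding address once per day and region, however many probes it answered, while latency is averaged over every answered probe.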
Why do you focus on Latency rather than say, Bandwidth?
On the Internet, information (web pages, data, voice, video) is sent from the user’s device or server to a remote device or server. To make this trip, the information travels through a series of intermediate routers and information-carrying cables or mobile connections to the destination.
Latency refers to the immediacy of the connection: how quickly a small ‘packet’ of information can travel back and forth between the user and the destination. Latency is critical for any synchronous online activity, including tele-work (e.g., Zoom), tele-health, tele-education, tele-law, online gaming, and, importantly, algorithmic trading.
Bandwidth refers to the capacity of the connection: how much information can be delivered over a longer period of time. Bandwidth matters for asynchronous online activity such as downloading or uploading large files, buffering movies and entertainment packages, downloading software and so on.
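The difference between the two can be made concrete with a deliberately simplified model in which the time to deliver a message is one round trip plus the time needed to push the bytes through the link. The function name `transfer_time_s` and the example figures are assumptions chosen for illustration, not measurements from our data.

```python
def transfer_time_s(size_bytes, rtt_ms, bandwidth_mbps):
    """Simplified delivery time: one round trip plus size divided by link capacity."""
    latency_s = rtt_ms / 1000
    serialization_s = (size_bytes * 8) / (bandwidth_mbps * 1_000_000)
    return latency_s + serialization_s

# A small 1 kB message on a 100 Mbps link: latency accounts for almost all of the time.
small_fast = transfer_time_s(1_000, rtt_ms=50, bandwidth_mbps=100)
small_slow = transfer_time_s(1_000, rtt_ms=200, bandwidth_mbps=100)

# A 1 GB download on the same link: bandwidth accounts for almost all of the time.
large_file = transfer_time_s(1_000_000_000, rtt_ms=50, bandwidth_mbps=100)
```

Under this model, quadrupling the round-trip time roughly quadruples the delivery time of the small message but adds only a fraction of a second to the large download, which is why synchronous activity is sensitive to Latency rather than Bandwidth.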
Due to the criticality of synchronous internet quality for key socio-economic and financial activities, we focus on, and provide, Latency measurements in our data.
To the best of our knowledge, our focus on Latency is a unique feature of our data.
Does the data contain information about browsing behaviour or web activity of individuals?
No, our data only measures Internet infrastructure availability and quality.
Does the data contain any personally identifiable information (PII)?
No, our data does not contain any personally identifiable information (PII) and is fully compliant with the EU General Data Protection Regulation (GDPR).
Do your measurements of Internet infrastructure also cover mobile Internet?
Yes, our measures also include Internet availability and quality from mobile infrastructure. However, these are to a large extent measurements of the availability and quality at the cell tower/site level and not at the level of individual mobile phones.
What type of backtesting data is available?
Our backtesting data consists of continuous, daily observations for the past 18 months (486 days, daily delivery from April 1st, 2020 onward) for each of our data products. The number of observations available ranges from 486 national observations for the ADM0 US product to nearly a million subnational observations for the ADM1 Global product.
Are there any backtesting results available that show that KASPR Datahaus products can improve prediction models?
Yes. In our backtesting report, we show that in a one-day look-ahead model, KDH alternative data generates an excess return margin of approximately 5.6-7.9% relative to a baseline model that excludes our data product.
Is your data available at the Internet Service Provider (ISP) level?
Yes. For data disaggregated by Internet Service Provider (ISP), please contact KASPR Datahaus directly at info@kasprdata.com.
Can the ADM0 or ADM1 data be linked to individual companies/stock tickers?
It is possible to link our data to a company and its stock ticker, either directly or through the location of the company’s assets, supply chain, or markets. KASPR Datahaus can assist clients in this process.
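One way such a link could be built is an exposure-weighted average of regional values. The sketch below is only an assumption about how a client might do this: the ticker `ACME`, the region codes, the weights, and the function `ticker_signal` are all hypothetical and are not part of any KASPR Datahaus product.

```python
# Hypothetical share of each company's assets or markets located in each ADM1 region.
exposure = {
    "ACME": {"US-CA": 0.6, "US-TX": 0.4},
}

# Hypothetical regional latency values (milliseconds) for a single day.
regional_latency_ms = {"US-CA": 40.0, "US-TX": 60.0}

def ticker_signal(weights, regional_values):
    """Exposure-weighted average of a regional measure for one ticker."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * regional_values[region] for region, w in weights.items())
    return weighted_sum / total_weight

acme_latency_ms = ticker_signal(exposure["ACME"], regional_latency_ms)
```

Dividing by the total weight keeps the result on the same scale as the regional values even when the exposure weights do not sum to one.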