The choice of Network Time Protocol (NTP) servers supporting NTS is still very limited. Here is some advice on getting NTS to run smoothly and in a trustworthy way.
(This is Part 2 in the NTS series.)
Actual NTP and NTS goals
The main user-visible goal of NTP is to receive accurate and stable time. It tries hard to identify clocks whose time looks wrong (“falsetickers”). However, in many setups, having one or two falsetickers as time sources may already cause your system’s time to be off, maybe even by an arbitrary amount. An attacker who can drop, inject, delay, or modify network packets is in control of your server’s time and thus in control of many processes, including security- and safety-related tasks.
NTS tries to prevent this. The foremost goals of the NTS protocol are identity and authentication. What you actually want is reliability and trust. As these are virtually impossible for an automated protocol to ascertain, identity and authentication are the closest matches and essential pillars on which to build reliability and trust.
How not to achieve them
With servers supporting NTS being as few and far between as they currently are, people are eagerly holding on to any NTS straws.
However, picking some random hosts from the Internet, just because they talk NTS, already misses the original premise of identity and authentication and has no chance of ever achieving reliability or trust. Using servers labeled as “test” may achieve the first two goals, but their maintainers do not want to guarantee the latter two.
Identifying criteria
NTS servers out there can be grouped into four categories:
- Officially sanctioned servers. Servers with an official or quasi-official duty to provide public, accurate and reliable time. They currently include NetNod/NTP.SE, PTB, and NIC/NTP.BR. (Although labeled as “pilot”, SIDN/Time.NL probably should currently be put into this category as well.)
- Corporations and organizations. Well-known organizations which provide the time as part of their internal and external services. Right now, Cloudflare seems to be the only one on this list.
- Community efforts. Individuals or communities trying to fill a hole. This is what we did in Switzerland, so far resulting in ntp.3eck.net, ntp.zeitgitter.net, and ntp.trifence.ch.
- Internal, development, test, random, and forgotten servers. The remainder, with no promises as to their correctness or availability, neither implicit nor explicit. Sometimes, not even the basic properties of identity and authentication are met.
This list shows a trend: If the top entries suddenly stop providing the service at a reasonable quality, a public outcry is to be expected: Among the taxpayers of a country, among the customers of the corporation, or among the supporters of the organization. We can expect them to go the extra mile to fix things quickly if anything should break unexpectedly, as they do not want to lose their reputation and the trust they have earned so far.
Community efforts are slightly weaker: There is less reputation to lose, but we can assume an intrinsic motivation to provide the service. So, there is at least an expectation to go the extra meter to keep things running smoothly, if not the extra mile.
For the last category, however, there is a good chance that no one will so much as blush when the service acts up. On the contrary, you might hear, “how the hell did you even find me?” or “I told you so!” as an answer.
So, unless you have additional information, only consider servers from the first two or three categories for use in your production NTS setup.
More servers!
To alleviate the NTS scarcity, there is only one option: More servers in the first two or three categories:
- Talk to official bodies to upgrade their existing NTP service to NTS.
- Convince corporations and organizations (including universities) to start offering NTP service or upgrade to NTS. If you have a business relationship with them, consider opening a ticket.
- Start your own community effort.
The first two are slow-moving and can take months or years, even if you have the right connections. But the third is where everyone can contribute, so let’s focus on that one in the remainder of this post.
How to create a trustworthy server?
This checklist provides a starting point for a potential user of your service. If this information is clearly specified, evaluating trust, reliability, identity, and authenticity of the NTS service becomes a breeze.
- Who operates the service?
- Is that person or organization interested in keeping the service up and running? (Personal commitment or fear of losing face)
- What is the access policy (may just read “public access”)?
- What is the current accuracy of the provided time?
- Is there a history of availability/accuracy? (Or is there enough organizational trust that this is implied?)
So, if you want to provide a service which looks trustworthy, put that up on a web page. I would suggest hosting at least some of the information on the NTS host itself: That machine already has a TLS certificate and public IP address, so running a web server with a static page should be trivial.
The easiest way of showing current and recent reachability and accuracy is to link to the NTP pool’s status or profile page. If you operate a public NTS server, why not also make it available to the NTP pool?
And even if you do not want to make a server publicly available (e.g., because the machine and/or its link is too weak), you can register it in the pool and select “Monitoring only” from the connection speed menu, as long as you do not overuse this. Note the “Not active in the pool, monitoring only” comment in the screenshot (highlighting mine).
In addition, you may also run something like ntpviz to provide more detail. Providing transparency always makes it easier for someone else to trust you.
Making your server reliable
Adding your server to the NTP Pool has another benefit: You will be sent an email if your server’s quality (reachability, accuracy) drops so significantly that it would become too bad to be included in the pool. Of course, you may always add your own monitoring, especially if you want to become aware of even smaller quality drops or want to receive warnings earlier.
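One way to add such monitoring is to list your server as a source on a separate machine running Chrony and to check the output of `chronyc -c sources` periodically. The following is a minimal sketch only: the function name `flag_sources` and the 5 ms threshold are illustrative assumptions, and the assumed column layout (mode, state, name, stratum, poll, reach in octal, last sample age, adjusted offset, measured offset, error) should be verified against the output of your own Chrony version.

```python
import csv
import io


def flag_sources(csv_text, max_offset=0.005):
    """Return names of sources that look unreachable or too far off.

    csv_text is the text printed by `chronyc -c sources` (assumed layout:
    mode, state, name, stratum, poll, reach, last_rx, adjusted offset,
    measured offset, error). max_offset is in seconds.
    """
    flagged = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 10:
            continue  # skip blank or unexpected lines
        name = row[2]
        reach = int(row[5], 8)        # reachability register, printed in octal
        offset = abs(float(row[7]))   # last adjusted offset in seconds
        if reach == 0 or offset > max_offset:
            flagged.append(name)
    return flagged
```

Feeding it the captured command output, for example from a cron job, and sending a mail whenever the returned list is non-empty would give earlier warnings than the pool's email alone.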
Monitoring only checks the outputs of your server; it is your duty to make sure you select inputs with high enough quality: Stable power (consider a USB power bank for a Raspberry Pi), stable Internet connection (also think power and cables) and good antenna reception (if you have a GNSS/GPS input). Last, but definitely not least: A good set of upstream servers.
Upstream server choice
For a high-quality NTP server, you may want to have at least five good sources at any time. Also, the vast majority of your sources should be good NTS sources (more than ⅔). Otherwise, you might just end up receiving manipulated unauthenticated input, which you will upgrade to authenticated, and output garbage (GIGO). But now, this garbage will be in your name, with your reputation behind it.
In an ideal world (hopefully already next year), there will be enough NTS sources available to make a good selection from those. But for now, you have a trade-off between the accuracy, round-trip time, and NTS capability of your sources, and the total number of sources.
I ended up with 5…7 stable NTS sources, most in the 10…20 ms RTT range. I also sprinkle in 1…2 low-RTT (~5 ms) unauthenticated sources. Resist the temptation to use low RTT servers whose clock is off or jitters.
Use non-NTS sources sparingly (at most 2) and only if there are not enough low-latency, high-quality NTS servers available. Your server will be redistributing that time as “authentic”, so make sure you vetted the sources accordingly. (And with non-NTS sources, you may never be sure of whether someone along the path may change the time recorded in the packets, as their contents are unauthenticated.) [Added paragraph 2022-02-22]
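These rules of thumb (at least five selectable sources, more than ⅔ of them NTS, at most two without NTS) can be checked mechanically. The sketch below is an illustration, not part of Chrony: the function name `check_source_mix` is made up, it only looks at `server` lines (a `pool` line expands to several servers and is ignored here), and it treats `noselect` sources as monitoring-only.

```python
def check_source_mix(config_text, min_sources=5, max_plain=2):
    """Return a list of warnings about the selectable sources in a chrony config."""
    nts = plain = 0
    for raw in config_text.splitlines():
        fields = raw.split()
        # Only plain "server" directives are counted; comment lines
        # (starting with "#") and all other directives are skipped.
        if len(fields) < 2 or fields[0] != "server":
            continue
        options = fields[2:]
        if "noselect" in options:
            continue  # comparison only, never used for synchronization
        if "nts" in options:
            nts += 1
        else:
            plain += 1

    total = nts + plain
    warnings = []
    if total < min_sources:
        warnings.append(f"only {total} selectable sources, want at least {min_sources}")
    if 3 * nts <= 2 * total:  # NTS share must be strictly more than 2/3
        warnings.append(f"only {nts} of {total} selectable sources use NTS")
    if plain > max_plain:
        warnings.append(f"{plain} non-NTS sources, want at most {max_plain}")
    return warnings
```

Calling `check_source_mix(open("/etc/chrony/chrony.conf").read())` (the path is an assumption and differs between distributions) returns an empty list when all three rules are met.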
Anycast sources are great, if you “just” need the time. The anycast IP address you talk to may resolve to a different server every time. If you want to redistribute quality time, using an anycast server may turn out to be unpredictable or add jitter. Furthermore, Cloudflare will also load-balance your requests among different servers at the same location. Two of my machines are connected to the same ISP at the same location. However, they get time consistently differing by about 2 ms, the red line in the graph. Also, the response they get from Cloudflare indicates a different internal time source.
Having a local GNSS receiver (or DCF77 with phase modulation or …) is recommended. Due to infrastructure limitations, my GNSS receiver was not at the same location as the main time servers, but behind a cable modem line. This results in the blue jittery line in the graph above. So try to avoid this setup (and, yes, I am working on a better solution as well…). [Updated paragraph 2022-02-22]
I am pretty satisfied with the current result:
- The servers have at most a few 100 µs of offset and jitter to their active neighbors.
- To all other monitored servers, even over long-distance connections, Chrony believes they are still within 5 ms, even worst-case. (Monitoring is run on a VM, which adds jitter. But this is good enough for a quick overview [Added VM disclaimer 2022-02-22].)
I hope that fine-tuning will improve jitter even further. I plan to move the GPS receiver to the same network as the main servers, away from behind the jittery cable modem connection. Due to antenna placement issues, this requires some construction, so it will probably happen only in a few weeks’ time.
Stratum-2 configuration example
Fiddling with GPS receivers is tricky. So you may want to start serving Stratum-2 time first, i.e., from a server without a local reference clock.
# Some close-by NTS servers
server ntp.trifence.ch iburst nts
server ntp.zeitgitter.net iburst nts
server time.cloudflare.com iburst nts
server ntp.3eck.net iburst nts
# Up to 2 (to avoid them having a quorum)
# close non-NTS servers for stability,
# if not enough close NTS servers are available
server ntp13.metas.ch iburst
# A few more servers, monitoring/comparison only
server time2.uni-konstanz.de iburst noselect
server d.st1.ntp.br iburst nts noselect
server nts.netnod.se iburst nts noselect
server ptbtime3.ptb.de iburst nts noselect
server nts.time.nl iburst nts noselect
Stratum-1 configuration example
I use a Raspberry Pi 2B with GPS HAT and an active antenna, running Chrony. The clock is configured as follows (in addition to the fallback NTS servers):
refclock SHM 0 offset 0.5 delay 0.5 refid NMEA
refclock PPS /dev/pps0 refid GNSS lock NMEA prefer trust
Detailed instructions for a setup from scratch can be found on the GPSd pages, from Patrick O’Keeffe, or 0048ba. Also consider reducing Ethernet latency on the Raspberry Pi.
Improving Monitoring
[Added 2022-01-08] The NTP Pool monitor situated in Los Angeles is known to report false positives on rare occasions for (at least) European sites. The causes seem to be filtering or rate limiting by ISPs, or transient connectivity problems. A second opinion may be helpful to differentiate between spurious warnings and actual problems. The preferred option would be for everyone to have their own monitoring, so you could also check your peers’ servers. A quicker alternative might be to register with the development/test version of the NTP pool for monitoring purposes. This is not guaranteed to be up or reliable, even though it is in practice. One of the test monitors sits in Amsterdam, which is better for European sites.
Network latency and jitter from Amsterdam to European NTP servers are obviously better. However, the local clocks of the test servers seem to have higher drift.