Git, PGP, and the Blockchain: A Comparison

The Blockchain, a cryptographically linked list with additional restrictions, is often touted as the most significant innovation towards democratization of the digital landscape, especially the Internet. However, its ideas did not come out of thin air; they have ancestors and relatives. What follows is an attempt at technological genealogy.

Let’s first introduce some of the ancestors and then explain their relationships.

Haber and Stornetta (1991)

In 1991, Stuart Haber and W. Scott Stornetta investigated „How to Time-Stamp a Digital Document“.

The timestamping act proper was to be performed by a trusted third party, the Time-Stamping Service (TSS), using standard digital signatures. Cryptographically, a timestamp is indistinguishable from a digital signature; the key difference lies in the semantics:

  • A conventional digital signature is used to attest authorship of or agreement with the signed contents. Similar to pen and paper signatures, most digital signatures include the date and time at which the signing took place, as ancillary information.
  • In a timestamp, on the other hand, the date and time are the primary information. A timestamp can (and, for some protocols, does) use the same algorithms and message formats as the digital signatures above. The signature itself does not imply the signer’s endorsement of the contents; it only implies that any document which matches the timestamp already existed at that time (assuming the security of the underlying cryptographic algorithms). An example of such a timestamping protocol is outlined in RFC 3161 (2001).

Haber and Stornetta were unhappy about having to trust the TSS completely to issue timestamps only for the current time; in particular, nothing would stop it from backdating documents. They therefore proposed two distributed mechanisms which could be used to ascertain (or refute) the honesty of the TSS.

One of the mechanisms would today probably be described as a Distributed Ledger, one of the main properties of blockchains. Unlike blockchains, where the entire database is replicated, they proposed an optimization: Anybody requesting a timestamp would be given information about the next k timestamped documents. This information included the sequence number of the timestamp, the hash of the document, and the ID of the requestor.

As a timestamping client, it was in your own best interest to safeguard this information in addition to the actual timestamp itself. In case of a dispute, you could produce these k additional records; when contacted, their requestors would be able to confirm that your document was timestamped before theirs.

In essence, we have a distributed ledger where everyone has the incentive to store some small piece of information which can be used in their interest.
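The idea of handing each client the next k records can be illustrated with a minimal sketch. It is not Haber and Stornetta's actual protocol: signatures are omitted, and the names `ToyTSS`, `Record`, and `followers` are hypothetical and chosen only for this illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    seq: int        # sequence number of the timestamp
    doc_hash: str   # hash of the timestamped document
    client_id: str  # ID of the requestor


class ToyTSS:
    """Toy time-stamping service; a real one would also sign each record."""

    def __init__(self, k: int):
        self.k = k
        self.log: list[Record] = []

    def stamp(self, client_id: str, document: bytes) -> Record:
        rec = Record(len(self.log), hashlib.sha256(document).hexdigest(), client_id)
        self.log.append(rec)
        return rec

    def followers(self, rec: Record) -> list[Record]:
        """The (up to) k records issued right after `rec`."""
        return self.log[rec.seq + 1 : rec.seq + 1 + self.k]


tss = ToyTSS(k=2)
alice = tss.stamp("alice", b"contract v1")
bob = tss.stamp("bob", b"thesis draft")
carol = tss.stamp("carol", b"lab notebook")

# Alice keeps these records; in a dispute, Bob and Carol can each confirm
# that their own timestamp was issued after hers.
witnesses = tss.followers(alice)
print([w.client_id for w in witnesses])
```

Each client stores only k small records, yet backdating a document would require the cooperation of every later requestor whose record is held by somebody else.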

PGP Digital Timestamping Service (1995)

In 1995, Matthew Richardson created the PGP Digital Timestamping Service. At that time, email was still one of the main machine-to-machine (M2M) interfaces, as there were still hosts without always-on connectivity. Therefore, it was natural to use email as the interface. One of the modes was (and still is today) to send a document or a hash thereof to the stamper service’s email address. A few minutes later, a PGP-signed version of your submission is returned, including a timestamp and a monotonic sequence number.

Richardson did not want to put the burden of safekeeping on the end user. Instead, his system creates daily and weekly summaries of the timestamps it issued, allowing anyone to double-check the timestamps issued against these lists.

For many years, the weekly summaries were also posted to the Usenet group comp.security.pgp.announce, which, like most „announce“ newsgroups, would be automatically archived by many servers around the world.

In summary, a chained public record of the timestamps issued would be created, preventing later modification of the history and thus rendering backdating timestamps impossible.

Revision control in a project. In DVCS, the numbers in the versions are unique hashes, not sequential numbers.
Based on Subversion project visualization, traced by Stannered, original by Sami Kerola, derivative work by Moxfyre and Echion2; CC BY-SA 3.0.

Git distributed versioning (2005)

Software developers, especially in teams, have used version control software since the 1970s to keep track of parallel development, experiments, and bug fixes, or to determine how bugs got introduced into the system. Traditionally, this has been based on version numbers and central repositories, neither of which worked well when trying to create derivative works: The need for distributed version control systems (DVCS) was born.

In 2001, the first such system, GNU arch, was created. However, DVCSes only really became popular after the introduction of Git and its use in Linux kernel development in 2005.

DVCSes enable every developer to keep a local copy of (part of) the source code and create modifications, some of which might later be shared with other developers, including the original ones.

Instead of version numbers, which cannot be easily kept unique in such a distributed setting, unique hashes were used to identify the versions. In Git, these hashes are derived from the parent’s hash (or hashes, in case of a merge; see image to the right), the new contents, as well as information about who changed what when and why. The committed version can also contain a PGP signature to authenticate the changes.

Again, we have a cryptographically linked chain (actually, Directed Acyclic Graph, DAG), potentially with signatures, and a protocol to efficiently distribute changes. This again can be used to create a public record.
In addition (and unlike the Blockchain approach which we will see below), diverging changes are common and a process („merge“) exists to combine the work done in each branch and not waste the efforts invested. This merge can often be done automatically.
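How a commit hash depends on its parents can be sketched in a few lines. The sketch assumes Git's classic SHA-1 object format (the hash covers a short header followed by the object body); the author line, the commit messages, and the helper name `commit_id` are made up for illustration, and the commits all point to the empty tree to keep the example short.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way classic (SHA-1) Git does: '<kind> <size>\\0<body>'."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parents: list[str], author: str, message: str) -> str:
    """Build a commit body (tree, parents, who/when, why) and hash it."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # two or more parents for a merge
    lines += [f"author {author}", f"committer {author}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


AUTHOR = "Alice <alice@example.org> 1700000000 +0000"  # hypothetical identity
EMPTY_TREE = git_object_id("tree", b"")  # the ID of a tree without any files

root = commit_id(EMPTY_TREE, [], AUTHOR, "Initial commit")
child = commit_id(EMPTY_TREE, [root], AUTHOR, "Second commit")

# Rewriting history: changing the root's message changes the root's ID,
# and therefore the ID of every descendant.
tampered_root = commit_id(EMPTY_TREE, [], AUTHOR, "Initial commit (edited)")
tampered_child = commit_id(EMPTY_TREE, [tampered_root], AUTHOR, "Second commit")

print(child != tampered_child)
```

Because every commit embeds the hashes of its parents, publishing (or signing) a single commit hash pins down the entire history leading up to it.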

Blockchain (2008)

The term Blockchain became known in 2008, born as a distributed ledger of pseudonymous financial transactions in the Bitcoin cryptocurrency. As in Git, each block includes the hash of its predecessor as a backward link (i.e., opposite to the logical ordering shown by the arrows in the image to the right, where time flows down).

To avoid blocks being chained randomly, a proof of work (PoW) is required: Only blocks containing a solution to a cryptographic puzzle will be considered as candidate blocks; the complexity of that puzzle is regularly adjusted to match the computational power in the network, keeping the average block creation (aka „mining“) rate at roughly 1 every 10 minutes.
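The puzzle can be illustrated with a toy proof of work. This is a simplified sketch, not Bitcoin's actual block format or hashing scheme: the block is just a string, a single SHA-256 is used, and the difficulty is expressed as a number of leading zero bits. The names `mine` and `is_valid` are hypothetical.

```python
import hashlib


def block_digest(prev_hash: str, payload: str, nonce: int) -> int:
    data = f"{prev_hash}|{payload}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def mine(prev_hash: str, payload: str, difficulty_bits: int) -> int:
    """Try nonces until the hash falls below the target (the 'puzzle')."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while block_digest(prev_hash, payload, nonce) >= target:
        nonce += 1
    return nonce


def is_valid(prev_hash: str, payload: str, nonce: int, difficulty_bits: int) -> bool:
    """Verification takes a single hash, however long the search took."""
    return block_digest(prev_hash, payload, nonce) < (1 << (256 - difficulty_bits))


PREV = "00" * 32                  # stand-in for the predecessor's hash
PAYLOAD = "alice pays bob 1 coin"  # stand-in for the block's transactions
BITS = 12                          # about 4096 attempts on average

nonce = mine(PREV, PAYLOAD, BITS)
print(nonce, is_valid(PREV, PAYLOAD, nonce, BITS))
```

Each additional difficulty bit doubles the expected number of attempts, which is the knob a network can turn to keep the block rate steady as computational power grows.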

If multiple candidate blocks are available as potential parents, only one is chosen; the work done in different branches is thrown away, even if these blocks/branches do not contain conflicting transactions.
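The branch selection can be sketched as follows, assuming the rule that the branch with the most cumulative work wins. The block table and its IDs are invented for the example.

```python
# Hypothetical block table: block ID -> (parent ID, work contributed by this block)
BLOCKS = {
    "genesis": (None, 1),
    "a1": ("genesis", 1),
    "a2": ("a1", 1),
    "b1": ("genesis", 1),  # competing branch; nothing in it needs to conflict
}


def ancestry(block_id):
    """All blocks from the given tip back to the genesis block."""
    path = []
    while block_id is not None:
        path.append(block_id)
        block_id = BLOCKS[block_id][0]
    return path


def cumulative_work(tip):
    return sum(BLOCKS[b][1] for b in ancestry(tip))


def choose_tip(tips):
    """Pick the tip whose branch embodies the most work."""
    return max(tips, key=cumulative_work)


winner = choose_tip(["a2", "b1"])
discarded = set(BLOCKS) - set(ancestry(winner))
print(winner, discarded)
```

Unlike a Git merge, nothing from the losing branch is carried over: the work invested in `b1` is simply lost.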

The key to Bitcoin’s success was the apparent democratization through the proof of work. However, concerns about wasting energy have resulted in alternate designs, notably proof of stake (PoS) and private (permissioned) blockchains.

In the former case, PoS, also called „plutocracy“ by some, the „democracy“ patina is diluted even further.

In the latter case, depending on the use case, often a much simpler mechanism may be used, resulting in easier-to-maintain systems. However, often the apparent sexiness of the „Blockchain“ name is more important. For example, when only a single organization controls the entire blockchain, this reverts back to the case of a single TSS, which Haber and Stornetta worked hard to avoid, as there is no guarantee that the system works as advertised. Even if it does, its transparency, functionality, and battle-testing may be far below what is readily available with Git and friends.

On the right-hand side, you find a rendering of the qualitative complexity of different mechanisms that are embodied in the Bitcoin Blockchain. Integrity and transparency, the two main goals often associated with Blockchain usage, are very easy to fulfill. The incremental cost of immutability (or tamper-evidence) on top of these two is small. It is provably impossible to achieve perfect Consensus in asynchronous distributed systems such as the Internet; however, „good enough“ approximations can be achieved. Preventing double-spending of cryptocurrency is even harder. In many use cases, the easy three are enough, maybe with a sprinkle of very controlled consensus. Going for the big bubble is therefore mostly wasting resources and adding unnecessary complexity or preventing simple fixes in case of errors.

Apparently, even Blockchain solution vendors do not have confidence in any specific Blockchain. For example, the first proposed German solution for keeping track of vaccination certificates was to store information in five distinct Blockchains. Possible explanations for this and similar mission creep include

  • the lack of long-term trust in any of them,
  • the fear that transaction costs might explode, or
  • the wish of project participants to promote their respective pet solutions.

„Blockchain“ is a heavily overloaded (and thus diluted) term. In its original form, it promised transparency and immutability, achieving them at the cost of huge amounts of energy wasted by people typically driven by greed. In what is sold to corporate users, it is often (1) unnecessarily complicated and error-prone, (2) not adapted to the actual problem, or (3) little more than what Git and friends readily provide, but expensively relabeled with a sexy buzzword.

Zeitgitter (2019)

As explained above, often all that is needed is

  • integrity,
  • transparency,
  • immutability (or tamper-evidence), and
  • maybe a limited, well-defined form of consensus.

This is achievable with simple timestamping. With the know-how from Haber-Stornetta, the PGP Digital Timestamping Service, and Git in mind, we created the open-source solution Zeitgitter.

The name comes from the German mental health term Zeitgitterstörung, which translates to confusion of the sense of time or chronotaraxis. So, Zeitgitter itself (without Störung, which is the confusion part) might be translated as the sense of time. It also includes Gitter, which means grid and indicates that the different timestampers are interwoven, mutually locking each other into having to tell the truth about their timestamps.

The basic design is as follows: Each timestamping server issues digital signatures on content stored in Git repositories and stores proof of this timestamping in a Git repository of its own. At regular intervals, the Zeitgitter timestamping servers cross-timestamp each other, preventing backdating timestamps by more than a short interval.
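Why cross-timestamping bounds backdating can be shown with a toy model. This is not Zeitgitter's implementation: a plain hash chain stands in for each server's Git repository, signatures are left out, and `ToyStamper` and `chain_head` are hypothetical names.

```python
import hashlib


def chain_head(name: str, entries: list[str]) -> str:
    """Head of a hash chain over the entries, a stand-in for a Git commit ID."""
    head = hashlib.sha256(name.encode()).hexdigest()
    for entry in entries:
        head = hashlib.sha256(f"{head}|{entry}".encode()).hexdigest()
    return head


class ToyStamper:
    def __init__(self, name: str):
        self.name = name
        self.entries: list[str] = []

    @property
    def head(self) -> str:
        return chain_head(self.name, self.entries)

    def commit(self, content: str) -> None:
        self.entries.append(content)

    def cross_timestamp(self, peer: "ToyStamper") -> None:
        """Record the peer's current head in our own history."""
        self.commit(f"cross-timestamp {peer.name} {peer.head}")


a, b = ToyStamper("a"), ToyStamper("b")
a.commit("timestamp for repository X")
b.cross_timestamp(a)  # from now on, b's history vouches for a's head
witnessed = a.head

# If a later tries to slip a backdated entry in front of its history,
# the resulting head no longer matches what b has recorded.
forged = chain_head("a", ["backdated entry"] + a.entries)
print(forged != witnessed)
```

A server could therefore backdate only within the window since its last cross-timestamp; anything older is pinned by its peers' records.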

Basic integration is trivial: Just call git timestamp whenever you want your standard Git repository timestamped. This fits in well with the Git ecosystem, and the process is easy to automate.

From early on, Zeitgitter was also used to cross-timestamp the PGP Digital Timestamping Service, which still issues around 300 timestamps daily, as a replacement for the now-defunct Usenet archival process.

In contrast to Blockchain-based approaches, even an inexpensive Raspberry Pi can issue several million timestamps per day, with the annual cost for device, power, and Internet access being on the order of just a few €/$/CHF, avoiding the need for massive monetary returns.

PGP-based Zeitgitter results in normal signed objects in the repository. Therefore, timestamps can be stored, presented, verified, or propagated like ordinary Git objects; fully decentralized. Blockchain-based timestamping approaches, however, require a centralized gatekeeper to verify or help verify the timestamp; i.e., the timestamp is not self-contained.

Zeitgitter addresses the most common need, namely timestamping, to efficiently ensure integrity and transparency and to detect tampering. When it comes to issuing timestamps, the requirements for global consensus are very easy to satisfy, unlike in many other applications.
If a single authority would like to add these three features, this is all of Blockchain that is needed.

Summary

Most often, „Blockchain“ is sold as the solution for all your problems around the digitalisation of your business or administration. However, the problems often lie deeper: the lack of defined processes, missing standardized interchange formats, or even just not knowing well enough what the real goals of the entire digitalisation project should be.

These all-too-common problems resulted in the adage,

If you think that you need a Blockchain: By the time that you are ready to use it, you probably don’t need it anymore.

Even if some of the features are actually required, probably all that you need is timestamping, which can be much more easily achieved using the (centralized) RFC 3161 timestamping protocol or Zeitgitter as a distributed version.

An extended version of this article, with a slightly different focus, will be presented in German at DigiGes Winterkongress 2022.

Updates

  • 2022-03-06: At the beginning of Haber/Stornetta, elaborated on the relationship between digital signatures and timestamps (and mentioned RFC 3161).
