Ethereum 2.0 | What’s the big deal?

Article Tuesday, May 18 2021

Blockchain at a high level

To provide context, it is important to recognize some basic characteristics of a blockchain. Before we talk about Ethereum, we need to step back and discuss what a blockchain actually is and what the terms thrown around in blockchain discussions mean. Let’s start with some basic definitions.

Blockchain: a cryptographically secure distributed ledger system with a shared state in which each state is dependent upon the previous one, making it difficult to modify the previous state.

Cryptographically secure: something that is secured using mathematical proofs and algorithms that are virtually impossible to crack.

Shared State: The shared state of a blockchain is the state that the majority of the nodes in the network agree is correct. This agreement is usually referred to as consensus, and there are various consensus algorithms and protocols, each with its pros and cons. Follow this link for a basic overview of the core group of mechanisms that are used today.

State Machine: An abstract machine that can be in exactly one of a finite number of states at any given time.[4]

What is Ethereum?

Like other blockchain projects, Ethereum is a “transaction-based state machine”[2]. However, Ethereum is different because it is a general purpose blockchain[1]. To achieve this general purpose property, Ethereum enables smart contracts to be written on the Ethereum network. These scripts live on the blockchain and are either executed autonomously or triggered by external calls. It is important to note that once these pieces of code are on the blockchain, they are there forever. With these additional features in mind, a better description of the essence of Ethereum is a “distributed state machine”[3]. The state of Ethereum is a large data structure that holds a machine state (accounts, balances, code, etc.). This state changes from block to block according to a predefined set of rules, and Ethereum can execute arbitrary machine code as part of those changes.

EVM Architecture

Ethereum’s virtual machine (the EVM) is defined by a state transition function[3]: a deterministic mathematical function that, given the current state and an input, always produces the same new state. This state transition function is triggered whenever a transaction is executed.
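
To make the idea of a state transition function concrete, here is a minimal sketch in Python. It is not how the EVM is implemented; the `Transfer` type, the `apply_transaction` function and the account names are hypothetical, and the "state" is reduced to a dictionary of balances. What it illustrates is the property described above: the same state and the same transaction always produce the same new state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Hypothetical, heavily simplified transaction: move `amount` wei."""
    sender: str
    recipient: str
    amount: int


def apply_transaction(state: dict, tx: Transfer) -> dict:
    """Return the new state produced by applying `tx` to `state`.

    The function is pure: it never mutates its input, so the same
    (state, tx) pair always yields the same output.
    """
    if tx.amount < 0:
        raise ValueError("amount must be non-negative")
    if state.get(tx.sender, 0) < tx.amount:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # copy, so the old state stays untouched
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state


genesis = {"alice": 100, "bob": 0}
after = apply_transaction(genesis, Transfer("alice", "bob", 30))
```

Keeping the old state untouched and returning a new one mirrors the way each block moves the chain from one state to the next.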

A simple model of a transaction in the state machine

A transaction is a single cryptographically signed instruction constructed by an external entity (outside the scope of Ethereum). The two types of transactions are those resulting in message calls and those resulting in the creation of new accounts associated with code (i.e., smart contracts)[1]. These transactions are batched into larger packets of data called “blocks”. Each of these blocks has a header, which holds relevant information corresponding to the transactions within the block. Each block must be validated through a consensus mechanism before it is added to the chain. (To learn more about the types of consensus mechanisms on the Ethereum blockchain, click this link here.) Currently, Ethereum processes approximately 500,000 transactions per day; at full capacity, it can handle about 13 transactions per second[7].

In order to execute these transactions in the EVM, we must use Gas.

What is gas?

In Ethereum, gas is a unit that measures the amount of computational effort required to execute specific operations on the Ethereum network[5]. This means that each transaction requires a fee in order to complete successfully.

Why does gas exist?

Gas fees are a security measure for the Ethereum network. The fee requirement for every computation executed deters bad actors from overwhelming the network with an arbitrarily large number of calls (spam!). Additionally, each transaction must specify a limit on how many computational steps it may execute, which prevents infinite loops and other computational waste from tying up the network. Any gas left over from a successful transaction is returned to the caller of the transaction. Gas fees are paid in Ethereum’s native currency, Ether (ETH), and are usually quoted in smaller denominations (e.g., gwei). Here is a chart of common ether-based units.
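
The arithmetic behind a gas fee can be sketched in a few lines of Python. The `settle_gas` function and the example numbers are hypothetical and only illustrate the bookkeeping described above: the sender commits to a gas limit, pays for the gas actually used, and gets the remainder back. The unit constants are the standard ones (1 gwei = 10^9 wei, 1 ether = 10^18 wei).

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


def settle_gas(gas_limit: int, gas_used: int, gas_price_gwei: int) -> dict:
    """Split the prepaid amount into the fee paid and the refund, in wei."""
    if gas_used > gas_limit:
        # Simplification: this sketch just rejects a transaction that
        # would need more gas than its limit allows.
        raise ValueError("out of gas: execution needs more than the limit")
    price_wei = gas_price_gwei * WEI_PER_GWEI
    return {
        "prepaid_wei": gas_limit * price_wei,
        "fee_wei": gas_used * price_wei,
        "refund_wei": (gas_limit - gas_used) * price_wei,
    }


# Illustrative numbers: a 50,000 gas limit, 21,000 gas used, 100 gwei per gas.
result = settle_gas(gas_limit=50_000, gas_used=21_000, gas_price_gwei=100)
fee_in_ether = result["fee_wei"] / WEI_PER_ETHER  # 0.0021 ETH
```

Working in integer wei and converting to ether only for display avoids rounding surprises.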

The Ethereum 1.0 network is a proof-of-work based network. This means that most of the computation during transactions comes from “miners”. Miners can be rewarded for their service in the form of ether block rewards and transaction fees from gas payments[6]. The gas prices are dependent on two things: the requested gas price of the user and the miner’s willingness to accept the gas price presented to them. When there are more transactions being requested by users, the gas prices rise as block-space becomes more scarce. As Ethereum has grown in popularity, the transaction rates have only gone up. This has led to congestion of the network, leading to issues with scalability. An infamous example of this is the CryptoKitties Congestion Crisis back in December of 2017.

Why is this relevant to Ethereum 2.0?

Ethereum 2.0 is bringing a series of improvements to the network that address network efficiency, scalability, sustainability, and versatility. One of the main problems Ethereum 2.0 seeks to solve is the “Scalability Trilemma”. This trilemma has three components: security, decentralization, and scalability. Achieving all three properties without compromising any of them is an open problem within the blockchain space. Some of the key new features of Ethereum 2.0 will be:

A move from Proof of Work to a Proof of Stake consensus mechanism
The introduction of shard chains into the network
The rolling out of EIP 1559 (Ethereum Improvement Proposal 1559)

Proof of Work vs. Proof of Stake

Proof of Work (PoW) requires miners to go through an intense race of trial and error to find the nonce for a block; blocks with a valid nonce can be added to the chain. This process can be very energy intensive. The security of this model rests on the fact that miners are not incentivized to start their own chain because it undermines the system (no pun intended). Here’s a visual representation of the security model for Bitcoin as the example blockchain:
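
The trial-and-error race for a nonce can be sketched as follows. This is a toy version under stated assumptions: it uses SHA-256 from Python’s standard library for simplicity (Ethereum’s actual mining algorithm is different), and the `mine` and `is_valid` functions, the header bytes and the difficulty value are hypothetical.

```python
import hashlib


def is_valid(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """A nonce is valid if the hash, read as a number, is below the target."""
    target = 1 << (256 - difficulty_bits)
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < target


def mine(header: bytes, difficulty_bits: int) -> int:
    """Try nonces 0, 1, 2, ... until one produces a valid hash."""
    nonce = 0
    while not is_valid(header, nonce, difficulty_bits):
        nonce += 1
    return nonce


# 12 bits of difficulty takes about 4,096 attempts on average.
nonce = mine(b"example block header", difficulty_bits=12)
```

Finding the nonce takes many attempts, but checking it takes a single hash; each extra bit of difficulty doubles the expected work, which is where the energy cost comes from.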

Blockchains in general rely on having a single state of truth, and users of the blockchain will always choose the longest chain or heaviest subchain (Ethereum uses something called GHOST). The hash of each block also depends on the hash of the previous block, making it easy to detect fraudulent changes to earlier transactions. In order for a miner to keep adding malicious, but still valid, blocks, they would need more than half of the network’s mining power to beat everyone else (the so-called 51% attack). As described earlier, mining is very energy intensive, meaning that one would need an obscenely large amount of computing power; the cost of the energy spent could outweigh any benefit from a 51% attack. This makes the likelihood of an attack on the network very low. The issue with this model, however, is that proof of work depends on a large energy expenditure, incentivized by the massive rewards for miners. The mining network must be so large that attacks are virtually impossible. A network must expend Y amount of energy in order to defend against an attacker of size Y; this 1:1 cost ratio for attack and defense is not in line with the ideals of the crypto space, because there is no advantage for the defender.
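
Here is a small sketch of why chaining hashes makes tampering easy to spot. The block layout (a dictionary with `prev_hash`, `payload` and `hash`) and the helper names are hypothetical and far simpler than a real block header; the point is only that every hash depends on the one before it.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, payload: str) -> str:
    """Hash of a block: depends on its payload AND the previous hash."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def build_chain(payloads: list) -> list:
    chain, prev = [], GENESIS_PREV
    for payload in payloads:
        current = block_hash(prev, payload)
        chain.append({"prev_hash": prev, "payload": payload, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain: list):
    """Return the index of the first inconsistent block, or None if all match."""
    prev = GENESIS_PREV
    for index, block in enumerate(chain):
        recomputed = block_hash(block["prev_hash"], block["payload"])
        if block["prev_hash"] != prev or block["hash"] != recomputed:
            return index
        prev = block["hash"]
    return None


chain = build_chain(["tx batch 1", "tx batch 2", "tx batch 3"])
```

If someone edits the payload of the second block, `first_invalid_block` flags it; if they also recompute that block’s hash, the mismatch simply moves to the third block, so rewriting history means redoing every block that follows.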

This is where Ethereum 2.0’s Proof of Stake model shines. Ethereum 2.0 is moving to a Proof of Stake (PoS) model as the consensus mechanism for the network. This model enables a secure, decentralized consensus mechanism that needs less energy, and it breaks the 1:1 defense-attack cost ratio. Anyone who stakes at least 32 ETH can become a validator node. These nodes don’t need to mine blocks; they only need to create blocks when chosen and validate proposed blocks when they’re not. Validators are rewarded for creating new blocks and for attesting to proposed blocks; if they attest to a malicious block, they lose their stake. The security of this model comes from the financial commitment of those who are willing to provide security to the network and from the economic penalties for those who attempt to attack it. Game theory suggests that those with a large amount of skin in the game have no incentive to attack the system, because they would lose their stake as well as a source of passive income from validating blocks. Furthermore, if there is a successful 51% takeover of the PoS chain through majority collusion, which is highly unlikely, all the community needs to do is hard fork the chain and remove the malicious validators.
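
The staking rules above can be sketched as a toy model. Everything here is a simplification: the `ValidatorSet` class is hypothetical, the proposer is picked uniformly at random (an assumption, not the real selection procedure), and slashing removes the whole stake, as the paragraph describes.

```python
import random

MIN_STAKE_ETH = 32  # minimum deposit quoted in the article


class ValidatorSet:
    """Toy registry of validators and their stakes (hypothetical helper)."""

    def __init__(self) -> None:
        self.stakes = {}  # validator name -> staked ETH

    def deposit(self, name: str, amount_eth: int) -> None:
        if amount_eth < MIN_STAKE_ETH:
            raise ValueError(f"need at least {MIN_STAKE_ETH} ETH to validate")
        self.stakes[name] = amount_eth

    def pick_proposer(self, rng: random.Random) -> str:
        # Assumption: a uniform random choice among current validators.
        return rng.choice(sorted(self.stakes))

    def slash(self, name: str) -> int:
        """Remove a misbehaving validator and return the stake it lost."""
        return self.stakes.pop(name)


validators = ValidatorSet()
validators.deposit("alice", 32)
validators.deposit("bob", 32)
proposer = validators.pick_proposer(random.Random(2021))
```

The asymmetry is visible even in this toy: proposing a block costs almost nothing, while misbehaving costs the entire deposit.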

Shard Chains

With growing activity on the network, Ethereum needs to be able to handle more transactions per second without increasing the node size. Nodes are critical participating components which store and run the blockchain. Increasing the node size is not a practical solution because only those with expensive and powerful computers could run one. For scalability to be achieved, there must be more transactions per second combined with more nodes on the network; when there are more nodes on the network, there is more security as well. Sharding is the process of splitting a database horizontally to distribute the load. Within the context of Ethereum, sharding will help scale the network through the use of shard chains. Shard chains will reduce network congestion and increase the transaction rate. This offloading of data to multiple chains allows for better scalability, meaning more network participation; you will eventually be able to run Ethereum on personal devices such as a laptop or phone. This increases security because the network will be more decentralized, which makes the attack surface area smaller. Sharding will create a low barrier to entry for running clients on your own without relying on third-party services, which reduces points of failure in the network.
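
As a rough illustration of horizontal sharding in the general database sense, the sketch below assigns each transaction to a shard based on a hash of its sender. It is not the design of Ethereum’s shard chains; the `shard_for` and `partition` functions and the shard count of 4 are hypothetical.

```python
import hashlib


def shard_for(account: str, shard_count: int) -> int:
    """Map an account to a shard deterministically via a hash."""
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(transactions: list, shard_count: int) -> dict:
    """Group (sender, payload) pairs by the shard that owns the sender."""
    shards = {index: [] for index in range(shard_count)}
    for sender, payload in transactions:
        shards[shard_for(sender, shard_count)].append((sender, payload))
    return shards


pending = [("alice", "pay bob"), ("carol", "mint"), ("alice", "pay dave")]
shards = partition(pending, shard_count=4)
```

Each shard only has to store and process its own slice of the load, which is why the hardware requirement per node can stay low as total throughput grows.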

EIP 1559

EIP 1559 (Ethereum Improvement Proposal 1559) will change how users pay for transactions on Ethereum. In the current Ethereum model, users place bids for block space by submitting a gas price they are willing to pay, and miners pick the transactions they wish to include in the next block; miners usually choose transactions with a higher gas price to make the maximum profit from each block. These transaction bids can lead to users paying more than is needed for their transactions. When network congestion is high, fees are high, meaning miner incentive is increased. However, higher congestion does not mean that the demand for security rises at the same rate, so Ethereum ends up spending more on security than necessary. This is inefficient and has a negative impact on ETH holders because they are supporting these transactions.

EIP 1559 introduces two new concepts: the BaseFee and the miner tip. The BaseFee is the minimum fee required for a transaction to be included in a block. It is adjusted every block, by up to +/- 12.5% of the previous block’s BaseFee, based on network congestion; the adjustment lets the network reach equilibrium by responding to how full recent blocks have been. The miner tip is a separate fee that can be paid to incentivize miners to prioritize a transaction. EIP 1559 also introduces larger blocks, raising the maximum block capacity to 25M gas from 12.5M gas previously.
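
The BaseFee adjustment can be sketched in Python. The function below follows the update rule as I understand it from the EIP-1559 specification, using the 12.5M and 25M block figures quoted above as the target and the maximum; the function name and the example fee of 100 gwei are my own choices for illustration.

```python
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # 1/8 = 12.5% maximum change per block


def next_base_fee(parent_base_fee: int, parent_gas_used: int,
                  parent_gas_target: int) -> int:
    """Compute the next block's BaseFee (all fees in wei per gas)."""
    if parent_gas_used == parent_gas_target:
        return parent_base_fee  # block was exactly on target: no change
    if parent_gas_used > parent_gas_target:
        # Fuller than target: raise the fee in proportion to the excess.
        excess = parent_gas_used - parent_gas_target
        increase = max(
            parent_base_fee * excess // parent_gas_target
            // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + increase
    # Emptier than target: lower the fee in proportion to the shortfall.
    shortfall = parent_gas_target - parent_gas_used
    decrease = (parent_base_fee * shortfall // parent_gas_target
                // BASE_FEE_MAX_CHANGE_DENOMINATOR)
    return parent_base_fee - decrease


GWEI = 10**9
TARGET, MAXIMUM = 12_500_000, 25_000_000
after_full_block = next_base_fee(100 * GWEI, MAXIMUM, TARGET)   # 112.5 gwei
after_empty_block = next_base_fee(100 * GWEI, 0, TARGET)        # 87.5 gwei
```

A completely full block moves the fee up by the full 12.5% and an empty one moves it down by 12.5%; anything in between changes it proportionally.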

The BaseFee will no longer be paid to the miners; instead it will be burned, making transaction fees more predictable and reducing transaction time. The miners will make their profit from the miner tip and block rewards.

This is a major shift in the current transaction paradigm on Ethereum because it will smooth out the network fees, making Ethereum more scalable while still remaining secure. Let’s look at an example of the network at high congestion on Ethereum 2.0.

When there is a surge in network activity (i.e., high transaction demand from users), the BaseFee will be increased. Eventually the BaseFee will be high enough that it disincentivizes users from transacting, bringing the network utilization back to equilibrium. You may ask why miners cannot artificially increase the BaseFee. The answer is the burning of the BaseFee. Under EIP 1559, miners will only be able to profit from block rewards and miner tips. When the BaseFee becomes too high, users are less willing to pay the BaseFee as well as the miner tip. Hence, miners have no incentive to raise the BaseFee artificially. In addition, users are able to specify a Fee Cap; this is the maximum they want to pay for a transaction. When EIP 1559 is launched, ETH holders will still pay block subsidies, but when network congestion is high, they will not need to pay for security because the demand of transactors pays for it. They are implicitly refunded through the burning of the BaseFee because this will increase the value of the ETH they still hold.
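
How the BaseFee, the miner tip and the Fee Cap interact for a single transaction can be sketched as follows. The `settle_1559` function and its numbers are hypothetical; the rule that the tip is limited by whatever room the Fee Cap leaves above the BaseFee reflects my reading of the EIP-1559 specification.

```python
def settle_1559(base_fee: int, fee_cap: int, max_tip: int, gas_used: int):
    """Split what a user pays into the burned part and the miner's part.

    All per-gas amounts are in gwei. Returns None when the Fee Cap is below
    the BaseFee, in which case the transaction cannot be included yet.
    """
    if fee_cap < base_fee:
        return None
    # The tip can never push the total per-gas price above the Fee Cap.
    tip_per_gas = min(max_tip, fee_cap - base_fee)
    return {
        "burned": base_fee * gas_used,
        "miner_tip": tip_per_gas * gas_used,
        "user_pays": (base_fee + tip_per_gas) * gas_used,
    }


# Illustrative numbers: BaseFee 100 gwei, Fee Cap 150 gwei, tip 2 gwei.
receipt = settle_1559(base_fee=100, fee_cap=150, max_tip=2, gas_used=21_000)
```

Because the burned part goes to no one, a miner who stuffs blocks to push the BaseFee up only shrinks the room left for tips.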

What does this mean for Ethereum users?

Security that is both economically and energy efficient

The PoS model to which Ethereum is migrating offers an economically efficient, flexible way to validate blocks by aligning the economic incentives of validator nodes; in the words of Vitalik Buterin, “security comes from putting up economic value-at-loss”. Since PoS is also highly energy efficient, it promotes more geographic decentralization than PoW; GPU mining and ASIC mining are both very easy to detect because they require large amounts of electricity, expensive hardware purchases and large warehouses. In contrast, PoS staking can be done on a regular laptop; this also makes PoS more censorship resistant, as it is harder to track validator nodes than to track large PoW mining operations. Although some argue that PoS is a “rich get richer” scheme, the reality is that the cost of staking (32 ETH) is comparatively low next to the costs of ASIC mining, which requires external resources and is even more in favor of the rich who can afford them. Furthermore, the rewards in PoS are quite low, about 0.5–2% of total ETH supply.

A Better Experience

Improvements in the transaction fee mechanism will lower costs and speed up transactions, giving users a better overall experience of the network. The importance of the user experience cannot be overstated. In order for there to be mass adoption, users must enjoy using the technology.

Economically Beneficial

In addition to a better UX, the burning of the BaseFee also internalizes revenue that currently goes to miners, meaning ETH will become a productive asset since it is a required consumable for transactions. Moreover, the burning of ETH lowers inflation, potentially making ETH a deflationary asset and legitimizing it as a store of value. Ether will also become scarcer as a liquid asset; most of the ETH will either be locked up in smart contracts and DeFi or be burned through transactions. For more information on the economic implications of ETH, listen to this podcast here.

Developing at Scale

Lower transaction fees and chain sharding bode well for the Ethereum developer community. These new features will enable developers to build decentralized protocols at scale without sacrificing security, and bring more innovation to the space. Also, low transaction fees further democratize who can build on Ethereum because the cost of deploying contracts to the network and running them will be much lower.

Closing Thoughts

There are tradeoffs to these changes and I recommend that you research them in order to have a well-rounded perspective. There are arguments and differing opinions around some of the concepts described in this article. Regardless, it is an exciting time for the Ethereum community and the Crypto-Blockchain community writ large. This article only scratches the surface of the space and I encourage readers to “go down the rabbit hole” themselves and see what they find. There is constant innovation in the space and the more you learn, the more you will love it.