Scalability

One common concern about Ethereum is the issue of scalability. Like Bitcoin, Ethereum suffers from the flaw that every transaction needs to be processed by every node in the network. With Bitcoin, the size of the current blockchain rests at about 15 GB, growing by about 1 MB per hour. If the Bitcoin network were to process Visa's 2000 transactions per second, it would grow by 1 MB per three seconds (1 GB per hour, 8 TB per year). Ethereum is likely to suffer a similar growth pattern, worsened by the fact that there will be many applications on top of the Ethereum blockchain instead of just a currency as is the case with Bitcoin, but ameliorated by the fact that Ethereum full nodes need to store just the state instead of the entire blockchain history.

The problem with such a large blockchain size is centralization risk. If the blockchain size increases to, say, 100 TB, then the likely scenario would be that only a very small number of large businesses would run full nodes, with all regular users using light SPV nodes. In such a situation, there arises the potential concern that the full nodes could band together and all agree to cheat in some profitable fashion (e.g. change the block reward, give themselves BTC). Light nodes would have no way of detecting this immediately. Of course, at least one honest full node would likely exist, and after a few hours information about the fraud would trickle out through channels like Reddit, but at that point it would be too late: it would be up to the ordinary users to organize an effort to blacklist the given blocks, a massive and likely infeasible coordination problem on a similar scale to that of pulling off a successful 51% attack. In the case of Bitcoin, this is currently a problem, but there exists a blockchain modification suggested by Peter Todd which will alleviate this issue.

In the near term, Ethereum will use two additional strategies to cope with this problem. First, because of the blockchain-based mining algorithms, at least every miner will be forced to be a full node, creating a lower bound on the number of full nodes. Second and more importantly, however, we will include an intermediate state tree root in the blockchain after processing each transaction. Even if block validation is centralized, as long as one honest verifying node exists, the centralization problem can be circumvented via a verification protocol. If a miner publishes an invalid block, that block must either be badly formatted, or the state S[n] is incorrect. Since S[0] is known to be correct, there must be some first state S[i] that is incorrect where S[i-1] is correct. The verifying node would provide the index i, along with a "proof of invalidity" consisting of the subset of Patricia tree nodes needed to process APPLY(S[i-1],TX[i]) -> S[i]. Other nodes would be able to use those tree nodes to run that part of the computation, and see that the S[i] generated does not match the S[i] provided.
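The search for the first incorrect state S[i] can be sketched in Python. This is only an illustration under loud assumptions: the state is a plain dictionary of balances rather than a Patricia tree, no Merkle proofs are involved, and the names apply_tx and first_invalid_index are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, int]      # assumption: account name -> balance
Tx = Tuple[str, str, int]   # assumption: (sender, recipient, amount)


def apply_tx(state: State, tx: Tx) -> State:
    """Toy stand-in for APPLY(S[i-1], TX[i]) -> S[i]."""
    sender, recipient, amount = tx
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def first_invalid_index(genesis: State, txs: List[Tx],
                        claimed: List[State]) -> Optional[int]:
    """Return the first i (1-based) with APPLY(S[i-1], TX[i]) != S[i].

    claimed[i-1] holds the published intermediate state S[i];
    S[0] (genesis) is taken to be correct. Returns None if every
    published state checks out.
    """
    prev = genesis
    for i, (tx, published) in enumerate(zip(txs, claimed), start=1):
        if apply_tx(prev, tx) != published:
            return i
        prev = published
    return None


genesis = {"a": 10, "b": 0}
txs = [("a", "b", 3), ("b", "a", 1)]
honest = [{"a": 7, "b": 3}, {"a": 8, "b": 2}]
forged = [{"a": 7, "b": 3}, {"a": 8, "b": 102}]  # second state is inflated

print(first_invalid_index(genesis, txs, honest))  # None
print(first_invalid_index(genesis, txs, forged))  # 2
```

In the scheme described above, a light node would not replay the whole block like this; it would only re-run the single step i using the tree nodes supplied in the proof of invalidity.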

Another, more sophisticated, attack would involve the malicious miners publishing incomplete blocks, so that the full information needed to determine whether or not blocks are valid does not even exist. The solution to this is a challenge-response protocol: verification nodes issue "challenges" in the form of target transaction indices, and upon receiving a challenge a light node treats the block as untrusted until another node, whether the miner or another verifier, provides a subset of Patricia nodes as a proof of validity.


Blockchain Scalability

One of the largest problems facing the cryptocurrency space today is the issue of scalability. It is an often repeated claim that, while mainstream payment networks process something like 2000 transactions per second, in its current form the Bitcoin network can only process seven. On a fundamental level, this is not strictly true; simply by changing the block size limit parameter, Bitcoin can easily be made to support 70 or even 7000 transactions per second. However, if Bitcoin does get to that scale, we run into a problem: it becomes impossible for the average user to run a full node, and full nodes become relegated only to that small collection of businesses that can afford the resources. Because mining only requires the block header, even miners can (and in practice most do) mine without downloading the blockchain.

The main concern with this is trust: if there are only a few entities capable of running full nodes, then those entities can conspire and agree to give themselves a large number of additional bitcoins, and there would be no way for other users to see for themselves that a block is invalid without processing an entire block themselves. Although such a fraud may potentially be discovered after the fact, power dynamics may create a situation where the default action is to simply go along with the fraudulent chain (and authorities can create a climate of fear to support such an action) and there is a coordination problem in switching back. Thus, at the extreme, Bitcoin with 7000 transactions per second has security properties that are essentially similar to a centralized system like Paypal, whereas what we want is a system that handles 7000 TPS with the same levels of decentralization that cryptocurrency originally promised to offer.

Ideally, a blockchain design should exist that works, and has similar security properties to Bitcoin with regard to 51% attacks, that functions even if no single node processes more than 1/n of all transactions where n can be scaled up to be as high as necessary, although perhaps at the cost of linearly or quadratically growing secondary inefficiencies and convergence concerns. This would allow the blockchain architecture to process an arbitrarily high number of TPS but at the same time retain the same level of decentralization that Satoshi envisioned.

Problem: create a blockchain design that maintains Bitcoin-like security guarantees, but where the maximum size of the most powerful node that needs to exist for the network to keep functioning is substantially sublinear in the number of transactions.

Scalability in bitcoin

VISA handles on average around 2,000 transactions per second (tps), so call it a daily peak rate of 4,000 tps. It has a peak capacity of around 56,000 transactions per second [1] (https://usa.visa.com/dam/VCOM/download/corporate/media/visa-fact-sheet-Jun2015.pdf); however, it never actually uses more than about a third of this, even during peak shopping periods. [2]

PayPal, in contrast, handled around 10 million transactions per day for an average of 115 tps in late 2014. [3]

Let's take 4,000 tps as a starting goal. Obviously, if we want Bitcoin to scale to all economic transactions worldwide, including cash, it would be a lot higher than that, perhaps more in the region of a few hundred thousand tps. And the need to be able to withstand DoS attacks (which VISA does not have to deal with) implies we would want to scale far beyond the standard peak rates. Still, picking a target lets us do some basic calculations, even if it's a little arbitrary.

Today the Bitcoin network is restricted to a sustained rate of 7 tps due to the bitcoin protocol restricting block sizes to 1MB.


Scalability, Part 1: Building on Top

Ethereum Scalability and Decentralization Updates

How do I compare the “scalability” capabilities between ethereum and bitcoin?

Let me try to explain.

Bitcoin Block Size Limit

The Bitcoin side is pretty simple to understand. The bitcoin blockchain has a hardcoded block size limit of 1 MiB. With an average transaction size of around 600 B and a target block time of 10 minutes, you get

1024 * 1024 B / 600 B = 1747.6 transactions per block,

which translates down to

1747.6 / 600 s = 2.9127 transactions per second.

So we are at around 3 transactions per second in practice. If you reduce the average transaction size, it is possible in theory to reach higher rates, perhaps 7 transactions per second. That said, nothing in Bitcoin scales unless the network reaches consensus on increasing the block size or on some other scalability fix.
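The arithmetic above can be reproduced with a few lines of Python. This is a rough sketch, not a measurement: bitcoin_tps is a hypothetical helper, and the 250-byte transaction size in the second call is an assumption chosen only to show where a figure close to 7 tps can come from.

```python
def bitcoin_tps(block_size_bytes: int = 1024 * 1024,
                avg_tx_bytes: int = 600,
                block_time_s: int = 600) -> float:
    """Throughput implied by a block size limit, an average
    transaction size and a target block time."""
    tx_per_block = block_size_bytes / avg_tx_bytes
    return tx_per_block / block_time_s


print(round(bitcoin_tps(), 4))                  # 2.9127
print(round(bitcoin_tps(avg_tx_bytes=250), 2))  # 6.99 (assumed 250 B transactions)
```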

Ethereum Block Gas Limit

Ethereum introduces a different concept: it has no transaction or block size limit, but a gas limit. Gas is the unit in which fee costs are calculated. Every transaction, every contract execution and every data storage operation on the blockchain costs gas. Every block has a block gas limit, by default 4,712,388 gas, which caps the total gas that can be spent in that block.

Let's assume an average transaction cost of 21,000 gas, which is what a default value transfer requires, and a target block time of 15 seconds. By default, we then have the following:

4712388 / 21000 = 224.4 transactions per block

which translates down to

224.4 / 15 = 14.96 transactions per second.

So, at the current block gas limit and block time, the default possible throughput is about 15 transactions per second. If the required gas per transaction increases, the throughput is correspondingly lower.
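The same calculation as a small Python sketch. The helper name ethereum_tps is hypothetical, and the 50,000 gas figure in the second call is an assumed cost for a heavier transaction, used only to illustrate how throughput drops as gas per transaction rises.

```python
def ethereum_tps(block_gas_limit: int = 4_712_388,
                 gas_per_tx: int = 21_000,
                 block_time_s: int = 15) -> float:
    """Throughput implied by a block gas limit, an average gas cost
    per transaction and a target block time."""
    tx_per_block = block_gas_limit / gas_per_tx
    return tx_per_block / block_time_s


print(round(ethereum_tps(), 2))                   # 14.96
print(round(ethereum_tps(gas_per_tx=50_000), 2))  # 6.28 (assumed heavier transactions)
```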

But since you asked about scalability, the yellow paper specifies in equations 44-46 how the block gas limit scales. This basically means the block gas limit can increase by a factor of up to 1+1/1024 each block, or:

(1+1/1024)^5760 = 276.51227240329152144804

which is a scaling factor of about 276 per day (5760 being the number of 15-second blocks in a day). In theory, Ethereum scales indefinitely; in practice, the early Olympic testnet was able to stress the network to levels of around 25 transactions per second. And this is only the status quo. See also this post about transaction size.
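The compounding can be checked in Python. This sketch assumes every miner raises the limit by the full 1/1024 on every block and ignores integer rounding of the limit, so it gives an upper bound rather than a prediction; max_gas_limit_growth is a hypothetical name.

```python
def max_gas_limit_growth(blocks: int, step: float = 1 / 1024) -> float:
    """Largest factor by which the block gas limit could grow over
    `blocks` blocks if every block raises it by the full step."""
    return (1 + step) ** blocks


blocks_per_day = 24 * 60 * 60 // 15  # 5760 blocks at a 15-second block time
print(blocks_per_day)                                   # 5760
print(round(max_gas_limit_growth(blocks_per_day), 2))   # 276.51
```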

How does Ethereum deal with blockchain scalability?

Ethereum blocks are limited by the block gas limit (currently around 4.7 million gas). Each transaction specifies how much gas it's willing to spend. A block can only fit as much as the block gas limit, so if someone specifies a transaction of 4.7 million gas, a miner cannot fit any more transactions in that block.

So you can see some differences from Bitcoin. Another important one is the dynamic behavior: every time a block is mined, the miner of that block can nudge the block gas limit (BGL) either higher or lower than the previous block gas limit, by up to 1/1024 of its value. For example, if the current BGL is 1024, the miner of the next block can set the BGL to be as low as 1023, as high as 1025, or somewhere in between.
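A minimal Python sketch of that range, following the description above and treating both ends as inclusive. That inclusiveness, the use of integer division, and the name gas_limit_bounds are assumptions; actual client rules may differ at the edges and may also enforce a minimum gas limit.

```python
from typing import Tuple


def gas_limit_bounds(parent_gas_limit: int) -> Tuple[int, int]:
    """Lowest and highest block gas limit the next miner could choose,
    allowing a nudge of up to 1/1024 of the parent's limit."""
    delta = parent_gas_limit // 1024
    return parent_gas_limit - delta, parent_gas_limit + delta


print(gas_limit_bounds(1024))       # (1023, 1025)
print(gas_limit_bounds(4_712_388))  # (4707787, 4716989)
```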

Other scalability challenges:

The above is about on-chain scalability. A complementary approach to scalability is to do things off the blockchain while still being able to use the blockchain when necessary. Examples:

For more current discussions, see the live research and EIP channels. And keep an eye on the Ethereum Improvement Proposals.

EIP 103 (Serenity): Blockchain rent

Ethereum Announces “Unlimited” Scalability Roadmap

How many transactions per second are the devs planning for?

scalability_paper

Toward a 12-second Block Time

Number of transactions per second for payments

7 Transactions Per Second? Really?

Where can I find transactions per second statistics?

How are we comparing 'transactions per second'?

Ethereum difficulty adjustment algorithm

How does the Ethereum Homestead difficulty adjustment algorithm work?

Summary

If the timestamp difference (block_timestamp - parent_timestamp) is:

  • < 10 seconds, the difficulty is adjusted upwards by parent_diff // 2048 * 1
  • 10 to 19 seconds, the difficulty is left unchanged
  • >= 20 seconds, the difficulty is adjusted downwards in proportion to the timestamp difference, from parent_diff // 2048 * -1 to a max downward adjustment of parent_diff // 2048 * -99

This is consistent with the statement from ethdocs.org - Ethereum Homestead - The Homestead Release:

EIP-2/4 eliminates the excess incentive to set the timestamp difference to exactly 1 in order to create a block that has slightly higher difficulty and that will thus be guaranteed to beat out any possible forks. This guarantees to keep block time in the 10-20 range and according to simulations restores the target 15 second blocktime (instead of the current effective 17s).

And from Ethereum Network Status, the average block time currently is 13.86 seconds.


Details

The difficulty adjustment formula:

block_diff = parent_diff + parent_diff // 2048 * 
max(1 - (block_timestamp - parent_timestamp) // 10, -99) + 
int(2**((block.number // 100000) - 2))

where // is the integer division operator, e.g. 6 // 2 = 3, 7 // 2 = 3, 8 // 2 = 4.

can be broken down into the following parts:

Sub-formula B - The difficulty bomb part, which increases the difficulty exponentially every 100,000 blocks.

+ int(2**((block.number // 100000) - 2))

The difficulty bomb won't be discussed here as it is already covered in the following Q&As:

Sub-formula A - The difficulty adjustment part, which increases or decreases the block difficulty depending on the time between the current block timestamp and the parent block timestamp:

+ parent_diff // 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99)

Subformula A1 - Let's separate out part of Subformula A

+ max(1 - (block_timestamp - parent_timestamp) // 10, -99)

and consider what the adjustment effect is due to the timestamp difference between the current block and the parent block:

When (block_timestamp - parent_timestamp) is

  • 0, 1, 2, ..., 8, 9 seconds
    • A1 evaluates to max(1 - 0, -99) = 1
    • A evaluates to +parent_diff // 2048 * 1
  • 10, 11, 12, ..., 18, 19 seconds
    • A1 evaluates to max(1 - 1, -99) = 0
    • A evaluates to +parent_diff // 2048 * 0
  • 20, 21, 22, ..., 28, 29 seconds
    • A1 evaluates to max(1 - 2, -99) = -1
    • A evaluates to +parent_diff // 2048 * -1
  • 30, 31, 32, ..., 38, 39 seconds
    • A1 evaluates to max(1 - 3, -99) = -2
    • A evaluates to +parent_diff // 2048 * -2
  • 1000, 1001, 1002, ..., 1008, 1009 seconds
    • A1 evaluates to max(1 - 100, -99) = -99
    • A evaluates to +parent_diff // 2048 * -99
  • > 1009 seconds
    • A1 evaluates to max(1 - {number greater than 100}, -99) = -99
    • A evaluates to +parent_diff // 2048 * -99

So, if the timestamp difference (block_timestamp - parent_timestamp) is:

  • < 10 seconds, the difficulty is adjusted upwards by parent_diff // 2048 * 1
  • 10 to 19 seconds, the difficulty is left unchanged
  • >= 20 seconds, the difficulty is adjusted downwards in proportion to the timestamp difference, from parent_diff // 2048 * -1 to a max downward adjustment of parent_diff // 2048 * -99
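The breakdown above can be written as a short Python sketch of the formula. The function names are hypothetical, and unlike the client source code this sketch does not clamp the result to a minimum difficulty.

```python
def homestead_adjustment(parent_diff: int, block_timestamp: int,
                         parent_timestamp: int) -> int:
    """Sub-formula A: the timestamp-driven difficulty adjustment."""
    a1 = max(1 - (block_timestamp - parent_timestamp) // 10, -99)
    return parent_diff // 2048 * a1


def homestead_difficulty(parent_diff: int, block_timestamp: int,
                         parent_timestamp: int, block_number: int) -> int:
    """Full formula: parent difficulty + adjustment + difficulty bomb."""
    bomb = int(2 ** ((block_number // 100000) - 2))
    return (parent_diff
            + homestead_adjustment(parent_diff, block_timestamp, parent_timestamp)
            + bomb)


parent_diff = 2_048_000  # chosen so that parent_diff // 2048 == 1000
for delta in (5, 15, 25, 35, 2000):
    print(delta, homestead_adjustment(parent_diff, 1000 + delta, 1000))
# 5 1000
# 15 0
# 25 -1000
# 35 -2000
# 2000 -99000
```

With a parent difficulty of 2,048,000 each adjustment step is exactly 1000, which makes the three regimes (upward, unchanged, downward with a cap of -99 steps) easy to read off.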

The Source Code

From Go Ethereum - core/block_validator.go, lines 264-311:

func calcDifficultyHomestead(time, parentTime uint64, parentNumber, parentDiff *big.Int) *big.Int {
    // https://github.com/ethereum/EIPs/blob/master/EIPS/eip-2.mediawiki
    // algorithm:
    // diff = (parent_diff +
    //         (parent_diff / 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99))
    //        ) + 2^(periodCount - 2)
    
    bigTime := new(big.Int).SetUint64(time)
    bigParentTime := new(big.Int).SetUint64(parentTime)
    
    // holds intermediate values to make the algo easier to read & audit
    x := new(big.Int)
    y := new(big.Int)
    
    // 1 - (block_timestamp -parent_timestamp) // 10
    x.Sub(bigTime, bigParentTime)
    x.Div(x, big10)
    x.Sub(common.Big1, x)
    
    // max(1 - (block_timestamp - parent_timestamp) // 10, -99)))
    if x.Cmp(bigMinus99) < 0 {
        x.Set(bigMinus99)
    }
    
    // (parent_diff + parent_diff // 2048 * max(1 - (block_timestamp - parent_timestamp) // 10, -99))
    y.Div(parentDiff, params.DifficultyBoundDivisor)
    x.Mul(y, x)
    x.Add(parentDiff, x)
    
    // minimum difficulty can ever be (before exponential factor)
    if x.Cmp(params.MinimumDifficulty) < 0 {
        x.Set(params.MinimumDifficulty)
    }
    
    // for the exponential factor
    periodCount := new(big.Int).Add(parentNumber, common.Big1)
    periodCount.Div(periodCount, ExpDiffPeriod)
    
    // the exponential factor, commonly referred to as "the bomb"
    // diff = diff + 2^(periodCount - 2)
    if periodCount.Cmp(common.Big1) > 0 {
        y.Sub(periodCount, common.Big2)
        y.Exp(common.Big2, y, nil)
        x.Add(x, y)
    }
    
    return x
}

Change difficulty adjustment to target mean block time including uncles

Specification

Currently, the formula to compute the difficulty of a block includes the following logic:

adj_factor = max(1 - ((timestamp - parent.timestamp) // 10), -99)
child_diff = int(max(parent.difficulty + (parent.difficulty // BLOCK_DIFF_FACTOR) * adj_factor, min(parent.difficulty, MIN_DIFF)))
...

If block.number >= METROPOLIS_FORK_BLKNUM, we change the first line to the following:

adj_factor = max(1 + len(parent.uncles) - ((timestamp - parent.timestamp) // 9), -99)

Specification (1b)

adj_factor = max((2 if len(parent.uncles) else 1) - ((timestamp - parent.timestamp) // 9), -99)

Rationale

This new formula ensures that the difficulty adjustment algorithm targets a constant average rate of blocks produced including uncles, and so ensures a highly predictable issuance rate that cannot be manipulated upward by manipulating the uncle rate. The formula can be fairly easily seen to be (to within a tolerance of ~3/4194304) mathematically equivalent to assuming that a block with k uncles is equivalent to a sequence of k+1 blocks that all appear with the exact same timestamp, and this is likely the simplest possible way to accomplish the desired effect.

Changing the denominator from 10 to 9 ensures that the block time remains roughly the same (in fact, it should decrease by ~3% given the current uncle rate of 7%).

(1b) accomplishes almost the same effect but has the benefit that it depends only on the block header (as you can check the uncle hash against the blank hash) and not the entire block.
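The two proposed variants of adj_factor can be compared side by side in Python. This is a sketch of the formulas quoted in the specification only; the function names and the plain integer and boolean arguments are assumptions made for illustration.

```python
def adj_factor_metropolis(timestamp: int, parent_timestamp: int,
                          parent_uncle_count: int) -> int:
    """Proposed rule: counts every uncle of the parent block."""
    return max(1 + parent_uncle_count - (timestamp - parent_timestamp) // 9, -99)


def adj_factor_metropolis_1b(timestamp: int, parent_timestamp: int,
                             parent_has_uncles: bool) -> int:
    """Variant (1b): only needs to know whether the parent has any uncles."""
    base = 2 if parent_has_uncles else 1
    return max(base - (timestamp - parent_timestamp) // 9, -99)


# Same 18-second gap, parent with two uncles: the variants differ by one step.
print(adj_factor_metropolis(1018, 1000, 2))        # 1
print(adj_factor_metropolis_1b(1018, 1000, True))  # 0
```

The two rules agree whenever the parent has zero or one uncle and diverge only when it has two, which matches the claim that (1b) accomplishes almost the same effect.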


Is it possible to change the block target time?

What was the first block mined with Homestead?

How is the Mining Difficulty calculated on Ethereum?

How do I decrease the difficulty on a private testnet?

How to make Ethereum mining difficulty static for a private chain?

eip-2.mediawiki

Genesis block Explanation

mixhash A 256-bit hash which proves, combined with the nonce, that a sufficient amount of computation has been carried out on this block: the Proof-of-Work (PoW). The combination of nonce and mixhash must satisfy a mathematical condition described in the Yellowpaper, 4.3.4. Block Header Validity, (44). It allows one to verify that the Block has really been cryptographically mined and thus, from this aspect, is valid.

nonce A 64-bit hash which proves, combined with the mixhash, that a sufficient amount of computation has been carried out on this block: the Proof-of-Work (PoW). The combination of nonce and mixhash must satisfy a mathematical condition described in the Yellowpaper, 4.3.4. Block Header Validity, (44), and allows one to verify that the Block has really been cryptographically mined and thus, from this aspect, is valid. The nonce is the cryptographically secure mining proof-of-work that proves beyond reasonable doubt that a particular amount of computation has been expended in the determination of this token value. (Yellowpaper, 11.5. Mining Proof-of-Work).

difficulty A scalar value corresponding to the difficulty level applied during the discovery of this block's nonce. It defines the mining Target, which can be calculated from the previous block’s difficulty level and the timestamp. The higher the difficulty, the more calculations a Miner must statistically perform to discover a valid block. This value is used to control the Block generation time of a Blockchain, keeping the Block generation frequency within a target range. On the test network, we keep this value low to avoid waiting during tests, since the discovery of a valid Block is required to execute a transaction on the Blockchain.

alloc Allows defining a list of pre-filled wallets. That’s an Ethereum specific functionality to handle the “Ether pre-sale” period. Since we can mine local Ether quickly, we don’t use this option.

coinbase The 160-bit address to which all rewards (in Ether) collected from the successful mining of this block have been transferred. They are a sum of the mining reward itself and the Contract transaction execution refunds. Often named “beneficiary” in the specifications, sometimes “etherbase” in the online documentation. This can be anything in the Genesis Block since the value is set by the setting of the Miner when a new Block is created.

timestamp A scalar value equal to the reasonable output of the Unix time() function at this block's inception. This mechanism enforces a homeostasis in terms of the time between blocks. A smaller period between the last two blocks results in an increase in the difficulty level and thus additional computation required to find the next valid block. If the period is too large, the difficulty, and with it the expected time to the next block, is reduced. The timestamp also allows verifying the order of blocks within the chain (Yellowpaper, 4.3.4. (43)).

parentHash The Keccak 256-bit hash of the entire parent block header (including its nonce and mixhash). Pointer to the parent block, thus effectively building the chain of blocks. In the case of the Genesis block, and only in this case, it’s 0.

extraData An optional free, but max. 32-byte long space to conserve smart things for ethernity. :)

gasLimit A scalar value equal to the current chain-wide limit of Gas expenditure per block. High in our case to avoid being limited by this threshold during tests. Note: this does not indicate that we should not pay attention to the Gas consumption of our Contracts.

difficulty: QUANTITY - integer of the difficulty for this block.
totalDifficulty: QUANTITY - integer of the total difficulty of the chain until this block.


Blocktime - Investigating Ethereum Blocktime with R

What is the measured distribution of block times since Homestead?

Pending Transactions

Is it normal pending transaction are removed after restart of geth?

How to make miner to mine only when there are Pending Transactions?
