More work on docs

Mariano Sorgente 2019-11-25 13:45:48 +09:00
parent 9110aedf66
commit 32f1262bb5
11 changed files with 322 additions and 0 deletions

1
.gitignore vendored

@ -10,6 +10,7 @@ __pycache__/
# Database
nohup.out
mongod.log*
# Keys and plot files
config/keys.yaml


@ -0,0 +1,96 @@
# Part 1: Consensus Algorithm Summary
The Chia blockchain and consensus algorithm aims to provide a more environmentally friendly,
decentralized, and secure alternative to proof of work and proof of stake, while
maintaining some of the key properties that make Nakamoto consensus desirable. The full
description of the algorithm can be seen in the [Chia Network greenpaper](https://www.chia.net/assets/ChiaGreenPaper.pdf).
The main idea is that mining nodes called **Farmers** (as opposed to Bitcoin's miners) use
their disk space to compete to find blocks. Whereas in Bitcoin, owning 5% of the hashpower, or
CPU power, allows you to win 5% of the blocks, in Chia having 5% of the allocated hard drive
*space* will allow you to win 5% of the blocks. Furthermore, coinbase rewards and fees in blocks
are given to the farmers (and/or pools).
This allocated hard drive space is stored in a series of files referred to as "plots".
Plots are lookup tables filled with
hashes and pointers, which allow farmers to efficiently look up and find proofs of space, cryptographic
proofs of storage of data. Farmers can create plots through the plotting process, which can take
days of intensive CPU and disk usage, but after that, they can farm with almost no CPU or electricity
usage. More information on proofs of space and the exact construction can be found [here](https://github.com/Chia-Network/proof-of-space).
Whenever a new block is propagated through the network, all farmers check their hard drives to
see if they have any very good proofs of space (analogous to someone checking their bingo card to
see if they've won), and propagate these proofs if they've found a lucky number. Farmers also propagate a block
with these proofs, and sign it with a private key that is associated with their plot.
In order to prevent grinding attacks, these proofs of space must be put through a proof of time as well.
Each block has one proof of space and one proof of time.
**Proofs of time**, or verifiable delay function proofs, are cryptographic proofs that a sequential
computation was performed on a given input, for a given number of iterations. These proofs of
time create time between blocks, and make generating an alternative blockchain take time. The nodes
that create proofs of time are called **Timelords**, and they don't get any rewards for doing this. The
idea is that they help the network operate, and as long as there is one honest timelord that is close
enough to the fastest timelord, then the grinding resistance is preserved.
A block which does not yet have a proof of time on it is called an **unfinished block**.
## Difficulty and Iterations
In Chia, there is also a difficulty parameter which is a constant factor that can increase or decrease
the number of iterations required for the proof of time, and therefore change expected block times.
The formula for the number of proof of time iterations required to finish a block is below:
```
Iterations required = 30 * ips + difficulty * -ln(0.H(qual_str)) / expected_plot_size(k)
```
* **ips** is the estimated iterations per second of the timelords in the network. This is calculated as
the total iterations in the previous epoch, divided by the total time elapsed in that epoch. Epochs are groups of
2048 blocks starting at block 0 (genesis). The ips is only used from block i+512 where i is the start of a
new epoch (similarly to difficulty).
Note that the 30 * ips term corresponds to a constant 30 seconds that is always necessary.
This allows farmers some time to fetch all their qualities and proofs from disk.
* **difficulty** is a number that is also changed every epoch, starting at block i+512 where i%2048 is 0.
The difficulty parameter allows us to increase or decrease the number of iterations, in order to get closer
to the target block time of 2.5 minutes. If blocks came much faster or much slower than expected in the
previous epoch, the difficulty is adjusted based on the formula in the greenpaper. Source code is in src/blockchain.py.
The difficulty is increased regardless of which component improved, the space or the time. If a large farmer
came into the network, blocks will come faster and thus increase the difficulty. Same thing if a faster
timelord joins the network.
* **-ln(x)** negative log is applied to a number between 0 and 1, to make proportion of space in the network
equal to the proportion of blocks won. This is computed using a simple Pade approximation for log, which is
very accurate for values close to 1.
* **0.H(x)** this is a conversion from a hash (32 byte sha256 output) to a value between 0 and 1, by taking
all the bits of the hash "xxxxx.." and representing a decimal in binary as 0.xxxxx... In the code, we
actually skip this step and use integers directly in the Pade approximation, to avoid floating points.
* **qual_str** is the quality string, a variable sized bytestring that can be efficiently retrieved from
the plot (it's actually a subset of the proof of space). This allows farmers to efficiently check whether
a proof of space is "good" (requires low iterations), without fetching the whole proof from disk. A quality
lookup takes around 50ms on a slow HDD, while a proof of space lookup takes around 500ms. Note that this
is like finding a good hash in Bitcoin, but does not require electricity, and can be done almost instantly.
* **k** is an integer between 30 and 59 which determines the size of a plot.
* **expected_plot_size** is a function from k to the number of bytes on disk to store a plot of that size.
Increasing k by one roughly doubles the size of the plot.
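
The formula and the definitions above can be sketched in Python as follows. This is a minimal illustration, not the code in src/blockchain.py: it uses the floating point `math.log` instead of the integer Pade approximation, and `expected_plot_size` here is a hypothetical placeholder that simply doubles with each increment of k rather than returning real byte counts. The function names `hash_to_unit_interval` and `iterations_required` are assumptions made for this example.

```python
import hashlib
import math

MIN_BLOCK_TIME_SECONDS = 30  # the constant "30 * ips" term of the formula


def expected_plot_size(k: int) -> int:
    # Hypothetical placeholder: only models "increasing k by one doubles
    # the size". The real function returns the number of bytes on disk.
    return 2 ** k


def hash_to_unit_interval(digest: bytes) -> float:
    # 0.H(x): read the hash bits "xxxxx.." as the binary fraction 0.xxxxx...
    # An all-zero digest is clamped to the smallest non-zero value so that
    # the logarithm below is always defined.
    bits = 8 * len(digest)
    return max(int.from_bytes(digest, "big"), 1) / 2 ** bits


def iterations_required(qual_str: bytes, k: int, difficulty: int, ips: int) -> int:
    # H is assumed to be sha256, as described for 0.H(x) above.
    x = hash_to_unit_interval(hashlib.sha256(qual_str).digest())
    variable_part = difficulty * -math.log(x) / expected_plot_size(k)
    return int(MIN_BLOCK_TIME_SECONDS * ips + variable_part)
```

A quality whose hash is close to 1 yields a small `-ln` value and therefore few iterations beyond the fixed 30 second minimum; a larger plot (higher k) divides the variable part further.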
Whenever a farmer sees a new block in the network, she retrieves the quality and computes the iterations,
which when divided by ips, yields the expected time to finalize that block. If this number is close enough
to the expected block time (2.5 minutes), the entire proof of space is fetched from disk, the unfinished
block is created, and then it is propagated through the network.
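
The farmer's decision described above can be sketched as follows. The text only says the expected time must be "close enough" to the target, so the cutoff used here (`max_seconds`, defaulting to twice the target block time) is a hypothetical value chosen for illustration, as are the function names.

```python
TARGET_BLOCK_TIME_SECONDS = 150  # 2.5 minutes


def expected_finish_seconds(iterations: int, ips: int) -> float:
    # Iterations divided by iterations per second gives the expected
    # time for a timelord to finalize the block.
    return iterations / ips


def should_fetch_full_proof(
    iterations: int,
    ips: int,
    max_seconds: float = 2 * TARGET_BLOCK_TIME_SECONDS,  # assumed cutoff
) -> bool:
    # Only pay for the slower full proof lookup when the quality is good
    # enough to plausibly win the block.
    return expected_finish_seconds(iterations, ips) <= max_seconds
```

For example, with `ips = 1000`, a quality that requires 100,000 iterations is expected to finish in 100 seconds and would be fetched, while one that requires 1,000,000 iterations would not.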
As farmers in the network receive new blocks, and find their qualities and proofs of space, they
will propagate them and timelords will receive them. The proofs of space on block i will determine
the number of iterations that the proof of time on block i must have, but it is not included in
the challenge for block i. Therefore the timelord can start iterating on their VDF as soon as block
i-1 is finalized by another timelord. More information is given in the [greenpaper](https://github.com/Chia-Network/proof-of-space).
...

69
docs/2-block-format.md Normal file

@ -0,0 +1,69 @@
# Chia Block Format
![Chia Blockchain](/docs/assets/block-format.png "Chia block format")
## Trunk and Foliage
Chia's blockchain is based on a trunk and a foliage. The trunk is canonical, and contains proofs of time and proofs of space. The foliage is not canonical, and contains the rest of the block header, block body, and transaction filter. Arrows in the diagram represent hash pointers, a hash of the data pointed to.
Light clients can download the trunk chain and the headers, and only download the body for blocks they are interested in.
## Canonical
The reason why the blockchain is separated into a trunk and a foliage chain is that if the contents of blocks affected the proofs of space for the next block, a computationally powerful attacker could grind by creating many block bodies and seeing which one results in the best proof of space. This would make the consensus algorithm very similar to proof of work.
Since proofs of space depend only on the previous block's proof of space and proof of time, a farmer gets only one proof attempt per block. Technically, the difficulty resets affect the number of iterations, and thus affect the trunk as well. That is why there is a delay in the block number at which difficulty resets come into play.
## Double signing
One of the results of this separation into two chains is that the foliage block can be rewritten.
The farmer that signed the foliage block can also sign an alternative block at the same height, with the
same key.
This problem can be solved by allowing the next block's farmer to submit a proof of the double signature (fraud proof),
which steals the rewards from the previous farmer.
While in the short term, double signatures can happen, clients can just wait for more confirmations, and as long as
one farmer did not double sign a block, such a deep reorg cannot happen.
## Formats
### [Header](/src/types/header.py)
* **header_data**: the contents of the block header.
* **harvester_signature**: BLS prepend signature by the plot public key. A prepend signature is a signature of a message with the pk prepended to the message.
### [Header data](/src/types/header.py)
* **prev_header_hash**: The hash of the header of the previous block.
* **timestamp**: Unix timestamp of block creation time.
* **filter_hash**: Hash of the transaction filter.
* **body_hash**: Hash of the body.
* **extension data**: hash of any extension data or extension block, useful for future updates.
### [Body](/src/types/body.py)
* **coinbase**: this is the transaction which pays out to the pool.
* **coinbase_signature**: signature by the pool public key.
* **fees_target_info**: this is where the fees will be paid out to (usually the farmer's public key), as well as the fee amount.
* **aggregated_signature**: aggregated BLS signature of all signatures in all transactions of this block.
* **solutions generator**: includes all spends in this block.
* **cost**: the cost of all puzzles.
### [Proof of Time](/src/types/proof_of_time.py)
* **challenge_hash**: the hash of the challenge, used to generate VDF group.
* **number_of_iterations**: the number of iterations that the VDF has to go through.
* **output**: the output of the VDF.
* **witness_type**: proof type of VDF.
* **witness**: VDF proof in bytes.
### [Proof of Space](/src/types/proof_of_space.py)
* **challenge_hash**: the hash of the challenge.
* **pool_pubkey**: public key of the pool, this key's signature is required to spend the coinbase transaction. The pool public key is included in the plot seed, and thus must be chosen before the plotting process. Farmers can solo-farm and use their own key as the pool key.
* **plot_pubkey**: public key of the plotter. This key signs the header, and thus allows the owner of the plot to choose their own blocks, as opposed to pools doing this.
* **size**: sometimes referred to as k, this is the plot size parameter.
* **proof**: proof of space of size k*64 bits.
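
As an illustration of the fields above, here is a minimal sketch of a proof of space record. It is not the class in /src/types/proof_of_space.py: the class name, the use of raw `bytes` for hashes and public keys, and the `has_expected_proof_length` helper are all assumptions made for this example. The only rule it encodes is the one stated above, that the proof has a size of k*64 bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProofOfSpaceSketch:
    challenge_hash: bytes  # hash of the challenge
    pool_pubkey: bytes     # pool public key (raw bytes assumed here)
    plot_pubkey: bytes     # plot public key (raw bytes assumed here)
    size: int              # the plot size parameter k
    proof: bytes           # proof of space, k*64 bits long

    def has_expected_proof_length(self) -> bool:
        # k*64 bits is k*8 bytes.
        return len(self.proof) * 8 == self.size * 64
```

For k = 32, a well-formed proof is therefore 256 bytes long.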
### [Challenge](/src/types/challenge.py)
* **prev_challenge_hash**: the hash of the previous challenge.
* **proof_of_space_hash**: hash of the proof of space.
* **proof_of_time_output_hash**: hash of the proof of time output.
* **height**: height of the block.
* **total_weight**: cumulative difficulty of all blocks from genesis, including this one.
* **total_iters**: cumulative VDF iterations from genesis, including this one.


@ -0,0 +1,79 @@
# Chia Network Architecture
![Chia Architecture](/docs/assets/chia-architecture.png "Chia architecture")
## Full Nodes
The core of the system is composed of full nodes. Full nodes have several responsibilities:
1. Maintain a copy of the blockchain
2. Validate the blockchain
3. Propagate new blocks, transactions, and proofs through the network, through the peer protocol
4. (Optional) Serve light clients (wallets) through the wallet protocol
5. (Optional) Communicate with farmers and timelords
Full nodes earn no rewards or fees, but they are important to maintain the consensus rules
and the security of the system. Running a full node allows a user to be confident about the
full state of the blockchain, and avoid trusting others.
Full nodes are always connected to another random set of full nodes in the network.
## Farmers
Chia's farmers are analogous to Bitcoin's miners. They earn block rewards and fees by trying to
create valid blocks before anyone else. Farmers don't maintain a copy of the blockchain, but they trust a full node to provide updates.
Farmers communicate with harvesters (individual machines that actually store the plots) through the harvester protocol.
The full node and the farmer communicate through the farmer protocol.
Users who want to solo farm can run the farmer, harvester and full node on the same machine.
Farmers operate by waiting for updates from a full node, which give them new challenge_hashes every time a new block is created.
Farmers then ask all harvesters for proof of space qualities. These qualities, based on the iterations formula, result in an expected block time.
The farmer can choose to fetch the full proofs of space, for those proofs which are expected to finish soon, from
the harvesters.
The full proofs can then be propagated to the full nodes, or sent to a pool as partials.
## Harvesters
Harvesters are individual machines controlled by a farmer.
In a large farming operation, a farmer may be connected to many harvesters.
Harvesters control the actual plot files by retrieving qualities or proofs from disk.
Each plot file corresponds to one plot, and for each random 32 byte challenge, there is an expected
value of one proof of space (although sometimes there are zero or more than one).
On standard HDDs, fetching a quality will take around 8 random disk seeks, or up to 50ms, whereas fetching a proof will take around 64 disk seeks, or up to 500ms.
For most challenges, qualities will be very low, so fetching the entire proof is not necessary.
There is an upper limit on the number of plots for each drive, since fetching the qualities takes time.
However, since there is a constant factor in the iterations formula (each block must have a proof of time of at least around 30 seconds), disk IO times should not be a problem.
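
A rough back-of-the-envelope version of this limit is shown below. It assumes that quality lookups on a single drive run one after another, that each plot needs one lookup per challenge at the 50ms upper bound, and that the whole 30 second minimum is available for lookups. None of these assumptions come from the source beyond the 50ms and 30 second figures, and the function name is hypothetical.

```python
MIN_PROOF_OF_TIME_MS = 30_000  # the ~30 second minimum proof of time
QUALITY_LOOKUP_MS = 50         # upper bound for one quality lookup on an HDD


def max_plots_per_drive(
    budget_ms: int = MIN_PROOF_OF_TIME_MS,
    lookup_ms: int = QUALITY_LOOKUP_MS,
) -> int:
    # Number of sequential quality lookups that fit into the time budget.
    return budget_ms // lookup_ms
```

Under these assumptions a single drive could serve on the order of 600 plots per challenge before quality lookups alone exceed the minimum block time.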
Finally, harvesters also maintain a private key for each plot.
This private key is what actually signs the block, allowing farmers/harvesters (as opposed to pools) to actually control the contents of a block.
## Timelords
Timelords support the network by creating sequential proofs of time (using Verifiable Delay Functions) on top of unfinished blocks.
Since this computation is sequential, very little energy is consumed, as opposed to proof of work systems where computation is parallelizable.
Timelords are also connected to full nodes.
Although timelords earn no rewards, there only needs to be one honest timelord online for the blockchain to move forward.
Someone who has a faster timelord can also earn more rewards from their space, since their blocks will finish slightly faster than those of other farmers.
Furthermore, an attacker with a much faster timelord can potentially 51% attack the network with less than 51% of the space, which is why open designs of VDF hardware are very important for the security of the blockchain.
## Pools
Pools allow farmers to smooth out their rewards by earning based on proof of space partials, as opposed to winning blocks.
Pool public keys must be embedded into the plots themselves, so a pool cannot be changed unless the entire plot is recreated.
Pools create and spend **coinbase transactions**, but in Chia's pool protocol they do not actually choose the contents of blocks.
This gives more power to farmers and thus decreases the influence of centralized pools.
Farmers periodically send partials, which contain a proof of space and a signature, to pools.
## Wallets
TODO: (matt)


@ -0,0 +1,40 @@
# Networking and Serialization
## Asynchronous
## CBOR serialization
CBOR is a serialization format (Concise Binary Object Representation, RFC 7049), which optimizes for
small code size and small message size.
## Streamable Format
Chia hashes objects using the simple streamable format.
The primitives are:
* Sized ints serialized in big endian format, e.g. uint64
* Sized bytes serialized in big endian format, e.g. bytes32
* BLSPublic keys serialized in bls format
* BLSSignatures serialized in bls format
An item is one of:
* streamable
* primitive
* List[item]
* Optional[item]
A streamable is an ordered group of items.
1. A streamable with fields 1..n is serialized by appending the serialization of each field.
2. A List is serialized into a 4 byte size prefix (number of items) and the serialization of each item
3. An Optional is serialized into a 1 byte prefix of 0x00 or 0x01, and if it's one, it's followed by the serialization of the item
This format can be implemented very easily, and allows us to hash objects like headers and proofs of space,
without complex serialization logic.
Most objects in the Chia protocol are stored and transmitted using the streamable format.
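
The three serialization rules can be sketched in a few lines of Python. This is an illustration of the rules as stated, not the codebase's implementation: the helper names are hypothetical, each helper takes items that are already serialized to bytes, and the 4 byte list prefix is assumed to be big endian like the sized ints (the text above does not say).

```python
from typing import List, Optional


def serialize_uint64(value: int) -> bytes:
    # Sized int primitive: 8 bytes, big endian.
    return value.to_bytes(8, "big")


def serialize_bytes32(value: bytes) -> bytes:
    # Sized bytes primitive: exactly 32 bytes, written as-is.
    if len(value) != 32:
        raise ValueError("bytes32 must be exactly 32 bytes")
    return value


def serialize_streamable(fields: List[bytes]) -> bytes:
    # Rule 1: append the serialization of each field, in order.
    return b"".join(fields)


def serialize_list(items: List[bytes]) -> bytes:
    # Rule 2: 4 byte item count (big endian assumed), then each item.
    return len(items).to_bytes(4, "big") + b"".join(items)


def serialize_optional(item: Optional[bytes]) -> bytes:
    # Rule 3: 0x00 if absent, otherwise 0x01 followed by the item.
    return b"\x00" if item is None else b"\x01" + item
```

Because every rule is a plain concatenation with fixed-size prefixes, the hash of an object can be taken directly over these bytes, for example `hashlib.sha256(serialize_streamable([...])).digest()`.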
## Handshake
## Ping Pong
## Introducer

0
docs/5-protocols.md Normal file

20
docs/7-glossary.md Normal file

@ -0,0 +1,20 @@
full node
farmer
harvester
timelord
wallet
pool
grinding attack
difficulty
verifiable delay function
proof of time
proof of space
classgroup
challenge
header
block
headerblock
trunk
foliage
quality string
quality

17
docs/README.md Normal file

@ -0,0 +1,17 @@
Chia Blockchain Documentation
The following series of documents describes the Chia Network Trunk (or consensus layer),
which is separate from the Foliage layer, which deals with coins, scripting,
and mempools.
Familiarity with either the Bitcoin or Ethereum protocols is assumed for this documentation.
The codebase is written in Python, with several performance sensitive components (signatures, proof of space,
and proof of time), written in C++.
1. Consensus algorithm summary
2. Block format
3. Chia network architecture
4. Networking and Serialization
5. Protocols
6. Codebase and Testing
7. Glossary

Binary file not shown.


Binary file not shown.
