Why does a thin node need to know if a transaction is present in a specific block's Merkle tree?

I understand what a Merkle root is and how Merkle proofs work.

It's all about thin nodes checking whether a specific transaction ID is in a specific block. So a thin node says: I have a transaction ID, say 12345, and I want to know whether this transaction is present in block 259,291.

Why would a thin node want to know information like this?

2 Answers

  • Why would a thin node want to know information like this?

    They quite possibly don't care about exactly which block the transaction was included in, but they most certainly care about knowing the transaction was included in the blockchain at all. It just so happens that there is no easy way to prove the latter without proving which block it is included in.

    To illustrate, here is a more concrete interaction (as used in BIP37; other protocols like Electrum's client-server protocol are similar), where F = full node (the server) and L = light client. This is obviously not an actual protocol dump, but conceptually it's close.

    L: Hi F, I'm a light client and am interested in transactions involving addresses A, B, and C. [filteradd]

    F: Okay. [nothing]

    L: I have synchronized blocks up to block hash H. [getheaders]

    F: Oh dear, you're behind, there are also blocks I and J. Here are their headers! [headers]

    L: Those headers look great, and their proof of work is valid. So eh... does I contain any transactions I care about? [getdata with MSG_FILTERED_BLOCK]

    F: Nope. [empty merkleblock]

    L: Okay... what about block J? [getdata with MSG_FILTERED_BLOCK]

    F: Oh yes, here is transaction T that involves one of your addresses! [merkleblock]

    L: That looks like a transaction that indeed pays me... but can you prove it's actually been included in the chain? You know, I don't trust you really and you could be giving me an invalid transaction, or one that spends money that has already moved in the chain. [nothing]

    F: Fine, here is a Merkle path that proves T is in fact included in block J. [part of the earlier merkleblock message, actually].

    So the point is that without the proof, F could be claiming that L got paid, while the transaction is invalid. Having it included in the chain makes that very expensive, assuming other nodes in the network actually enforce validity of transactions in the chain.

    There is a caveat, of course: F has no way to prove to L that block I didn't include any interesting transactions. The solution to that is either trusting the server or querying multiple servers; neither is very satisfying.

  • TL;DR: A Merkle branch provides a costly-to-fake indicator that a transaction is actually part of the blockchain. Forging a Merkle branch is costly enough to deter some attacks on low-value transactions, while validating one is trivial for the thin client.

    Full nodes and thin clients are distinguished by whether they independently process the whole blockchain. While full nodes verify the complete blockchain from the Genesis block to the chain-tip, thin clients usually keep only copies of the transactions they are interested in, and sometimes the headers of the blockchain. The headers alone are already a huge security boost, because block headers are themselves subject to a number of consensus rules, including the difficulty requirement. Creating fake block headers that pass even a cursory validation is very costly.

    One of the fields included in the 80-byte block header is the Merkle root. The Merkle root is a cryptographic commitment to all transactions included in the block. So, providing the Merkle branch that connects the transaction's leaf to the root is a strong indicator that the transaction existed in a block, especially if the block header is checked to meet the difficulty requirement, since a header that does so is computationally expensive to find. If the thin client tracks the chain of block headers, the headers tie together by linking to their predecessors, which means that an attacker not only needs to find a block with sufficient proof-of-work, but also has to do so in a limited time.

    If the transaction were not in a block, it would be computationally infeasible to find hashing partners that tie the transaction's hash to a specific Merkle root. Unfortunately, the thin client could still be lied to: either by a claim that the transaction is not included in a block (it's impossible to prove absence except by parsing all blocks), or by being shown a transaction that was part of a stale chain-tip, since superseded by a chain-tip that does not include it. Thin clients usually protect against these by asking multiple peers for information, and by waiting for multiple confirmations to make attacks more costly.
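
To make the "hashing partners" idea concrete, here is a minimal sketch of how a thin client might check a Merkle branch against the Merkle root from a header it already trusts. It assumes Bitcoin's double-SHA256 tree with the last hash duplicated on odd-sized levels, and it takes the branch as a plain list of sibling hashes plus the transaction's position in the block. It does not parse BIP37's partial Merkle tree encoding, and the function names (`dsha256`, `merkle_root`, `merkle_branch`, `verify_branch`) and the toy transaction IDs are made up for illustration.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash used for Bitcoin's Merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd-sized level duplicates its last hash."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(txids):
    """What a full node computes from all transaction IDs in a block."""
    level = list(txids)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(txids, index):
    """Sibling hashes ("hashing partners") from leaf `index` up to the root."""
    branch = []
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # the other half of this node's pair
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(txid, index, branch, root):
    """What the thin client does: rehash up the tree and compare to the root."""
    h = txid
    for sibling in branch:
        # The low bit of the index says whether we are the right or left child.
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root


# Toy block with five made-up transaction IDs (not real transactions).
txids = [dsha256(f"tx{i}".encode()) for i in range(5)]
root = merkle_root(txids)          # in practice, taken from the block header
branch = merkle_branch(txids, 3)   # supplied by the full node

print(verify_branch(txids[3], 3, branch, root))            # genuine transaction
print(verify_branch(dsha256(b"forged"), 3, branch, root))  # forged transaction
```

The client only needs the header's Merkle root and a number of hashes logarithmic in the block's transaction count, which is why validation is cheap. Note that a passing check only shows the transaction is committed to by that header; whether the header itself has valid proof of work and sits on the best chain has to be checked separately.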
