Skip to content

Implement BIP 340-342 validation (Schnorr/taproot/tapscript)#17977

Closed
sipa wants to merge 13 commits intobitcoin:masterfrom
sipa:taproot
Closed

Implement BIP 340-342 validation (Schnorr/taproot/tapscript)#17977
sipa wants to merge 13 commits intobitcoin:masterfrom
sipa:taproot

Conversation

@sipa
Copy link
Member

@sipa sipa commented Jan 21, 2020

This is an implementation of the Schnorr/taproot consensus rules proposed by BIPs 340, 341, and 342.

It consists of:

This does not include any wallet support.

Related PRs and PRs that were extracted from this and submitted separately: #18002 #16902 #18388 #18401 #18422 #18675 #19228

Dependencies:

TODO:

  • Tests for pre-activation (verify that consensus behavior doesn't change until flag is enabled)
  • Extract small & fast BIP341/BIP342 test vectors with good coverage out of the programmatic & slow feature_taproot.py test

@sipa sipa force-pushed the taproot branch 4 times, most recently from c9b9958 to 7b11d12 Compare January 21, 2020 23:44
@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 21, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@sipa sipa force-pushed the taproot branch 2 times, most recently from 4ba9fcf to 1f499c5 Compare January 22, 2020 04:45
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the BIPs have now been assigned numbers, this changes...:

  • BIP-Schnorr to BIP-340
  • BIP-Taproot to BIP-341
  • BIP-Tapscript to BIP-342

@sipa
Copy link
Member Author

sipa commented Jan 22, 2020

@MaxHillebrand A few overall comments:

  • I don't think all references to (bip-)taproot/tapscript/schnorr should be changed to the BIP numbers; in some cases maybe we should just drop the "bip-" prefix (e.g. I think talking about a "taproot spend" is more clear than "bip341 spend").
  • All changes in the src/secp256k1 directory should go to Add schnorrsig module which implements BIP-340 compliant signatures bitcoin-core/secp256k1#558 instead (the src/secp256k1 is a git subtree imported from there).
  • The "BIPSchnorr" and "BIPSchnorrDerive" tagged hash tags are part of the spec, which I don't think should be changed.

@ghost
Copy link

ghost commented Jan 22, 2020

Thanks @sipa, I agree with your comments.
I have deleted my suggestions to change the tagged hashes, the others are still open. Please ACK/NACK and commit what you think is correct.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to Taproot, this unifies the upper case Tapscript consistently in the comments.

Copy link
Member

@luke-jr luke-jr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without having fully read the latest BIPs, some code review.

Did not look at tests (final 2 commits).

PrecomputedTransactionData::PrecomputedTransactionData(const T& txTo)
{
Init(txTo);
Init(txTo, {});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should probably require spent_outputs in the constructor. That gets us a compile-time error rather than a runtime surprise, if any code is missing an update.

Aside from tests (which have a mere 4 uses), it's only used in libbitcoinconsensus, where the omission is arguably a bug (impossible to use existing APIs for Taproot verification).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would very much like that, but I think we inevitably need a few states in it at least.

The PrecomputedTransactionData objects used for block validation are constructed once up front, and then initialized simultaneously with other checks in CheckInputScripts. This means they need some sort of "uninitialized" state which can detect incorrect use before initialization. A cleaner solution would be to move the construction of these objects out of CheckInputScripts and do it up front, so they can be constructed with all available information once and for all. I think that makes sense, but it's a bigger change to validation logic that I'd like to avoid in this PR.

The other reason is indeed libconsensus, which has no API for taproot validation. There is also no problem, as it doesn't expose a taproot validation flag (yet). A future extension to it would add a new call that lets you pass in the spent UTXOs, and let you enable taproot validation. But as of right now, changing things to require the UTXOs whenever initializing the PrecomputedTransactionData would mean that libconsensus would need to construct mock UTXOs and pass those in... which I think is worse than having an explicit uninitialized state that'd trigger an error on misuse.

if (m_spent_outputs_ready && m_spent_outputs[inpos].scriptPubKey.size() == 2 + WITNESS_V1_TAPROOT_SIZE &&
m_spent_outputs[inpos].scriptPubKey[0] == OP_1) {
// Treat every native witness v1 spend as a Taproot spend. This only works if spent_outputs was
// provided as well, but if it wasn't, actual validation will fail anyway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Review TODO: Check that this is actually true, even when no signature checking occurs.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no signature checking occurs none of the code in interpreter.cpp matters.

fi

ac_configure_args="${ac_configure_args} --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery"
ac_configure_args="${ac_configure_args} --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --enable-module-schnorrsig --enable-experimental"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be non-experimental before merging?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so: It probably shouldn't be marked non-experimental until after it deployed for activation in Bitcoin because it wouldn't be good to encourage third party users of it while it is still easy to make incompatible changes in Bitcoin (e.g. as was just done. :) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, maybe more likely: the API in libsecp256k1 may change still significantly in the near future (e.g. support for variable-length messages, batch validation, ...) even after we treat the scheme itself final.

Comment on lines +425 to +426
// Key path spending in Taproot has no script, so this is unreachable.
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer assert(false); for unreachable conditions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case will cause the assert(false) below to be triggered. Is that sufficient?

case OP_CHECKMULTISIG:
case OP_CHECKMULTISIGVERIFY:
{
if (sigversion == SigVersion::TAPSCRIPT) return set_error(serror, SCRIPT_ERR_DISABLED_OPCODE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All existing disabled opcodes trigger SCRIPT_ERR_DISABLED_OPCODE unconditionally, even if their branch isn't taken. Do we want to preserve that for CHECKMULTISIG? If not, do we still want to use the same error code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, I didn't realize there was a discrepancy here. As BIP342 explicitly states that CMS/CMSV behave as OP_RETURN, namely failing the script if executed, I'm not going to change semantics, but I've changed it to have a separate dedicated error code.

@sipa
Copy link
Member Author

sipa commented Aug 28, 2020

Rebased, updated to latest bitcoin-core/secp256k1#558 (including the even-R BIP change), and addressed a number of @luke-jr's comments above.

I made one additional somewhat significant change on top, the signature/pubkey is now passed into validation routines using a Span (which means one less vector copy per script). A change like this would also make sense for ECDSA, but isn't done here yet.

So far I've been eagerly rewriting commits to address comments without good overview of the code history, but as the code stabilizes, I think it makes sense to stop doing so. My idea is that at some point (perhaps as soon as the libsecp256k1 changes are merged), I'll close this PR, and open 2 different new ones - one with just fixups, and one with the final squashed state (like #7910 and #8149 for segwit). Thoughts?

@jnewbery
Copy link
Contributor

My idea is that at some point (perhaps as soon as the libsecp256k1 changes are merged), I'll close this PR, and open 2 different new ones - one with just fixups, and one with the final squashed state (like #7910 and #8149 for segwit). Thoughts?

Sounds good to me!

[aside: I often get unicorns when loading this PR on mobile, so starting afresh would be useful if only to make interacting with the PR more manageable]

Copy link
Member

@instagibbs instagibbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on proposed PR strategy.

Did light review of the rebase.

Copy link
Contributor

@fjahr fjahr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions from reviewing a6ca508.

I don't see an issue with the proposed strategy if it doesn't create too much more work for you maintaining both. As the code stabilizes reviewing changes with range diffs might be bearable as well and I am not sure what I would use the history for. But I also wasn't reviewing Segwit so I can't speak from experience.

src/pubkey.cpp Outdated
return secp256k1_schnorrsig_verify(secp256k1_context_verify, sigbytes.data(), msg.begin(), &pubkey);
}

bool XOnlyPubKey::CheckPayToContract(const XOnlyPubKey& base, const uint256& hash, bool negated) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool XOnlyPubKey::CheckPayToContract(const XOnlyPubKey& base, const uint256& hash, bool negated) const
bool XOnlyPubKey::CheckPayToContract(const XOnlyPubKey& base, const uint256& hash, const bool negated) const

Copy link

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I've reviewed most of the code and it's very good.
The biggest mistake IMO is using BIP9 instead of BIP8. I wouldn't be surprised at all if there's opposition to Taproot due to its privacy nature.

I've written down a detailed review for the curious.

static bool VerifyWitnessProgram(const CScriptWitness& witness, int witversion, const std::vector<unsigned char>& program, unsigned int flags, const BaseSignatureChecker& checker, ScriptError* serror)
static bool VerifyTaprootCommitment(const std::vector<unsigned char>& control, const std::vector<unsigned char>& program, const CScript& script, uint256* tapleaf_hash)
{
int path_len = (control.size() - TAPROOT_CONTROL_BASE_SIZE) / TAPROOT_CONTROL_NODE_SIZE;

This comment was marked as resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it out so that invalid control size can be its own error code. It's sanity checked before this function is invoked.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still have to review new hash algorithm/opcodes evaluation. So far issues sound to be test holes, see #17977 (comment)

// BIP341 Taproot: 32-byte non-P2SH witness v1 program (which encodes a P2C-tweaked pubkey)
if (!(flags & SCRIPT_VERIFY_TAPROOT)) return set_success(serror);
if (stack.size() == 0) return set_error(serror, SCRIPT_ERR_WITNESS_PROGRAM_WITNESS_EMPTY);
if (stack.size() >= 2 && !stack.back().empty() && stack.back()[0] == ANNEX_TAG) {
Copy link

@ariard ariard Sep 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we outlaw witness-with-annex from our policy ?

EDIT: This is part of it as 7548827. Maybe add a reference that's is outlawed in IsWitnessStandard. A side-question, why OP_SUCESSx and unknown public key types aren't rejected there too to gather all witness sanitization check in one place ?

Copy link
Member Author

@sipa sipa Sep 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was some earlier discussion about this (github's webinterface is starting to suffer from the obesity of the discussion here, so I can't find the link).

I'm not sure what the best strategy is here:

  • Keeping things solely in the interpreter avoids duplicate work/complexity of detecting what type of spending is used, but means extra script validation flags.
  • Putting things in IsWitnessStandard may be more efficient, as it runs before any script validation is executed (and can be bypassed on testnet, which may or may not be desirable).

I'm currrently following a "things that don't need significant duplication of work go in IsWitnessStandard, the rest is done using script validation flags" rule, but I admit it's fairly arbitrary.

Going to add a comment here to note that annexes are nonstandard through other code.

Copy link

@ariard ariard Sep 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally lean towards a clean separation between policy and consensus. You need to duplicate detection anyway both at relay and validation, what we may avoid I hope is having twice the same parsing logic. Parsing can be unique and interpretation according to consensus or policy ?

Anyway, it sounds costly in refactoring, I think that's a wider discussion to have so opened #19875 (pinning previous PR discussion on this matter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ariard Yes, I agree conceptually with better separation between consensus and policy implementation of script (I myself suggested splitting CScript in a minimal consensus implementation, and a more featureful separate one for everything else - policy and more), but I don't think that's relevant to the discussion here.

Even if we have a separate script interpreter for consensus and one for policy, we'll still want a separate IsStandard* function, which can do policy checks before any script interpreter (which may be expensive) is run.

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #17977 (comment) and #17977 (comment) which are overall worthy of attention.

(done for now and ACK on merging strategy)

return set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL);
valtype& vch = stacktop(-1);
if (sigversion == SigVersion::WITNESS_V0 && (flags & SCRIPT_VERIFY_MINIMALIF)) {
if (sigversion == SigVersion::TAPSCRIPT || (sigversion == SigVersion::WITNESS_V0 && (flags & SCRIPT_VERIFY_MINIMALIF))) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think melting a consensus check with a policy one in the same conditional is likely source to confusion for future reviewers. I'm quite sure that's already something done elsewhere but maybe we could do better by adding a GetScriptVersionFlags(SigVersion) called at the top of EvalScript.

This new function should set each consensus-compliant interpreter flags per-type of script evaluated thus avoiding setting flags for old scripts types if we would do this in GetBlockScriptFlags while easily overloading flag setting for future types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of changing script validation flags based on context. It might be ok here, but in general that gets complex very quickly - things could even start interacting.

I've just added a comment to explain this. What do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on how you define context, does it include sigversion parameter ? BIP342 introduction presents well the script rules hierarchy and that would be better if we could encode it in clean code paths, a "flat" flag matrix is obviously harder to reason on.

I think added comment is good enough for now, I guess this conversation belong to the previous, wider one on refactoring/duplicating interpreter before post-taproot script softforks.

nit: SCRIPT_VERIFY_MINIMALIF you forgot the IF in new comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: SCRIPT_VERIFY_MINIMALIF you forgot the IF in new comment.

Fixed.


// Data about the output(s)
if (output_type == SIGHASH_SINGLE) {
if (in_pos >= tx_to.vout.size()) return false;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get feature_taproot.py failure for this :

diff --git a/src/script/interpreter.cpp b/src/script/interpreter.cpp
index 053df47a2..eb3847250 100644
--- a/src/script/interpreter.cpp
+++ b/src/script/interpreter.cpp
@@ -1528,7 +1528,7 @@ bool SignatureHashSchnorr(uint256& hash_out, const ScriptExecutionData& execdata

     // Data about the output(s)
     if (output_type == SIGHASH_SINGLE) {
-        if (in_pos >= tx_to.vout.size()) return false;
+        if (in_pos >= tx_to.vout.size()) return true;
         CHashWriter sha_single_output(SER_GETHASH, 0);
         sha_single_output << tx_to.vout[in_pos];
         ss << sha_single_output.GetSHA256();

Also I think this is check can be moved above at hash_type validation as every predicate is already known and we would save on hashing by failing faster ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, this one was harder to address. I've added support in the test framework for specifying that a particular spending test requires a mismatching output (so the number of outputs <= input position), shuffling inputs around to accomplish that.

@gmaxwell
Copy link
Contributor

gmaxwell commented Sep 4, 2020

The biggest mistake IMO is using BIP9 instead of BIP8.

This PR doesn't even have activation at all: it is set to always active in regtest and not at all otherwise. The only connection to bip9 is that the code that handles BIP9 is the codebases' generic code for handling consensus rules which are active in some places and not others.

The way this works (and the way it has generally worked for prior consensus changes)-- is that the functionality is first merged disabled (except for in testing modes) and then later the activation gets done. It's difficult to test things completely without establishing them in the codebase, and people can't really decide to activate a consensus change that isn't done.

So please don't bring activation discussion to this PR, it's offtopic unless there is some specific technical property of the implementation or specification that would have some particular ramification for activation: e.g. if it was implemented in a way that prevented the new rules from being activated at a particular block but not active before then that would be worth pointing out.

@sipa
Copy link
Member Author

sipa commented Sep 4, 2020

Rebased, updated libsecp256k1 to latest bitcoin-core/secp256k1#558, and addressed a number of comments by @fjahr, @Kixunil, and @ariard. Thanks!

@Kixunil
Copy link

Kixunil commented Sep 7, 2020

@gmaxwell thanks for clarifying, will check activation discussion at whatever is the appropriate place.

@sipa
Copy link
Member Author

sipa commented Sep 11, 2020

Rebased on top of #19944.

sipa and others added 13 commits September 13, 2020 21:05
A BIP-341 signature message may commit to the scriptPubKeys and amounts
of all spent outputs (including other ones than the input being signed
for spends), so keep them available to signature hashing code.
Includes changes to PrecomputedTransactionData by Pieter Wuille.
The old name is confusing, as it doesn't store a scriptPubKey, but the
actually executed script.
This includes key path spending and script path spending, but not the
Tapscript execution implementation.

Includes constants for various aspects of the consensus rules suggested
by Jeremy Rubin.
Includes sighashing code and many tests by Johnson Lau.
Includes a test by Matthew Zipkin.
Includes several tests and improvements by Greg Sanders.
@sipa
Copy link
Member Author

sipa commented Sep 14, 2020

Closing in favor of new PR(s), as discussed here.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.