Source File
field.go
Belonging Package
github.com/decred/dcrd/dcrec/secp256k1/v4
// Copyright (c) 2013-2014 The btcsuite developers// Copyright (c) 2015-2024 The Decred developers// Copyright (c) 2013-2024 Dave Collins// Use of this source code is governed by an ISC// license that can be found in the LICENSE file.package secp256k1// References:// [HAC]: Handbook of Applied Cryptography Menezes, van Oorschot, Vanstone.// http://cacr.uwaterloo.ca/hac/// All elliptic curve operations for secp256k1 are done in a finite field// characterized by a 256-bit prime. Given this precision is larger than the// biggest available native type, obviously some form of bignum math is needed.// This package implements specialized fixed-precision field arithmetic rather// than relying on an arbitrary-precision arithmetic package such as math/big// for dealing with the field math since the size is known. As a result, rather// large performance gains are achieved by taking advantage of many// optimizations not available to arbitrary-precision arithmetic and generic// modular arithmetic algorithms.//// There are various ways to internally represent each finite field element.// For example, the most obvious representation would be to use an array of 4// uint64s (64 bits * 4 = 256 bits). However, that representation suffers from// a couple of issues. First, there is no native Go type large enough to handle// the intermediate results while adding or multiplying two 64-bit numbers, and// second there is no space left for overflows when performing the intermediate// arithmetic between each array element which would lead to expensive carry// propagation.//// Given the above, this implementation represents the field elements as// 10 uint32s with each word (array entry) treated as base 2^26. This was// chosen for the following reasons:// 1) Most systems at the current time are 64-bit (or at least have 64-bit// registers available for specialized purposes such as MMX) so the// intermediate results can typically be done using a native register (and// using uint64s to avoid the need for additional half-word arithmetic)// 2) In order to allow addition of the internal words without having to// propagate the carry, the max normalized value for each register must// be less than the number of bits available in the register// 3) Since we're dealing with 32-bit values, 64-bits of overflow is a// reasonable choice for #2// 4) Given the need for 256-bits of precision and the properties stated in #1,// #2, and #3, the representation which best accommodates this is 10 uint32s// with base 2^26 (26 bits * 10 = 260 bits, so the final word only needs 22// bits) which leaves the desired 64 bits (32 * 10 = 320, 320 - 256 = 64) for// overflow//// Since it is so important that the field arithmetic is extremely fast for high// performance crypto, this type does not perform any validation where it// ordinarily would. See the documentation for FieldVal for more details.import ()// Constants used to make the code more readable.const (twoBitsMask = 0x3fourBitsMask = 0xfsixBitsMask = 0x3feightBitsMask = 0xff)// Constants related to the field representation.const (// fieldWords is the number of words used to internally represent the// 256-bit value.fieldWords = 10// fieldBase is the exponent used to form the numeric base of each word.// 2^(fieldBase*i) where i is the word position.fieldBase = 26// fieldBaseMask is the mask for the bits in each word needed to// represent the numeric base of each word (except the most significant// word).fieldBaseMask = (1 << fieldBase) - 1// fieldMSBBits is the number of bits in the most significant word used// to represent the value.fieldMSBBits = 256 - (fieldBase * (fieldWords - 1))// fieldMSBMask is the mask for the bits in the most significant word// needed to represent the value.fieldMSBMask = (1 << fieldMSBBits) - 1// These fields provide convenient access to each of the words of the// secp256k1 prime in the internal field representation to improve code// readability.fieldPrimeWordZero = 0x03fffc2ffieldPrimeWordOne = 0x03ffffbffieldPrimeWordTwo = 0x03fffffffieldPrimeWordThree = 0x03fffffffieldPrimeWordFour = 0x03fffffffieldPrimeWordFive = 0x03fffffffieldPrimeWordSix = 0x03fffffffieldPrimeWordSeven = 0x03fffffffieldPrimeWordEight = 0x03fffffffieldPrimeWordNine = 0x003fffff)// FieldVal implements optimized fixed-precision arithmetic over the// secp256k1 finite field. This means all arithmetic is performed modulo//// 0xfffffffffffffffffffffffffffffffffffffffffffffffffffffffefffffc2f.//// WARNING: Since it is so important for the field arithmetic to be extremely// fast for high performance crypto, this type does not perform any validation// of documented preconditions where it ordinarily would. As a result, it is// IMPERATIVE for callers to understand some key concepts that are described// below and ensure the methods are called with the necessary preconditions that// each method is documented with. For example, some methods only give the// correct result if the field value is normalized and others require the field// values involved to have a maximum magnitude and THERE ARE NO EXPLICIT CHECKS// TO ENSURE THOSE PRECONDITIONS ARE SATISFIED. This does, unfortunately, make// the type more difficult to use correctly and while I typically prefer to// ensure all state and input is valid for most code, this is a bit of an// exception because those extra checks really add up in what ends up being// critical hot paths.//// The first key concept when working with this type is normalization. In order// to avoid the need to propagate a ton of carries, the internal representation// provides additional overflow bits for each word of the overall 256-bit value.// This means that there are multiple internal representations for the same// value and, as a result, any methods that rely on comparison of the value,// such as equality and oddness determination, require the caller to provide a// normalized value.//// The second key concept when working with this type is magnitude. As// previously mentioned, the internal representation provides additional// overflow bits which means that the more math operations that are performed on// the field value between normalizations, the more those overflow bits// accumulate. The magnitude is effectively that maximum possible number of// those overflow bits that could possibly be required as a result of a given// operation. Since there are only a limited number of overflow bits available,// this implies that the max possible magnitude MUST be tracked by the caller// and the caller MUST normalize the field value if a given operation would// cause the magnitude of the result to exceed the max allowed value.//// IMPORTANT: The max allowed magnitude of a field value is 32.type FieldVal struct {// Each 256-bit value is represented as 10 32-bit integers in base 2^26.// This provides 6 bits of overflow in each word (10 bits in the most// significant word) for a total of 64 bits of overflow (9*6 + 10 = 64). It// only implements the arithmetic needed for elliptic curve operations.//// The following depicts the internal representation:// -----------------------------------------------------------------// | n[9] | n[8] | ... | n[0] |// | 32 bits available | 32 bits available | ... | 32 bits available |// | 22 bits for value | 26 bits for value | ... | 26 bits for value |// | 10 bits overflow | 6 bits overflow | ... | 6 bits overflow |// | Mult: 2^(26*9) | Mult: 2^(26*8) | ... | Mult: 2^(26*0) |// -----------------------------------------------------------------//// For example, consider the number 2^49 + 1. It would be represented as:// n[0] = 1// n[1] = 2^23// n[2..9] = 0//// The full 256-bit value is then calculated by looping i from 9..0 and// doing sum(n[i] * 2^(26i)) like so:// n[9] * 2^(26*9) = 0 * 2^234 = 0// n[8] * 2^(26*8) = 0 * 2^208 = 0// ...// n[1] * 2^(26*1) = 2^23 * 2^26 = 2^49// n[0] * 2^(26*0) = 1 * 2^0 = 1// Sum: 0 + 0 + ... + 2^49 + 1 = 2^49 + 1n [10]uint32}// String returns the field value as a normalized human-readable hex string.//// Preconditions: None// Output Normalized: Field is not modified -- same as input value// Output Max Magnitude: Field is not modified -- same as input valuefunc ( FieldVal) () string {// f is a copy, so it's safe to normalize it without mutating the original..Normalize()return hex.EncodeToString(.Bytes()[:])}// Zero sets the field value to zero in constant time. A newly created field// value is already set to zero. This function can be useful to clear an// existing field value for reuse.//// Preconditions: None// Output Normalized: Yes// Output Max Magnitude: 1func ( *FieldVal) () {.n[0] = 0.n[1] = 0.n[2] = 0.n[3] = 0.n[4] = 0.n[5] = 0.n[6] = 0.n[7] = 0.n[8] = 0.n[9] = 0}// Set sets the field value equal to the passed value in constant time. The// normalization and magnitude of the two fields will be identical.//// The field value is returned to support chaining. This enables syntax like:// f := new(FieldVal).Set(f2).Add(1) so that f = f2 + 1 where f2 is not// modified.//// Preconditions: None// Output Normalized: Same as input value// Output Max Magnitude: Same as input valuefunc ( *FieldVal) ( *FieldVal) *FieldVal {* = *return}// SetInt sets the field value to the passed integer in constant time. This is// a convenience function since it is fairly common to perform some arithmetic// with small native integers.//// The field value is returned to support chaining. This enables syntax such// as f := new(FieldVal).SetInt(2).Mul(f2) so that f = 2 * f2.//// Preconditions: None// Output Normalized: Yes// Output Max Magnitude: 1func ( *FieldVal) ( uint16) *FieldVal {.Zero().n[0] = uint32()return}// SetBytes packs the passed 32-byte big-endian value into the internal field// value representation in constant time. SetBytes interprets the provided// array as a 256-bit big-endian unsigned integer, packs it into the internal// field value representation, and returns either 1 if it is greater than or// equal to the field prime (aka it overflowed) or 0 otherwise in constant time.//// Note that a bool is not used here because it is not possible in Go to convert// from a bool to numeric value in constant time and many constant-time// operations require a numeric value.//// Preconditions: None// Output Normalized: Yes if no overflow, no otherwise// Output Max Magnitude: 1func ( *FieldVal) ( *[32]byte) uint32 {// Pack the 256 total bits across the 10 uint32 words with a max of// 26-bits per word. This could be done with a couple of for loops,// but this unrolled version is significantly faster. Benchmarks show// this is about 34 times faster than the variant which uses loops..n[0] = uint32([31]) | uint32([30])<<8 | uint32([29])<<16 |(uint32([28])&twoBitsMask)<<24.n[1] = uint32([28])>>2 | uint32([27])<<6 | uint32([26])<<14 |(uint32([25])&fourBitsMask)<<22.n[2] = uint32([25])>>4 | uint32([24])<<4 | uint32([23])<<12 |(uint32([22])&sixBitsMask)<<20.n[3] = uint32([22])>>6 | uint32([21])<<2 | uint32([20])<<10 |uint32([19])<<18.n[4] = uint32([18]) | uint32([17])<<8 | uint32([16])<<16 |(uint32([15])&twoBitsMask)<<24.n[5] = uint32([15])>>2 | uint32([14])<<6 | uint32([13])<<14 |(uint32([12])&fourBitsMask)<<22.n[6] = uint32([12])>>4 | uint32([11])<<4 | uint32([10])<<12 |(uint32([9])&sixBitsMask)<<20.n[7] = uint32([9])>>6 | uint32([8])<<2 | uint32([7])<<10 |uint32([6])<<18.n[8] = uint32([5]) | uint32([4])<<8 | uint32([3])<<16 |(uint32([2])&twoBitsMask)<<24.n[9] = uint32([2])>>2 | uint32([1])<<6 | uint32([0])<<14// The intuition here is that the field value is greater than the prime if// one of the higher individual words is greater than corresponding word of// the prime and all higher words in the field value are equal to their// corresponding word of the prime. Since this type is modulo the prime,// being equal is also an overflow back to 0.//// Note that because the input is 32 bytes and it was just packed into the// field representation, the only words that can possibly be greater are// zero and one, because ceil(log_2(2^256 - 1 - P)) = 33 bits max and the// internal field representation encodes 26 bits with each word.//// Thus, there is no need to test if the upper words of the field value// exceeds them, hence, only equality is checked for them.:= constantTimeEq(.n[9], fieldPrimeWordNine)&= constantTimeEq(.n[8], fieldPrimeWordEight)&= constantTimeEq(.n[7], fieldPrimeWordSeven)&= constantTimeEq(.n[6], fieldPrimeWordSix)&= constantTimeEq(.n[5], fieldPrimeWordFive)&= constantTimeEq(.n[4], fieldPrimeWordFour)&= constantTimeEq(.n[3], fieldPrimeWordThree)&= constantTimeEq(.n[2], fieldPrimeWordTwo):= & constantTimeGreater(.n[1], fieldPrimeWordOne)&= constantTimeEq(.n[1], fieldPrimeWordOne)|= & constantTimeGreaterOrEq(.n[0], fieldPrimeWordZero)return}// SetByteSlice interprets the provided slice as a 256-bit big-endian unsigned// integer (meaning it is truncated to the first 32 bytes), packs it into the// internal field value representation, and returns whether or not the resulting// truncated 256-bit integer is greater than or equal to the field prime (aka it// overflowed) in constant time.//// Note that since passing a slice with more than 32 bytes is truncated, it is// possible that the truncated value is less than the field prime and hence it// will not be reported as having overflowed in that case. It is up to the// caller to decide whether it needs to provide numbers of the appropriate size// or it if is acceptable to use this function with the described truncation and// overflow behavior.//// Preconditions: None// Output Normalized: Yes if no overflow, no otherwise// Output Max Magnitude: 1func ( *FieldVal) ( []byte) bool {var [32]byte= [:constantTimeMin(uint32(len()), 32)]copy([:], [:32-len()])copy([32-len():], ):= .SetBytes(&)zeroArray32(&)return != 0}// Normalize normalizes the internal field words into the desired range and// performs fast modular reduction over the secp256k1 prime by making use of the// special form of the prime in constant time.//// Preconditions: None// Output Normalized: Yes// Output Max Magnitude: 1func ( *FieldVal) () *FieldVal {// The field representation leaves 6 bits of overflow in each word so// intermediate calculations can be performed without needing to// propagate the carry to each higher word during the calculations. In// order to normalize, we need to "compact" the full 256-bit value to// the right while propagating any carries through to the high order// word.//// Since this field is doing arithmetic modulo the secp256k1 prime, we// also need to perform modular reduction over the prime.//// Per [HAC] section 14.3.4: Reduction method of moduli of special form,// when the modulus is of the special form m = b^t - c, highly efficient// reduction can be achieved.//// The secp256k1 prime is equivalent to 2^256 - 4294968273, so it fits// this criteria.//// 4294968273 in field representation (base 2^26) is:// n[0] = 977// n[1] = 64// That is to say (2^26 * 64) + 977 = 4294968273//// The algorithm presented in the referenced section typically repeats// until the quotient is zero. However, due to our field representation// we already know to within one reduction how many times we would need// to repeat as it's the uppermost bits of the high order word. Thus we// can simply multiply the magnitude by the field representation of the// prime and do a single iteration. After this step there might be an// additional carry to bit 256 (bit 22 of the high order word).:= .n[9]:= >> fieldMSBBits&= fieldMSBMask:= .n[0] + *977:= ( >> fieldBase) + .n[1] + ( << 6)&= fieldBaseMask:= ( >> fieldBase) + .n[2]&= fieldBaseMask:= ( >> fieldBase) + .n[3]&= fieldBaseMask:= ( >> fieldBase) + .n[4]&= fieldBaseMask:= ( >> fieldBase) + .n[5]&= fieldBaseMask:= ( >> fieldBase) + .n[6]&= fieldBaseMask:= ( >> fieldBase) + .n[7]&= fieldBaseMask:= ( >> fieldBase) + .n[8]&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask// At this point, the magnitude is guaranteed to be one, however, the// value could still be greater than the prime if there was either a// carry through to bit 256 (bit 22 of the higher order word) or the// value is greater than or equal to the field characteristic. The// following determines if either or these conditions are true and does// the final reduction in constant time.//// Also note that 'm' will be zero when neither of the aforementioned// conditions are true and the value will not be changed when 'm' is zero.= constantTimeEq(, fieldMSBMask)&= constantTimeEq(&&&&&&, fieldBaseMask)&= constantTimeGreater(+64+((+977)>>fieldBase), fieldBaseMask)|= >> fieldMSBBits+= * 977= ( >> fieldBase) + + ( << 6)&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask= ( >> fieldBase) +&= fieldBaseMask&= fieldMSBMask // Remove potential multiple of 2^256.// Finally, set the normalized and reduced words..n[0] =.n[1] =.n[2] =.n[3] =.n[4] =.n[5] =.n[6] =.n[7] =.n[8] =.n[9] =return}// PutBytesUnchecked unpacks the field value to a 32-byte big-endian value// directly into the passed byte slice in constant time. The target slice must// have at least 32 bytes available or it will panic.//// There is a similar function, PutBytes, which unpacks the field value into a// 32-byte array directly. This version is provided since it can be useful// to write directly into part of a larger buffer without needing a separate// allocation.//// Preconditions:// - The field value MUST be normalized// - The target slice MUST have at least 32 bytes availablefunc ( *FieldVal) ( []byte) {// Unpack the 256 total bits from the 10 uint32 words with a max of// 26-bits per word. This could be done with a couple of for loops,// but this unrolled version is a bit faster. Benchmarks show this is// about 10 times faster than the variant which uses loops.[31] = byte(.n[0] & eightBitsMask)[30] = byte((.n[0] >> 8) & eightBitsMask)[29] = byte((.n[0] >> 16) & eightBitsMask)[28] = byte((.n[0]>>24)&twoBitsMask | (.n[1]&sixBitsMask)<<2)[27] = byte((.n[1] >> 6) & eightBitsMask)[26] = byte((.n[1] >> 14) & eightBitsMask)[25] = byte((.n[1]>>22)&fourBitsMask | (.n[2]&fourBitsMask)<<4)[24] = byte((.n[2] >> 4) & eightBitsMask)[23] = byte((.n[2] >> 12) & eightBitsMask)[22] = byte((.n[2]>>20)&sixBitsMask | (.n[3]&twoBitsMask)<<6)[21] = byte((.n[3] >> 2) & eightBitsMask)[20] = byte((.n[3] >> 10) & eightBitsMask)[19] = byte((.n[3] >> 18) & eightBitsMask)[18] = byte(.n[4] & eightBitsMask)[17] = byte((.n[4] >> 8) & eightBitsMask)[16] = byte((.n[4] >> 16) & eightBitsMask)[15] = byte((.n[4]>>24)&twoBitsMask | (.n[5]&sixBitsMask)<<2)[14] = byte((.n[5] >> 6) & eightBitsMask)[13] = byte((.n[5] >> 14) & eightBitsMask)[12] = byte((.n[5]>>22)&fourBitsMask | (.n[6]&fourBitsMask)<<4)[11] = byte((.n[6] >> 4) & eightBitsMask)[10] = byte((.n[6] >> 12) & eightBitsMask)[9] = byte((.n[6]>>20)&sixBitsMask | (.n[7]&twoBitsMask)<<6)[8] = byte((.n[7] >> 2) & eightBitsMask)[7] = byte((.n[7] >> 10) & eightBitsMask)[6] = byte((.n[7] >> 18) & eightBitsMask)[5] = byte(.n[8] & eightBitsMask)[4] = byte((.n[8] >> 8) & eightBitsMask)[3] = byte((.n[8] >> 16) & eightBitsMask)[2] = byte((.n[8]>>24)&twoBitsMask | (.n[9]&sixBitsMask)<<2)[1] = byte((.n[9] >> 6) & eightBitsMask)[0] = byte((.n[9] >> 14) & eightBitsMask)}// PutBytes unpacks the field value to a 32-byte big-endian value using the// passed byte array in constant time.//// There is a similar function, PutBytesUnchecked, which unpacks the field value// into a slice that must have at least 32 bytes available. This version is// provided since it can be useful to write directly into an array that is type// checked.//// Alternatively, there is also Bytes, which unpacks the field value into a new// array and returns that which can sometimes be more ergonomic in applications// that aren't concerned about an additional copy.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) ( *[32]byte) {.PutBytesUnchecked([:])}// Bytes unpacks the field value to a 32-byte big-endian value in constant time.//// See PutBytes and PutBytesUnchecked for variants that allow an array or slice// to be passed which can be useful to cut down on the number of allocations by// allowing the caller to reuse a buffer or write directly into part of a larger// buffer.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () *[32]byte {:= new([32]byte).PutBytesUnchecked([:])return}// IsZeroBit returns 1 when the field value is equal to zero or 0 otherwise in// constant time.//// Note that a bool is not used here because it is not possible in Go to convert// from a bool to numeric value in constant time and many constant-time// operations require a numeric value. See IsZero for the version that returns// a bool.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () uint32 {// The value can only be zero if no bits are set in any of the words.// This is a constant time implementation.:= .n[0] | .n[1] | .n[2] | .n[3] | .n[4] |.n[5] | .n[6] | .n[7] | .n[8] | .n[9]return constantTimeEq(, 0)}// IsZero returns whether or not the field value is equal to zero in constant// time.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () bool {// The value can only be zero if no bits are set in any of the words.// This is a constant time implementation.:= .n[0] | .n[1] | .n[2] | .n[3] | .n[4] |.n[5] | .n[6] | .n[7] | .n[8] | .n[9]return == 0}// IsOneBit returns 1 when the field value is equal to one or 0 otherwise in// constant time.//// Note that a bool is not used here because it is not possible in Go to convert// from a bool to numeric value in constant time and many constant-time// operations require a numeric value. See IsOne for the version that returns a// bool.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () uint32 {// The value can only be one if the single lowest significant bit is set in// the first word and no other bits are set in any of the other words.// This is a constant time implementation.:= (.n[0] ^ 1) | .n[1] | .n[2] | .n[3] | .n[4] | .n[5] |.n[6] | .n[7] | .n[8] | .n[9]return constantTimeEq(, 0)}// IsOne returns whether or not the field value is equal to one in constant// time.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () bool {// The value can only be one if the single lowest significant bit is set in// the first word and no other bits are set in any of the other words.// This is a constant time implementation.:= (.n[0] ^ 1) | .n[1] | .n[2] | .n[3] | .n[4] | .n[5] |.n[6] | .n[7] | .n[8] | .n[9]return == 0}// IsOddBit returns 1 when the field value is an odd number or 0 otherwise in// constant time.//// Note that a bool is not used here because it is not possible in Go to convert// from a bool to numeric value in constant time and many constant-time// operations require a numeric value. See IsOdd for the version that returns a// bool.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () uint32 {// Only odd numbers have the bottom bit set.return .n[0] & 1}// IsOdd returns whether or not the field value is an odd number in constant// time.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () bool {// Only odd numbers have the bottom bit set.return .n[0]&1 == 1}// Equals returns whether or not the two field values are the same in constant// time.//// Preconditions:// - Both field values being compared MUST be normalizedfunc ( *FieldVal) ( *FieldVal) bool {// Xor only sets bits when they are different, so the two field values// can only be the same if no bits are set after xoring each word.// This is a constant time implementation.:= (.n[0] ^ .n[0]) | (.n[1] ^ .n[1]) | (.n[2] ^ .n[2]) |(.n[3] ^ .n[3]) | (.n[4] ^ .n[4]) | (.n[5] ^ .n[5]) |(.n[6] ^ .n[6]) | (.n[7] ^ .n[7]) | (.n[8] ^ .n[8]) |(.n[9] ^ .n[9])return == 0}// NegateVal negates the passed value and stores the result in f in constant// time. The caller must provide the maximum magnitude of the passed value for// a correct result.//// The field value is returned to support chaining. This enables syntax like:// f.NegateVal(f2).AddInt(1) so that f = -f2 + 1.//// Preconditions:// - The max magnitude MUST be 31// Output Normalized: No// Output Max Magnitude: Input magnitude + 1func ( *FieldVal) ( *FieldVal, uint32) *FieldVal {// Negation in the field is just the prime minus the value. However,// in order to allow negation against a field value without having to// normalize/reduce it first, multiply by the magnitude (that is how// "far" away it is from the normalized value) to adjust. Also, since// negating a value pushes it one more order of magnitude away from the// normalized range, add 1 to compensate.//// For some intuition here, imagine you're performing mod 12 arithmetic// (picture a clock) and you are negating the number 7. So you start at// 12 (which is of course 0 under mod 12) and count backwards (left on// the clock) 7 times to arrive at 5. Notice this is just 12-7 = 5.// Now, assume you're starting with 19, which is a number that is// already larger than the modulus and congruent to 7 (mod 12). When a// value is already in the desired range, its magnitude is 1. Since 19// is an additional "step", its magnitude (mod 12) is 2. Since any// multiple of the modulus is congruent to zero (mod m), the answer can// be shortcut by simply multiplying the magnitude by the modulus and// subtracting. Keeping with the example, this would be (2*12)-19 = 5..n[0] = (+1)*fieldPrimeWordZero - .n[0].n[1] = (+1)*fieldPrimeWordOne - .n[1].n[2] = (+1)*fieldBaseMask - .n[2].n[3] = (+1)*fieldBaseMask - .n[3].n[4] = (+1)*fieldBaseMask - .n[4].n[5] = (+1)*fieldBaseMask - .n[5].n[6] = (+1)*fieldBaseMask - .n[6].n[7] = (+1)*fieldBaseMask - .n[7].n[8] = (+1)*fieldBaseMask - .n[8].n[9] = (+1)*fieldMSBMask - .n[9]return}// Negate negates the field value in constant time. The existing field value is// modified. The caller must provide the maximum magnitude of the field value// for a correct result.//// The field value is returned to support chaining. This enables syntax like:// f.Negate().AddInt(1) so that f = -f + 1.//// Preconditions:// - The max magnitude MUST be 31// Output Normalized: No// Output Max Magnitude: Input magnitude + 1func ( *FieldVal) ( uint32) *FieldVal {return .NegateVal(, )}// AddInt adds the passed integer to the existing field value and stores the// result in f in constant time. This is a convenience function since it is// fairly common to perform some arithmetic with small native integers.//// The field value is returned to support chaining. This enables syntax like:// f.AddInt(1).Add(f2) so that f = f + 1 + f2.//// Preconditions:// - The field value MUST have a max magnitude of 31// - The integer MUST be a max of 32767// Output Normalized: No// Output Max Magnitude: Existing field magnitude + 1func ( *FieldVal) ( uint16) *FieldVal {// Since the field representation intentionally provides overflow bits,// it's ok to use carryless addition as the carry bit is safely part of// the word and will be normalized out..n[0] += uint32()return}// Add adds the passed value to the existing field value and stores the result// in f in constant time.//// The field value is returned to support chaining. This enables syntax like:// f.Add(f2).AddInt(1) so that f = f + f2 + 1.//// Preconditions:// - The sum of the magnitudes of the two field values MUST be a max of 32// Output Normalized: No// Output Max Magnitude: Sum of the magnitude of the two individual field valuesfunc ( *FieldVal) ( *FieldVal) *FieldVal {// Since the field representation intentionally provides overflow bits,// it's ok to use carryless addition as the carry bit is safely part of// each word and will be normalized out. This could obviously be done// in a loop, but the unrolled version is faster..n[0] += .n[0].n[1] += .n[1].n[2] += .n[2].n[3] += .n[3].n[4] += .n[4].n[5] += .n[5].n[6] += .n[6].n[7] += .n[7].n[8] += .n[8].n[9] += .n[9]return}// Add2 adds the passed two field values together and stores the result in f in// constant time.//// The field value is returned to support chaining. This enables syntax like:// f3.Add2(f, f2).AddInt(1) so that f3 = f + f2 + 1.//// Preconditions:// - The sum of the magnitudes of the two field values MUST be a max of 32// Output Normalized: No// Output Max Magnitude: Sum of the magnitude of the two field valuesfunc ( *FieldVal) ( *FieldVal, *FieldVal) *FieldVal {// Since the field representation intentionally provides overflow bits,// it's ok to use carryless addition as the carry bit is safely part of// each word and will be normalized out. This could obviously be done// in a loop, but the unrolled version is faster..n[0] = .n[0] + .n[0].n[1] = .n[1] + .n[1].n[2] = .n[2] + .n[2].n[3] = .n[3] + .n[3].n[4] = .n[4] + .n[4].n[5] = .n[5] + .n[5].n[6] = .n[6] + .n[6].n[7] = .n[7] + .n[7].n[8] = .n[8] + .n[8].n[9] = .n[9] + .n[9]return}// MulInt multiplies the field value by the passed int and stores the result in// f in constant time. Note that this function can overflow if multiplying the// value by any of the individual words exceeds a max uint32. Therefore it is// important that the caller ensures no overflows will occur before using this// function.//// The field value is returned to support chaining. This enables syntax like:// f.MulInt(2).Add(f2) so that f = 2 * f + f2.//// Preconditions:// - The field value magnitude multiplied by given val MUST be a max of 32// Output Normalized: No// Output Max Magnitude: Existing field magnitude times the provided integer valfunc ( *FieldVal) ( uint8) *FieldVal {// Since each word of the field representation can hold up to// 32 - fieldBase extra bits which will be normalized out, it's safe// to multiply each word without using a larger type or carry// propagation so long as the values won't overflow a uint32. This// could obviously be done in a loop, but the unrolled version is// faster.:= uint32().n[0] *=.n[1] *=.n[2] *=.n[3] *=.n[4] *=.n[5] *=.n[6] *=.n[7] *=.n[8] *=.n[9] *=return}// Mul multiplies the passed value to the existing field value and stores the// result in f in constant time. Note that this function can overflow if// multiplying any of the individual words exceeds a max uint32. In practice,// this means the magnitude of either value involved in the multiplication must// be a max of 8.//// The field value is returned to support chaining. This enables syntax like:// f.Mul(f2).AddInt(1) so that f = (f * f2) + 1.//// Preconditions:// - Both field values MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) ( *FieldVal) *FieldVal {return .Mul2(, )}// Mul2 multiplies the passed two field values together and stores the result in// f in constant time. Note that this function can overflow if multiplying any// of the individual words exceeds a max uint32. In practice, this means the// magnitude of either value involved in the multiplication must be a max of 8.//// The field value is returned to support chaining. This enables syntax like:// f3.Mul2(f, f2).AddInt(1) so that f3 = (f * f2) + 1.//// Preconditions:// - Both input field values MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) ( *FieldVal, *FieldVal) *FieldVal {// This could be done with a couple of for loops and an array to store// the intermediate terms, but this unrolled version is significantly// faster.// Terms for 2^(fieldBase*0).:= uint64(.n[0]) * uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*1).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[1]) +uint64(.n[1])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*2).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[2]) +uint64(.n[1])*uint64(.n[1]) +uint64(.n[2])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*3).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[3]) +uint64(.n[1])*uint64(.n[2]) +uint64(.n[2])*uint64(.n[1]) +uint64(.n[3])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*4).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[4]) +uint64(.n[1])*uint64(.n[3]) +uint64(.n[2])*uint64(.n[2]) +uint64(.n[3])*uint64(.n[1]) +uint64(.n[4])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*5).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[5]) +uint64(.n[1])*uint64(.n[4]) +uint64(.n[2])*uint64(.n[3]) +uint64(.n[3])*uint64(.n[2]) +uint64(.n[4])*uint64(.n[1]) +uint64(.n[5])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*6).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[6]) +uint64(.n[1])*uint64(.n[5]) +uint64(.n[2])*uint64(.n[4]) +uint64(.n[3])*uint64(.n[3]) +uint64(.n[4])*uint64(.n[2]) +uint64(.n[5])*uint64(.n[1]) +uint64(.n[6])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*7).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[7]) +uint64(.n[1])*uint64(.n[6]) +uint64(.n[2])*uint64(.n[5]) +uint64(.n[3])*uint64(.n[4]) +uint64(.n[4])*uint64(.n[3]) +uint64(.n[5])*uint64(.n[2]) +uint64(.n[6])*uint64(.n[1]) +uint64(.n[7])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*8).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[8]) +uint64(.n[1])*uint64(.n[7]) +uint64(.n[2])*uint64(.n[6]) +uint64(.n[3])*uint64(.n[5]) +uint64(.n[4])*uint64(.n[4]) +uint64(.n[5])*uint64(.n[3]) +uint64(.n[6])*uint64(.n[2]) +uint64(.n[7])*uint64(.n[1]) +uint64(.n[8])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*9).= ( >> fieldBase) +uint64(.n[0])*uint64(.n[9]) +uint64(.n[1])*uint64(.n[8]) +uint64(.n[2])*uint64(.n[7]) +uint64(.n[3])*uint64(.n[6]) +uint64(.n[4])*uint64(.n[5]) +uint64(.n[5])*uint64(.n[4]) +uint64(.n[6])*uint64(.n[3]) +uint64(.n[7])*uint64(.n[2]) +uint64(.n[8])*uint64(.n[1]) +uint64(.n[9])*uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*10).= ( >> fieldBase) +uint64(.n[1])*uint64(.n[9]) +uint64(.n[2])*uint64(.n[8]) +uint64(.n[3])*uint64(.n[7]) +uint64(.n[4])*uint64(.n[6]) +uint64(.n[5])*uint64(.n[5]) +uint64(.n[6])*uint64(.n[4]) +uint64(.n[7])*uint64(.n[3]) +uint64(.n[8])*uint64(.n[2]) +uint64(.n[9])*uint64(.n[1]):= & fieldBaseMask// Terms for 2^(fieldBase*11).= ( >> fieldBase) +uint64(.n[2])*uint64(.n[9]) +uint64(.n[3])*uint64(.n[8]) +uint64(.n[4])*uint64(.n[7]) +uint64(.n[5])*uint64(.n[6]) +uint64(.n[6])*uint64(.n[5]) +uint64(.n[7])*uint64(.n[4]) +uint64(.n[8])*uint64(.n[3]) +uint64(.n[9])*uint64(.n[2]):= & fieldBaseMask// Terms for 2^(fieldBase*12).= ( >> fieldBase) +uint64(.n[3])*uint64(.n[9]) +uint64(.n[4])*uint64(.n[8]) +uint64(.n[5])*uint64(.n[7]) +uint64(.n[6])*uint64(.n[6]) +uint64(.n[7])*uint64(.n[5]) +uint64(.n[8])*uint64(.n[4]) +uint64(.n[9])*uint64(.n[3]):= & fieldBaseMask// Terms for 2^(fieldBase*13).= ( >> fieldBase) +uint64(.n[4])*uint64(.n[9]) +uint64(.n[5])*uint64(.n[8]) +uint64(.n[6])*uint64(.n[7]) +uint64(.n[7])*uint64(.n[6]) +uint64(.n[8])*uint64(.n[5]) +uint64(.n[9])*uint64(.n[4]):= & fieldBaseMask// Terms for 2^(fieldBase*14).= ( >> fieldBase) +uint64(.n[5])*uint64(.n[9]) +uint64(.n[6])*uint64(.n[8]) +uint64(.n[7])*uint64(.n[7]) +uint64(.n[8])*uint64(.n[6]) +uint64(.n[9])*uint64(.n[5]):= & fieldBaseMask// Terms for 2^(fieldBase*15).= ( >> fieldBase) +uint64(.n[6])*uint64(.n[9]) +uint64(.n[7])*uint64(.n[8]) +uint64(.n[8])*uint64(.n[7]) +uint64(.n[9])*uint64(.n[6]):= & fieldBaseMask// Terms for 2^(fieldBase*16).= ( >> fieldBase) +uint64(.n[7])*uint64(.n[9]) +uint64(.n[8])*uint64(.n[8]) +uint64(.n[9])*uint64(.n[7]):= & fieldBaseMask// Terms for 2^(fieldBase*17).= ( >> fieldBase) +uint64(.n[8])*uint64(.n[9]) +uint64(.n[9])*uint64(.n[8]):= & fieldBaseMask// Terms for 2^(fieldBase*18).= ( >> fieldBase) + uint64(.n[9])*uint64(.n[9]):= & fieldBaseMask// What's left is for 2^(fieldBase*19).:= >> fieldBase// At this point, all of the terms are grouped into their respective// base.//// Per [HAC] section 14.3.4: Reduction method of moduli of special form,// when the modulus is of the special form m = b^t - c, highly efficient// reduction can be achieved per the provided algorithm.//// The secp256k1 prime is equivalent to 2^256 - 4294968273, so it fits// this criteria.//// 4294968273 in field representation (base 2^26) is:// n[0] = 977// n[1] = 64// That is to say (2^26 * 64) + 977 = 4294968273//// Since each word is in base 26, the upper terms (t10 and up) start// at 260 bits (versus the final desired range of 256 bits), so the// field representation of 'c' from above needs to be adjusted for the// extra 4 bits by multiplying it by 2^4 = 16. 4294968273 * 16 =// 68719492368. Thus, the adjusted field representation of 'c' is:// n[0] = 977 * 16 = 15632// n[1] = 64 * 16 = 1024// That is to say (2^26 * 1024) + 15632 = 68719492368//// To reduce the final term, t19, the entire 'c' value is needed instead// of only n[0] because there are no more terms left to handle n[1].// This means there might be some magnitude left in the upper bits that// is handled below.= + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *68719492368= & fieldMSBMask>>= fieldMSBBits// At this point, if the magnitude is greater than 0, the overall value// is greater than the max possible 256-bit value. In particular, it is// "how many times larger" than the max value it is.//// The algorithm presented in [HAC] section 14.3.4 repeats until the// quotient is zero. However, due to the above, we already know at// least how many times we would need to repeat as it's the value// currently in m. Thus we can simply multiply the magnitude by the// field representation of the prime and do a single iteration. Notice// that nothing will be changed when the magnitude is zero, so we could// skip this in that case, however always running regardless allows it// to run in constant time. The final result will be in the range// 0 <= result <= prime + (2^64 - c), so it is guaranteed to have a// magnitude of 1, but it is denormalized.:= + *977.n[0] = uint32( & fieldBaseMask)= ( >> fieldBase) + + *64.n[1] = uint32( & fieldBaseMask).n[2] = uint32(( >> fieldBase) + ).n[3] = uint32().n[4] = uint32().n[5] = uint32().n[6] = uint32().n[7] = uint32().n[8] = uint32().n[9] = uint32()return}// SquareRootVal either calculates the square root of the passed value when it// exists or the square root of the negation of the value when it does not exist// and stores the result in f in constant time. The return flag is true when// the calculated square root is for the passed value itself and false when it// is for its negation.//// Note that this function can overflow if multiplying any of the individual// words exceeds a max uint32. In practice, this means the magnitude of the// field must be a max of 8 to prevent overflow. The magnitude of the result// will be 1.//// Preconditions:// - The input field value MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) ( *FieldVal) bool {// This uses the Tonelli-Shanks method for calculating the square root of// the value when it exists. The key principles of the method follow.//// Fermat's little theorem states that for a nonzero number 'a' and prime// 'p', a^(p-1) ≡ 1 (mod p).//// Further, Euler's criterion states that an integer 'a' has a square root// (aka is a quadratic residue) modulo a prime if a^((p-1)/2) ≡ 1 (mod p)// and, conversely, when it does NOT have a square root (aka 'a' is a// non-residue) a^((p-1)/2) ≡ -1 (mod p).//// This can be seen by considering that Fermat's little theorem can be// written as (a^((p-1)/2) - 1)(a^((p-1)/2) + 1) ≡ 0 (mod p). Therefore,// one of the two factors must be 0. Then, when a ≡ x^2 (aka 'a' is a// quadratic residue), (x^2)^((p-1)/2) ≡ x^(p-1) ≡ 1 (mod p) which implies// the first factor must be zero. Finally, per Lagrange's theorem, the// non-residues are the only remaining possible solutions and thus must make// the second factor zero to satisfy Fermat's little theorem implying that// a^((p-1)/2) ≡ -1 (mod p) for that case.//// The Tonelli-Shanks method uses these facts along with factoring out// powers of two to solve a congruence that results in either the solution// when the square root exists or the square root of the negation of the// value when it does not. In the case of primes that are ≡ 3 (mod 4), the// possible solutions are r = ±a^((p+1)/4) (mod p). Therefore, either r^2 ≡// a (mod p) is true in which case ±r are the two solutions, or r^2 ≡ -a// (mod p) in which case 'a' is a non-residue and there are no solutions.//// The secp256k1 prime is ≡ 3 (mod 4), so this result applies.//// In other words, calculate a^((p+1)/4) and then square it and check it// against the original value to determine if it is actually the square// root.//// In order to efficiently compute a^((p+1)/4), (p+1)/4 needs to be split// into a sequence of squares and multiplications that minimizes the number// of multiplications needed (since they are more costly than squarings).//// The secp256k1 prime + 1 / 4 is 2^254 - 2^30 - 244. In binary, that is://// 00111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 10111111 11111111 11111111 00001100//// Notice that can be broken up into three windows of consecutive 1s (in// order of least to most significant) as://// 6-bit window with two bits set (bits 4, 5, 6, 7 unset)// 23-bit window with 22 bits set (bit 30 unset)// 223-bit window with all 223 bits set//// Thus, the groups of 1 bits in each window forms the set:// S = {2, 22, 223}.//// The strategy is to calculate a^(2^n - 1) for each grouping via an// addition chain with a sliding window.//// The addition chain used is (credits to Peter Dettman):// (0,0),(1,0),(2,2),(3,2),(4,1),(5,5),(6,6),(7,7),(8,8),(9,7),(10,2)// => 2^1 2^[2] 2^3 2^6 2^9 2^11 2^[22] 2^44 2^88 2^176 2^220 2^[223]//// This has a cost of 254 field squarings and 13 field multiplications.var , , , , , , , , , , , FieldVal.Set().SquareVal(&).Mul(&) // a2 = a^(2^2 - 1).SquareVal(&).Mul(&) // a3 = a^(2^3 - 1).SquareVal(&).Square().Square() // a6 = a^(2^6 - 2^3).Mul(&) // a6 = a^(2^6 - 1).SquareVal(&).Square().Square() // a9 = a^(2^9 - 2^3).Mul(&) // a9 = a^(2^9 - 1).SquareVal(&).Square() // a11 = a^(2^11 - 2^2).Mul(&) // a11 = a^(2^11 - 1).SquareVal(&).Square().Square().Square().Square() // a22 = a^(2^16 - 2^5).Square().Square().Square().Square().Square() // a22 = a^(2^21 - 2^10).Square() // a22 = a^(2^22 - 2^11).Mul(&) // a22 = a^(2^22 - 1).SquareVal(&).Square().Square().Square().Square() // a44 = a^(2^27 - 2^5).Square().Square().Square().Square().Square() // a44 = a^(2^32 - 2^10).Square().Square().Square().Square().Square() // a44 = a^(2^37 - 2^15).Square().Square().Square().Square().Square() // a44 = a^(2^42 - 2^20).Square().Square() // a44 = a^(2^44 - 2^22).Mul(&) // a44 = a^(2^44 - 1).SquareVal(&).Square().Square().Square().Square() // a88 = a^(2^49 - 2^5).Square().Square().Square().Square().Square() // a88 = a^(2^54 - 2^10).Square().Square().Square().Square().Square() // a88 = a^(2^59 - 2^15).Square().Square().Square().Square().Square() // a88 = a^(2^64 - 2^20).Square().Square().Square().Square().Square() // a88 = a^(2^69 - 2^25).Square().Square().Square().Square().Square() // a88 = a^(2^74 - 2^30).Square().Square().Square().Square().Square() // a88 = a^(2^79 - 2^35).Square().Square().Square().Square().Square() // a88 = a^(2^84 - 2^40).Square().Square().Square().Square() // a88 = a^(2^88 - 2^44).Mul(&) // a88 = a^(2^88 - 1).SquareVal(&).Square().Square().Square().Square() // a176 = a^(2^93 - 2^5).Square().Square().Square().Square().Square() // a176 = a^(2^98 - 2^10).Square().Square().Square().Square().Square() // a176 = a^(2^103 - 2^15).Square().Square().Square().Square().Square() // a176 = a^(2^108 - 2^20).Square().Square().Square().Square().Square() // a176 = a^(2^113 - 2^25).Square().Square().Square().Square().Square() // a176 = a^(2^118 - 2^30).Square().Square().Square().Square().Square() // a176 = a^(2^123 - 2^35).Square().Square().Square().Square().Square() // a176 = a^(2^128 - 2^40).Square().Square().Square().Square().Square() // a176 = a^(2^133 - 2^45).Square().Square().Square().Square().Square() // a176 = a^(2^138 - 2^50).Square().Square().Square().Square().Square() // a176 = a^(2^143 - 2^55).Square().Square().Square().Square().Square() // a176 = a^(2^148 - 2^60).Square().Square().Square().Square().Square() // a176 = a^(2^153 - 2^65).Square().Square().Square().Square().Square() // a176 = a^(2^158 - 2^70).Square().Square().Square().Square().Square() // a176 = a^(2^163 - 2^75).Square().Square().Square().Square().Square() // a176 = a^(2^168 - 2^80).Square().Square().Square().Square().Square() // a176 = a^(2^173 - 2^85).Square().Square().Square() // a176 = a^(2^176 - 2^88).Mul(&) // a176 = a^(2^176 - 1).SquareVal(&).Square().Square().Square().Square() // a220 = a^(2^181 - 2^5).Square().Square().Square().Square().Square() // a220 = a^(2^186 - 2^10).Square().Square().Square().Square().Square() // a220 = a^(2^191 - 2^15).Square().Square().Square().Square().Square() // a220 = a^(2^196 - 2^20).Square().Square().Square().Square().Square() // a220 = a^(2^201 - 2^25).Square().Square().Square().Square().Square() // a220 = a^(2^206 - 2^30).Square().Square().Square().Square().Square() // a220 = a^(2^211 - 2^35).Square().Square().Square().Square().Square() // a220 = a^(2^216 - 2^40).Square().Square().Square().Square() // a220 = a^(2^220 - 2^44).Mul(&) // a220 = a^(2^220 - 1).SquareVal(&).Square().Square() // a223 = a^(2^223 - 2^3).Mul(&) // a223 = a^(2^223 - 1).SquareVal(&).Square().Square().Square().Square() // f = a^(2^228 - 2^5).Square().Square().Square().Square().Square() // f = a^(2^233 - 2^10).Square().Square().Square().Square().Square() // f = a^(2^238 - 2^15).Square().Square().Square().Square().Square() // f = a^(2^243 - 2^20).Square().Square().Square() // f = a^(2^246 - 2^23).Mul(&) // f = a^(2^246 - 2^22 - 1).Square().Square().Square().Square().Square() // f = a^(2^251 - 2^27 - 2^5).Square() // f = a^(2^252 - 2^28 - 2^6).Mul(&) // f = a^(2^252 - 2^28 - 2^6 - 2^1 - 1).Square().Square() // f = a^(2^254 - 2^30 - 2^8 - 2^3 - 2^2)// // = a^(2^254 - 2^30 - 244)// // = a^((p+1)/4)// Ensure the calculated result is actually the square root by squaring it// and checking against the original value.var FieldValreturn .SquareVal().Normalize().Equals(.Normalize())}// Square squares the field value in constant time. The existing field value is// modified. Note that this function can overflow if multiplying any of the// individual words exceeds a max uint32. In practice, this means the magnitude// of the field must be a max of 8 to prevent overflow.//// The field value is returned to support chaining. This enables syntax like:// f.Square().Mul(f2) so that f = f^2 * f2.//// Preconditions:// - The field value MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) () *FieldVal {return .SquareVal()}// SquareVal squares the passed value and stores the result in f in constant// time. Note that this function can overflow if multiplying any of the// individual words exceeds a max uint32. In practice, this means the magnitude// of the field being squared must be a max of 8 to prevent overflow.//// The field value is returned to support chaining. This enables syntax like:// f3.SquareVal(f).Mul(f) so that f3 = f^2 * f = f^3.//// Preconditions:// - The input field value MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) ( *FieldVal) *FieldVal {// This could be done with a couple of for loops and an array to store// the intermediate terms, but this unrolled version is significantly// faster.// Terms for 2^(fieldBase*0).:= uint64(.n[0]) * uint64(.n[0]):= & fieldBaseMask// Terms for 2^(fieldBase*1).= ( >> fieldBase) + 2*uint64(.n[0])*uint64(.n[1]):= & fieldBaseMask// Terms for 2^(fieldBase*2).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[2]) +uint64(.n[1])*uint64(.n[1]):= & fieldBaseMask// Terms for 2^(fieldBase*3).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[3]) +2*uint64(.n[1])*uint64(.n[2]):= & fieldBaseMask// Terms for 2^(fieldBase*4).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[4]) +2*uint64(.n[1])*uint64(.n[3]) +uint64(.n[2])*uint64(.n[2]):= & fieldBaseMask// Terms for 2^(fieldBase*5).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[5]) +2*uint64(.n[1])*uint64(.n[4]) +2*uint64(.n[2])*uint64(.n[3]):= & fieldBaseMask// Terms for 2^(fieldBase*6).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[6]) +2*uint64(.n[1])*uint64(.n[5]) +2*uint64(.n[2])*uint64(.n[4]) +uint64(.n[3])*uint64(.n[3]):= & fieldBaseMask// Terms for 2^(fieldBase*7).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[7]) +2*uint64(.n[1])*uint64(.n[6]) +2*uint64(.n[2])*uint64(.n[5]) +2*uint64(.n[3])*uint64(.n[4]):= & fieldBaseMask// Terms for 2^(fieldBase*8).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[8]) +2*uint64(.n[1])*uint64(.n[7]) +2*uint64(.n[2])*uint64(.n[6]) +2*uint64(.n[3])*uint64(.n[5]) +uint64(.n[4])*uint64(.n[4]):= & fieldBaseMask// Terms for 2^(fieldBase*9).= ( >> fieldBase) +2*uint64(.n[0])*uint64(.n[9]) +2*uint64(.n[1])*uint64(.n[8]) +2*uint64(.n[2])*uint64(.n[7]) +2*uint64(.n[3])*uint64(.n[6]) +2*uint64(.n[4])*uint64(.n[5]):= & fieldBaseMask// Terms for 2^(fieldBase*10).= ( >> fieldBase) +2*uint64(.n[1])*uint64(.n[9]) +2*uint64(.n[2])*uint64(.n[8]) +2*uint64(.n[3])*uint64(.n[7]) +2*uint64(.n[4])*uint64(.n[6]) +uint64(.n[5])*uint64(.n[5]):= & fieldBaseMask// Terms for 2^(fieldBase*11).= ( >> fieldBase) +2*uint64(.n[2])*uint64(.n[9]) +2*uint64(.n[3])*uint64(.n[8]) +2*uint64(.n[4])*uint64(.n[7]) +2*uint64(.n[5])*uint64(.n[6]):= & fieldBaseMask// Terms for 2^(fieldBase*12).= ( >> fieldBase) +2*uint64(.n[3])*uint64(.n[9]) +2*uint64(.n[4])*uint64(.n[8]) +2*uint64(.n[5])*uint64(.n[7]) +uint64(.n[6])*uint64(.n[6]):= & fieldBaseMask// Terms for 2^(fieldBase*13).= ( >> fieldBase) +2*uint64(.n[4])*uint64(.n[9]) +2*uint64(.n[5])*uint64(.n[8]) +2*uint64(.n[6])*uint64(.n[7]):= & fieldBaseMask// Terms for 2^(fieldBase*14).= ( >> fieldBase) +2*uint64(.n[5])*uint64(.n[9]) +2*uint64(.n[6])*uint64(.n[8]) +uint64(.n[7])*uint64(.n[7]):= & fieldBaseMask// Terms for 2^(fieldBase*15).= ( >> fieldBase) +2*uint64(.n[6])*uint64(.n[9]) +2*uint64(.n[7])*uint64(.n[8]):= & fieldBaseMask// Terms for 2^(fieldBase*16).= ( >> fieldBase) +2*uint64(.n[7])*uint64(.n[9]) +uint64(.n[8])*uint64(.n[8]):= & fieldBaseMask// Terms for 2^(fieldBase*17).= ( >> fieldBase) + 2*uint64(.n[8])*uint64(.n[9]):= & fieldBaseMask// Terms for 2^(fieldBase*18).= ( >> fieldBase) + uint64(.n[9])*uint64(.n[9]):= & fieldBaseMask// What's left is for 2^(fieldBase*19).:= >> fieldBase// At this point, all of the terms are grouped into their respective// base.//// Per [HAC] section 14.3.4: Reduction method of moduli of special form,// when the modulus is of the special form m = b^t - c, highly efficient// reduction can be achieved per the provided algorithm.//// The secp256k1 prime is equivalent to 2^256 - 4294968273, so it fits// this criteria.//// 4294968273 in field representation (base 2^26) is:// n[0] = 977// n[1] = 64// That is to say (2^26 * 64) + 977 = 4294968273//// Since each word is in base 26, the upper terms (t10 and up) start// at 260 bits (versus the final desired range of 256 bits), so the// field representation of 'c' from above needs to be adjusted for the// extra 4 bits by multiplying it by 2^4 = 16. 4294968273 * 16 =// 68719492368. Thus, the adjusted field representation of 'c' is:// n[0] = 977 * 16 = 15632// n[1] = 64 * 16 = 1024// That is to say (2^26 * 1024) + 15632 = 68719492368//// To reduce the final term, t19, the entire 'c' value is needed instead// of only n[0] because there are no more terms left to handle n[1].// This means there might be some magnitude left in the upper bits that// is handled below.= + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *15632= & fieldBaseMask= ( >> fieldBase) + + *1024 + *68719492368= & fieldMSBMask>>= fieldMSBBits// At this point, if the magnitude is greater than 0, the overall value// is greater than the max possible 256-bit value. In particular, it is// "how many times larger" than the max value it is.//// The algorithm presented in [HAC] section 14.3.4 repeats until the// quotient is zero. However, due to the above, we already know at// least how many times we would need to repeat as it's the value// currently in m. Thus we can simply multiply the magnitude by the// field representation of the prime and do a single iteration. Notice// that nothing will be changed when the magnitude is zero, so we could// skip this in that case, however always running regardless allows it// to run in constant time. The final result will be in the range// 0 <= result <= prime + (2^64 - c), so it is guaranteed to have a// magnitude of 1, but it is denormalized.:= + *977.n[0] = uint32( & fieldBaseMask)= ( >> fieldBase) + + *64.n[1] = uint32( & fieldBaseMask).n[2] = uint32(( >> fieldBase) + ).n[3] = uint32().n[4] = uint32().n[5] = uint32().n[6] = uint32().n[7] = uint32().n[8] = uint32().n[9] = uint32()return}// Inverse finds the modular multiplicative inverse of the field value in// constant time. The existing field value is modified.//// The field value is returned to support chaining. This enables syntax like:// f.Inverse().Mul(f2) so that f = f^-1 * f2.//// Preconditions:// - The field value MUST have a max magnitude of 8// Output Normalized: No// Output Max Magnitude: 1func ( *FieldVal) () *FieldVal {// Fermat's little theorem states that for a nonzero number 'a' and prime// 'p', a^(p-1) ≡ 1 (mod p). Multiplying both sides of the equation by the// multiplicative inverse a^-1 yields a^(p-2) ≡ a^-1 (mod p). Thus, a^(p-2)// is the multiplicative inverse.//// In order to efficiently compute a^(p-2), p-2 needs to be split into a// sequence of squares and multiplications that minimizes the number of// multiplications needed (since they are more costly than squarings).// Intermediate results are saved and reused as well.//// The secp256k1 prime - 2 is 2^256 - 4294968275. In binary, that is://// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111111// 11111111 11111111 11111111 11111110// 11111111 11111111 11111100 00101101//// Notice that can be broken up into five windows of consecutive 1s (in// order of least to most significant) as://// 2-bit window with 1 bit set (bit 1 unset)// 3-bit window with 2 bits set (bit 4 unset)// 5-bit window with 1 bit set (bits 6, 7, 8, 9 unset)// 23-bit window with 22 bits set (bit 32 unset)// 223-bit window with all 223 bits set//// Thus, the groups of 1 bits in each window forms the set:// S = {1, 2, 22, 223}.//// The strategy is to calculate a^(2^n - 1) for each grouping via an// addition chain with a sliding window.//// The addition chain used is (credits to Peter Dettman):// (0,0),(1,0),(2,2),(3,2),(4,1),(5,5),(6,6),(7,7),(8,8),(9,7),(10,2)// => 2^[1] 2^[2] 2^3 2^6 2^9 2^11 2^[22] 2^44 2^88 2^176 2^220 2^[223]//// This has a cost of 255 field squarings and 15 field multiplications.var , , , , , , , , , , , FieldVal.Set().SquareVal(&).Mul(&) // a2 = a^(2^2 - 1).SquareVal(&).Mul(&) // a3 = a^(2^3 - 1).SquareVal(&).Square().Square() // a6 = a^(2^6 - 2^3).Mul(&) // a6 = a^(2^6 - 1).SquareVal(&).Square().Square() // a9 = a^(2^9 - 2^3).Mul(&) // a9 = a^(2^9 - 1).SquareVal(&).Square() // a11 = a^(2^11 - 2^2).Mul(&) // a11 = a^(2^11 - 1).SquareVal(&).Square().Square().Square().Square() // a22 = a^(2^16 - 2^5).Square().Square().Square().Square().Square() // a22 = a^(2^21 - 2^10).Square() // a22 = a^(2^22 - 2^11).Mul(&) // a22 = a^(2^22 - 1).SquareVal(&).Square().Square().Square().Square() // a44 = a^(2^27 - 2^5).Square().Square().Square().Square().Square() // a44 = a^(2^32 - 2^10).Square().Square().Square().Square().Square() // a44 = a^(2^37 - 2^15).Square().Square().Square().Square().Square() // a44 = a^(2^42 - 2^20).Square().Square() // a44 = a^(2^44 - 2^22).Mul(&) // a44 = a^(2^44 - 1).SquareVal(&).Square().Square().Square().Square() // a88 = a^(2^49 - 2^5).Square().Square().Square().Square().Square() // a88 = a^(2^54 - 2^10).Square().Square().Square().Square().Square() // a88 = a^(2^59 - 2^15).Square().Square().Square().Square().Square() // a88 = a^(2^64 - 2^20).Square().Square().Square().Square().Square() // a88 = a^(2^69 - 2^25).Square().Square().Square().Square().Square() // a88 = a^(2^74 - 2^30).Square().Square().Square().Square().Square() // a88 = a^(2^79 - 2^35).Square().Square().Square().Square().Square() // a88 = a^(2^84 - 2^40).Square().Square().Square().Square() // a88 = a^(2^88 - 2^44).Mul(&) // a88 = a^(2^88 - 1).SquareVal(&).Square().Square().Square().Square() // a176 = a^(2^93 - 2^5).Square().Square().Square().Square().Square() // a176 = a^(2^98 - 2^10).Square().Square().Square().Square().Square() // a176 = a^(2^103 - 2^15).Square().Square().Square().Square().Square() // a176 = a^(2^108 - 2^20).Square().Square().Square().Square().Square() // a176 = a^(2^113 - 2^25).Square().Square().Square().Square().Square() // a176 = a^(2^118 - 2^30).Square().Square().Square().Square().Square() // a176 = a^(2^123 - 2^35).Square().Square().Square().Square().Square() // a176 = a^(2^128 - 2^40).Square().Square().Square().Square().Square() // a176 = a^(2^133 - 2^45).Square().Square().Square().Square().Square() // a176 = a^(2^138 - 2^50).Square().Square().Square().Square().Square() // a176 = a^(2^143 - 2^55).Square().Square().Square().Square().Square() // a176 = a^(2^148 - 2^60).Square().Square().Square().Square().Square() // a176 = a^(2^153 - 2^65).Square().Square().Square().Square().Square() // a176 = a^(2^158 - 2^70).Square().Square().Square().Square().Square() // a176 = a^(2^163 - 2^75).Square().Square().Square().Square().Square() // a176 = a^(2^168 - 2^80).Square().Square().Square().Square().Square() // a176 = a^(2^173 - 2^85).Square().Square().Square() // a176 = a^(2^176 - 2^88).Mul(&) // a176 = a^(2^176 - 1).SquareVal(&).Square().Square().Square().Square() // a220 = a^(2^181 - 2^5).Square().Square().Square().Square().Square() // a220 = a^(2^186 - 2^10).Square().Square().Square().Square().Square() // a220 = a^(2^191 - 2^15).Square().Square().Square().Square().Square() // a220 = a^(2^196 - 2^20).Square().Square().Square().Square().Square() // a220 = a^(2^201 - 2^25).Square().Square().Square().Square().Square() // a220 = a^(2^206 - 2^30).Square().Square().Square().Square().Square() // a220 = a^(2^211 - 2^35).Square().Square().Square().Square().Square() // a220 = a^(2^216 - 2^40).Square().Square().Square().Square() // a220 = a^(2^220 - 2^44).Mul(&) // a220 = a^(2^220 - 1).SquareVal(&).Square().Square() // a223 = a^(2^223 - 2^3).Mul(&) // a223 = a^(2^223 - 1).SquareVal(&).Square().Square().Square().Square() // f = a^(2^228 - 2^5).Square().Square().Square().Square().Square() // f = a^(2^233 - 2^10).Square().Square().Square().Square().Square() // f = a^(2^238 - 2^15).Square().Square().Square().Square().Square() // f = a^(2^243 - 2^20).Square().Square().Square() // f = a^(2^246 - 2^23).Mul(&) // f = a^(2^246 - 4194305).Square().Square().Square().Square().Square() // f = a^(2^251 - 134217760).Mul(&) // f = a^(2^251 - 134217759).Square().Square().Square() // f = a^(2^254 - 1073742072).Mul(&) // f = a^(2^254 - 1073742069).Square().Square() // f = a^(2^256 - 4294968276)return .Mul(&) // f = a^(2^256 - 4294968275) = a^(p-2)}// IsGtOrEqPrimeMinusOrder returns whether or not the field value is greater// than or equal to the field prime minus the secp256k1 group order in constant// time.//// Preconditions:// - The field value MUST be normalizedfunc ( *FieldVal) () bool {// The secp256k1 prime is equivalent to 2^256 - 4294968273 and the group// order is 2^256 - 432420386565659656852420866394968145599. Thus,// the prime minus the group order is:// 432420386565659656852420866390673177326//// In hex that is:// 0x00000000 00000000 00000000 00000001 45512319 50b75fc4 402da172 2fc9baee//// Converting that to field representation (base 2^26) is://// n[0] = 0x03c9baee// n[1] = 0x03685c8b// n[2] = 0x01fc4402// n[3] = 0x006542dd// n[4] = 0x01455123//// This can be verified with the following test code:// pMinusN := new(big.Int).Sub(curveParams.P, curveParams.N)// var fv FieldVal// fv.SetByteSlice(pMinusN.Bytes())// t.Logf("%x", fv.n)//// Outputs: [3c9baee 3685c8b 1fc4402 6542dd 1455123 0 0 0 0 0]const (= 0x03c9baee= 0x03685c8b= 0x01fc4402= 0x006542dd= 0x01455123= 0x00000000= 0x00000000= 0x00000000= 0x00000000= 0x00000000)// The intuition here is that the value is greater than field prime minus// the group order if one of the higher individual words is greater than the// corresponding word and all higher words in the value are equal.:= constantTimeGreater(.n[9], ):= constantTimeEq(.n[9], )|= & constantTimeGreater(.n[8], )&= constantTimeEq(.n[8], )|= & constantTimeGreater(.n[7], )&= constantTimeEq(.n[7], )|= & constantTimeGreater(.n[6], )&= constantTimeEq(.n[6], )|= & constantTimeGreater(.n[5], )&= constantTimeEq(.n[5], )|= & constantTimeGreater(.n[4], )&= constantTimeEq(.n[4], )|= & constantTimeGreater(.n[3], )&= constantTimeEq(.n[3], )|= & constantTimeGreater(.n[2], )&= constantTimeEq(.n[2], )|= & constantTimeGreater(.n[1], )&= constantTimeEq(.n[1], )|= & constantTimeGreaterOrEq(.n[0], )return != 0}
![]() |
The pages are generated with Golds v0.8.2. (GOOS=linux GOARCH=amd64) Golds is a Go 101 project developed by Tapir Liu. PR and bug reports are welcome and can be submitted to the issue list. Please follow @zigo_101 (reachable from the left QR code) to get the latest news of Golds. |