[ad_1]

I’ve spent a good bit of time serious about this and may present a reasonably complete reply concerning the cutting-edge. However sadly the TL;DR is that we do not know the way to do that (but) in fewer than many months of full-time tedious computation, the place checking your work is the same period of time. Close to the underside of this submit there’s a daring paragraph with particular numbers, which have a *lot* of hand-waving in them however I believe characterize the proper order of magnitude.

“Not exposing your personal key to any {hardware}” is a reasonably widespread and comprehensible objective. And there are some operations that (it seems) are completely affordable to do by hand. Specifically, you are able to do Shamir Secret Sharing and compute sure checksums by hand. (See codex32 and BIP 93 for extra data on these. See seedxor for an easier method that doesn’t embody checksumming. For random quantity era there are a selection of “diceware” schemes for extracting information from cube rolls.)

Working with an 128-bit seed, in my expertise it takes 5-10 minutes to do quite simple operations (including seeds collectively character-by-character, or multiplying them by mounted Lagrange multipliers throughout secret sharing) and 30-60 minutes to do extra complicated ones (computing or verifying BCH checksums).

Nevertheless, what the above schemes all have in widespread is that they can break seed information down right into a sequence of characters, after which do computations on them character-by-character. Because of this the operations contain tediously repeating steps that are straightforward to explain and grasp, and straightforward to help by way of lookup tables and paper computer systems.

codex32/bip93, by having a checksum, lets you verify all of your by-hand work by filling in a “checksum worksheet” in your outcomes. You may even zero in on the precise location of your errors doing this. (To verify your work on the checksum worksheets themselves, it’s important to redo them.)

Okay. So in case you are merely attempting to retailer your seed and confirm its integrity, that is “straightforward” and you’ll observe the hyperlinks I offered to see tips on how to do it. However there are two tougher issues that you simply wish to do:

- Hash secret information, or in any other case “stretch” your base entropy to get a number of secret keys, a la BIP32.
- Given a secret key, derive a public key from it. (That is your particular query, so I apologize for giving such a common reply, however I believe this further context is useful.) The jargon for this operation is an “EC mult”, a time period that I will unpack a bit under.

**For step 1**, in 2014 Ken Shirrif famously began to compute a sha256 by hand and estimated that it might take him 1.5 days of steady work to do the total double-sha256 wanted for mining. It could be a comparable quantity of labor to do a BIP32 derivation. Luckily, if all you need are HD keys, I believe a chacha20-based method could possibly be devised which is would in all probability take round 4 hours. Although in case you observe that hyperlink, you may see there are many lacking particulars right here; that is simply an extrapolation from some less complicated computations, and since even 4 hours per secret’s fairly brutal, individuals wish to discover a higher scheme earlier than they put an excessive amount of time into enjoying with this.

In contrast to with secret sharing and associated schemes, there’s no checksum that might be preserved when doing sha2 or chacha20. So to verify your work on this, you may should redo your entire computation.

**For step 2**, secret to public key derivation issues get fascinating. As you might or might not know, this single operation is ample to derive addresses (previous to Taproot, you’d additionally have to compute at the very least one hash, however with Taproot you possibly can pretty-much simply use a key as an handle) (plus a checksum, nevertheless it’s a BCH code so you should use the bip93 course of to compute that by hand). It is *additionally* ample to compute signatures (the arduous a part of producing a Taproot/Schnorr signature is an EC multiplication; there may be additionally a area multiplication and area addition, however as we’ll see, if you are able to do an EC mult, you’ve got essentially found out how to do that).

So to reply your “how do I verify my work when doing an EC mult” query, the reply is: you compute a signature along with your derived public key. Signatures are public so you should use a pc to verify it. Additionally, you will wish to use a pc to compute the message hash[*]. In actual fact, Bitcoin Core does this when it derives addresses: it indicators a dummy message with each key and verifies it, to make it possible for nothing glitched in its key derivation logic.

Okay, so, what’s concerned in a secret key to public key derivation? Basically, what you do is interpret your secret key as an integer $n$, take a particular elliptic curve (EC) level known as *G*, and compute *n* × *G* which is outlined as *G* added to itself *n* instances. And “added to” means the elliptic curve group operation which is a ratio of polynomials. These polynomials are in a area of order 2^256 – 2^32 – 977, so each addition and multiplication entails manipulating 256-bit numbers modulo a big prime.

Maaaaybe you possibly can specific an EC multiplication in a approach that did not appear to be a collection of polynomials over this area. However in case you found such a factor I think you’d have solved the discrete log downside, during which case this whole dialogue is moot. So I will assume for the remainder of this reply that we’re caught within the “polynomials in a finite area” paradigm. I will additional assume that *division* in a finite area might be very sluggish, at the very least ten instances as sluggish as multiplication, during which case you wish to characterize your factors in “Jacobian coordinates”, which is a particular coordinate system during which your factors are represented by 3 numbers fairly than 2, and the place you possibly can all the time exchange divisions by (a few) multiplications. I haven’t got an excellent quotation for Jacobian coordinates; they aren’t the identical as “Jacobi coordinates” that Google will discover for you. So far as I am conscious, they’re used closely within the area of computing elliptic curve math, and pretty-much unused all over the place else.

This is identical paradigm as digital computer systems should work in, so we really know tips on how to do the minimal variety of area operations for every EC group operation. The reply is 6-16 area multiplications, relying on how “appropriate” the factors you are combining are (whether or not their third coordinates match (good), or are one (higher)).

**Placing this all collectively**, there are three primary methods for effectively doing EC mults:

- Decreasing the variety of group operations (EC additions) it’s important to do.
- Decreasing the variety of area operations (addition/multiplication modulo a primary) it’s important to do, per group operation.
- Doing the sector operations quicker.

**For step 1**, once more assuming no discrete-log-breaking ranges of novel math, the one approach to scale back the variety of group operations is to introduce lookup tables. When computing $G$ plus itself $n$ instances, clearly we’re not going to truly do $n$-many additions. That will be ~2^255 additions. As a substitute, we might characterize $n$ is base-16, say, in order that we’ve got an easier equation that appears like

```
n_0 + n_1×(2^4)G + n_2×(2^8)G + n_3×(2^12)G + ... + n_63×(2^252)G
```

For every time period there are solely 16 possibilties; these do not overlap so you may be including from a set of dimension 16×64 = 1024. So in case you had a lookup desk with 1024 entries, you possibly can do that in 64 multiplications. Somewhat than utilizing base 16, you may use base 1024; you then’d should do 26 additions from a lookup desk of dimension 26×1024 = 26624.

I used to be capable of match 18 per web page in my “desk of discrete logarithms”, which I put collectively in 2018 or so, then considering it was a joke; I wasted plenty of area, however you possibly can inform that it would be arduous to readably get greater than 64 on a web page, as a easy estimate. So as an example 64 per web page. Then 26624 entries is a 416-page tome. There are diminishing returns to attempting to go additional and I believe this line of research is just about full.

**For step 2**, decreasing the variety of area operations, we observe from the EFD web page I linked above that we are able to scale back the variety of operations we’ve got to do by selecting the factors that we’re including rigorously. Since even addition could have one level from the lookup desk, we are able to make it possible for all of the factors within the lookup desk have z=1; if we have to work in Fourier area or in Montgomery type, we are able to additionally do these conversions within the lookup desk fairly than making the person do them by hand[**]. Simply with z=1 we are able to scale back each EC operation to 11 multiplications.

Peter Dettman has a scheme known as “efficient affine” which could possibly be used to deal with different factors as having z=1 even when they do not, nevertheless it’s not clearly relevant to this.

**Step 3** has essentially the most room for exploration in my opinion, as a result of right here is the place the hand-computation story actually diverges from the electronic-computation story, and we are able to now not rely upon present analysis to get turnkey best-in-class algorithms.

Let’s begin working issues out intimately. We’ll assume that we’re working in base-32 and that our area parts encompass 52 bech32 “digits” on this base. The good thing about working on this base is that we are able to do digit multiplication and addition utilizing 1024-element volvelles that are a very error-resistant type of lookup desk. If we used a decrease base, we would be doing further work, and with a better base, the volvelles can be too massive. Although possibly there may be room to fudge this.

Addition modulo a 256-bit prime will contain 52 digit additions, then one other 52 to scale back. As a result of additions for computer systems are principally free, I haven’t got easy-at-hand estimates for the variety of additions we’ve got to do per level, however as an example 10 as a swag. So 1040 operations.

We then should do 11 area multiplications. Naively multiplying two 52-digit numbers entails 2704 digit operations. However with Karatsuba multiplication we are able to scale back this to round 1500 (if I am remembering proper). With Fourier remodel strategies we may scale back this to more-like 52 operations, however each few multiplications we would want to maneuver out of Fourier area, an operation which prices 10000s of digit operations. I haven’t got an excellent estimate of how typically it’s important to do that; it could possibly be anyplace from “it’s important to do it so incessantly that is ineffective” to “so occasionally that this can be a huge win”.

Sadly these algorithms are for *integer* multiplication, and we wish to do *modular* multiplication. Naively this might be horrible and contain Euclid’s algorithm and dozens of digit divisions and stuff. However non-naively we are able to use Montgomery multiplication to do that, which roughly provides the overhead of multiplying every digit by a hard and fast precomputed issue. The digits within the precomp desk we may pre-multiply so that is free. For the opposite digits, we’ll additionally assume that is free, as a result of multiplication by a hard and fast fixed may be carried out very cheaply utilizing lookup tables, and I have never but thought by way of the small print.

A number of individuals have explored combining Montgomery multiplication with Karatsuba-like strategies and even Fourier strategies. Sadly the Karatsuba stuff seems to not be definitely worth the overhead till you get to numbers a good bit larger than 256 bits, and the Fourier stuff not til you are *approach* larger. So the literature on that is few and much between and is written in bizarre jargon for bizarre particular purposes like digital sign processing chips, or Nineteen Fifties-era computer systems, or no matter. I can really feel in my bones that there is some actually massive consequence (“massive” for individuals attempting to do bitcoin wallets by hand) in right here however the limiting agent right here is deep research time. I’d encourage anybody studying this to get their fingers soiled and to open a dialogue on the codex32 github repo in the event that they’re making progress. There are different extra speculative concepts (e.g. utilizing residue quantity techniques after which representing entries within the lookup desk as their residue modulo numerous small primes …. if we may then do multiplication/division modulo small primes, then immediately digit division turns into simply as low cost as digit multiplication, which opens up new avenues) that I have never seen within the literature however might lead someplace helpful.

**Placing this all collectively, right here is the place we stand: with a 400-page lookup desk, assuming no overhead (protecting monitor of carries, copying information from line to line, and many others) and assuming volvelles for each operation, we are able to do 26 group operations, totalling 286 area multiplications and 260 area additions, which break down into ~786k digit operations (naively) or ~440k (assuming some Karatsuba-like stuff works). To illustrate we are able to do 5 digit operations per minute. Then even in any case this hand-waving we’re nonetheless 2622 hours of labor naively or 1475 hours. At a 40 hour/week that is 65 weeks naively or 36 weeks with the methods that we’re pretty assured about.**

For my part, if we may scale back this to at least one month (160 hours of labor) (plus one other 160 to confirm a signature, in case you’re doing this to derive addresses), I believe this might be an inexpensive factor to do for a sure type of super-paranoid Bitcoin person who solely transacted each a number of years. Maybe you possibly can function a chilly pockets this fashion, and train your kids and grandchildren to do it, and over the centuries it would not add as much as an excessive amount of hassle. I am additionally skeptical that we may do higher than this, on condition that it is already ~16x quicker than what we at the moment know tips on how to do. Although given the magnitude of the unknowns right here, I would not be shocked if we may.

Nevertheless, in my opinion, regardless of how loopy paranoid you’re, this isn’t practical with present strategies, even for a very-long-term chilly pockets.

[*] If you happen to’re tremendous critical about avoiding belief in digital computer systems, you might be upset about signing hashes that a pc makes for you. In any case, the pc may lie about what the hash is, thereby tricking you into signing a transaction you do not intend to. To take care of this, I’d advise asking a number of computer systems to make the identical hash, and possibly even do it on a TI-84 or a Gameboy or one thing which lengthy predates Bitcoin. You *may* compute the hashes by hand however that is many tens of hours of labor.

[**] Possibly you’re fearful that the computations carried out within the lookup desk could also be mistaken or compromised. Effectively, except you’re prepared to make your personal lookup desk tome, which is able to take you a lot many years doing monk-like work in a secret location, you will want to reside with this. There’s extra I may say about this however I’m attempting to maintain this reply centered.

[ad_2]