Age | Commit message (Collapse) | Author |
|
General technique here suggested by Peter Gutman. If I got the math
wrong, blame me, not Peter.
This has not yet been tested to confirm that it returns correct
results when using the blinding factors cache, and preliminary timing
results suggest that we may be chasing the wrong performance problem.
Unclear whether we'll ever really want to integrate this change, but
pushing it on a branch to get it into repository history.
If we do end up using this, the blinding factors cache will need minor
redesign, principally to use the external SDRAM because main memory
has gotten kind of full. Some way to clear the cache when restarting
the HSM would be nice, probably requires a `hal_rsa_init()` function.
|
|
|
|
Copy ContextManagedUnpacker from latest version of libhal.py so that
this script won't depend on the current development code.
|
|
At the moment this only handles RSA keys, and can only handle one size
of key at a time. More bells and whistles will follow eventually,
now that the basic asynchronous API to our RPC protocol works.
|
|
Failing to clear the temporary buffer used to transfer bits from the
TRNG into a bignum was a real leak of something very close to keying
material, albeit only onto the local stack where it was almost certain
to have been overwritten by subsequent operations (generation of other
key components, wrap and PKCS #8 encoding) before pkey_generate_rsa()
ever returned to its caller. Still, bad coder, no biscuit.
Failing to clear the remainders array was probably harmless, but
doctrine says clear it anyway.
|
|
contextlib is cute, but incompatible with other coroutine schemes like
Tornado, so just write our own context manager for xdrlib.Unpacker.
|
|
Uncoordinated attempts to allocate two modexpa7 cores leads to deadlock if
multiple clients try to do concurrent RSA signing operations.
The simplest solution (back off and retry) could theoretically lead to
resource starvation, but we haven't seen it in actual testing.
|
|
This branch was sitting for long enough that master had been through a
cleanup pass, so beware of accidental reversions.
|
|
|
|
|
|
|
|
values.
|
|
|
|
|
|
|
|
|
|
Snapshot of mostly but not entirely working code to include the extra
ModExpA7 key components in the keystore. Need to investigate whether
a more compact representation is practical for these components, as
the current one bloats the key object so much that a bare 4096-bit key
won't fit in a single hash block, and there may not be enough room for
PKCS #11 attributes even for smaller keys.
If more compact representation not possible or insufficient, the other
option is to double the size of a keystore object, making it two flash
subsectors for a total of 8192 octets. Which would of course halve
the number of keys we can store and require a bunch of little tweaks
all through the ks code (particularly flash erase), so definitely
worth trying for a more compact representation first.
|
|
|
|
|
|
|
|
|
|
|
|
Work in progress. Probably won't even compile, much less run.
Requires corresponding new core/math/modexpa7 core.
No support (yet) for ASN.1 encoding of speedup factors or storage of
same in keystore.
No support (yet) for running CRT algorithm in parallel cores.
Minor cleanup of ancient bus I/O code, including EIM and I2C bus code
we'll probably never use again.
|
|
structure.
When running multiple concurrent unit tests, I observed multiple failures
in the hmac tests, which I ultimately tracked down to different clients
sharing the same hal_hmac_state struct.
hal_hash_initialize is called twice in hal_hmac_initialize (once to get
the state structure, then again if the supplied key is too long), and is
called in hal_hmac_finalize, to hash the digest with the supplied key. In
these subsequent cases, the caller supplies the state structure, which
hal_hash_initialize zeroes, but it doesn't set the allocated flag. This
marks an in-use struct as available, so it gets reassigned and
reinitialized, and Bad Things Happen for both clients that are trying to
use it.
|
|
At least for now, the speed tradeoff between software ModExp and our
Verilog ModExp core differs significantly between signature and key
generation. We don't really know why, but since key generation does
not need to be constant time, we split out control over whether to use
the software or FPGA implementation, so that we can use the FPGA for
signature while using software for key generation.
Revisit this if and when we figure out what the bottleneck is, as well
as any time that the FPGA core itself changes significantly.
|
|
Trying to make RSA key generation run in constant time is probably
both futile and unnecessary, so we can speed it up a bit by switching
the ModExpA7 core to use "fast" mode rather than "constant time" mode.
Sadly, while this change produces a measureable improvement, it
doesn't bring FGPA ModExp anywhere near the speed of the software
equivalent in this case. Don't really know why.
|
|
|
|
Initial version, very basic, RSA-only. Gussy up later.
|
|
|
|
Algorithm suggested by a note in Handbook of Applied Cryptography,
motivated by profiling of libtfm fp_isprime() function showing
something close to 50% of CPU time spent running Montgomery reductions
in the small primes test, before we even get to Miller-Rabin.
|
|
|
|
|
|
|
|
|
|
|
|
Except for torture tests, we never really used the hideously complex
multi-block capabilities of the ksng version of the flash keystore,
among other reasons because the only keys large enough to trigger the
multi-block code were slow enough to constitute torture on their own.
So we can preserve backwards compatabliity simply by including the
former *chunk fields (renamed legacy* here) in the CRC and checking
for the expected single-block key values. We probably want to include
everything in the CRC in any case except when there's an explicit
reason omit something, so, this is cheap, just a bit obscure.
At some point in the future we can phase out support for the backwards
compatible values, but there's no particular hurry about it unless we
want to reuse those fields for some other purpose.
|
|
cryptech_backup is designed to help the user transfer keys from one
Cryptech HSM to another, but what is is a user who has no second HSM
supposed to do for backup? The --soft-backup option enables a mode in
which cryptech_backup generates its own KEKEK instead of getting one
from the (nonexistent) target HSM. We make a best-effort attempt to
keep this soft KEKEK secure, by wrapping it with a symmetric key
derived from a passphrase, using AESKeyWrapWithPadding and PBKDF2,
but there's a limit to what a software-only solution can do here.
The --soft-backup code depends (heavily) on PyCrypto.
|
|
cryptech_backup is designed to help the user transfer keys from one
Cryptech HSM to another, but what is is a user who has no second HSM
supposed to do for backup? The --soft-backup option enables a mode in
which cryptech_backup generates its own KEKEK instead of getting one
from the (nonexistent) target HSM. We make a best-effort attempt to
keep this soft KEKEK secure, by wrapping it with a symmetric key
derived from a passphrase, using AESKeyWrapWithPadding and PBKDF2,
but there's a limit to what a software-only solution can do here.
The --soft-backup code depends (heavily) on PyCrypto.
|
|
We were XORing the low 32 bits of R[0] instead of the full 64 bits.
Makes no difference for small values of n, so we never detected it.
|
|
The HSM itself should be detecting carrier drop on its RPC port, but I
haven't figured out where the DCD bit is hiding in the STM32 UART API,
and the muxd has to be involved in this in any case, since only the
muxd knows when an individual client connection has dropped. So, for
the moment, we handle all of this in the muxd.
|
|
Most keystore methods already followed this rule, but hal_ks_*_init()
and hal_ks_*_logout() were confused, in different ways.
|
|
|
|
The internal keystore API has changed enough since where the "logout"
branch forked that a plain merge would have no prayer of compiling,
must less running. So this merge goes well beyond manual conflict
resolution: it salvages the useful code from the "logout" branch, with
additional code as needed to reimplement the functionality. Sorry.
|
|
|
|
|
|
Cosmetic cleanup of pkey_slot along the way.
|
|
|
|
|
|
|
|
Need to refactor init sequence slightly (again), this time to humor
the bootloader, which has its own special read-only view of the PIN
block in the token keystore.
|