Age | Commit message | Author |
|
|
|
structure.
When running multiple concurrent unit tests, I observed multiple failures
in the hmac tests, which I ultimately tracked down to different clients
sharing the same hal_hmac_state struct.
hal_hash_initialize is called twice in hal_hmac_initialize (once to get
the state structure, then again if the supplied key is too long), and is
called in hal_hmac_finalize, to hash the digest with the supplied key. In
these subsequent cases, the caller supplies the state structure, which
hal_hash_initialize zeroes, but it doesn't set the allocated flag. This
marks an in-use struct as available, so it gets reassigned and
reinitialized, and Bad Things Happen for both clients that are trying to
use it.
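
The failure mode described above can be modeled in a few lines. This is a minimal sketch in Python rather than libhal's C, and every name in it (`HashState`, `POOL`, `alloc_state`, `initialize`, `keep_allocated`) is hypothetical. It only illustrates why zeroing a caller-supplied state without restoring its allocated flag lets the pool hand the same slot to a second client.

```python
from dataclasses import dataclass


@dataclass
class HashState:
    """Hypothetical stand-in for a statically allocated hash state slot."""
    allocated: bool = False
    data: bytes = b""


# Hypothetical fixed-size pool of state slots shared by all clients.
POOL = [HashState() for _ in range(2)]


def alloc_state():
    """Return the first free slot and mark it as in use."""
    for state in POOL:
        if not state.allocated:
            state.allocated = True
            return state
    raise RuntimeError("no free hash state")


def initialize(state=None, keep_allocated=True):
    """Initialize a state, allocating one if the caller supplied none.

    keep_allocated=False models the bug: the caller-supplied slot is
    zeroed, which also clears its allocated flag, and the flag is never
    set again.
    """
    if state is None:
        return alloc_state()
    # Zeroing wipes every field, including the allocated flag.
    state.data = b""
    state.allocated = False
    if keep_allocated:
        # The fix: a slot that is still in use stays marked as in use.
        state.allocated = True
    return state
```

With `keep_allocated=False`, a second call to `initialize()` with no argument returns the slot the first client is still using. With the default, it returns a different slot.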
|
|
At least for now, the speed tradeoff between software ModExp and our
Verilog ModExp core differs significantly between signature and key
generation. We don't really know why, but since key generation does
not need to be constant time, we split out control over whether to use
the software or FPGA implementation, so that we can use the FPGA for
signature while using software for key generation.
Revisit this if and when we figure out what the bottleneck is, as well
as any time that the FPGA core itself changes significantly.
|
|
Trying to make RSA key generation run in constant time is probably
both futile and unnecessary, so we can speed it up a bit by switching
the ModExpA7 core to use "fast" mode rather than "constant time" mode.
Sadly, while this change produces a measurable improvement, it
doesn't bring FPGA ModExp anywhere near the speed of the software
equivalent in this case. We don't really know why.
|
|
|
|
Initial version, very basic, RSA-only. Gussy up later.
|
|
|
|
Algorithm suggested by a note in Handbook of Applied Cryptography,
motivated by profiling of libtfm fp_isprime() function showing
something close to 50% of CPU time spent running Montgomery reductions
in the small primes test, before we even get to Miller-Rabin.
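
The commit message does not say which note it means, so the following is only a sketch of one common way to cut bignum reductions out of a small-primes test, not necessarily what this commit does. The idea: reduce the starting candidate modulo each small prime once, then step the candidate by 2 and update the remainders with ordinary small-integer arithmetic. The prime table and the function name `next_candidate` are hypothetical, and the sketch assumes candidates are far larger than the small primes.

```python
# Hypothetical short table; a real one would be much longer.
SMALL_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]


def next_candidate(start):
    """Return the first odd n >= start with no factor in SMALL_PRIMES.

    Assumes start is much larger than the largest small prime, so a
    zero remainder always means "composite".
    """
    n = start | 1  # force the candidate to be odd
    # The only big-number reductions: one per small prime, done once.
    residues = [n % p for p in SMALL_PRIMES]
    while any(r == 0 for r in residues):
        n += 2
        # Updating a remainder needs only small-integer arithmetic.
        residues = [(r + 2) % p for r, p in zip(residues, SMALL_PRIMES)]
    return n
```

A candidate that survives this filter would still need a probabilistic test such as Miller-Rabin.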
|
|
|
|
|
|
|
|
|
|
|
|
Except for torture tests, we never really used the hideously complex
multi-block capabilities of the ksng version of the flash keystore,
among other reasons because the only keys large enough to trigger the
multi-block code were slow enough to constitute torture on their own.
So we can preserve backwards compatibility simply by including the
former *chunk fields (renamed legacy* here) in the CRC and checking
for the expected single-block key values. We probably want to include
everything in the CRC in any case except when there's an explicit
reason to omit something, so this is cheap, just a bit obscure.
At some point in the future we can phase out support for the backwards
compatible values, but there's no particular hurry about it unless we
want to reuse those fields for some other purpose.
|
|
cryptech_backup is designed to help the user transfer keys from one
Cryptech HSM to another, but what is a user who has no second HSM
supposed to do for backup? The --soft-backup option enables a mode in
which cryptech_backup generates its own KEKEK instead of getting one
from the (nonexistent) target HSM. We make a best-effort attempt to
keep this soft KEKEK secure, by wrapping it with a symmetric key
derived from a passphrase, using AESKeyWrapWithPadding and PBKDF2,
but there's a limit to what a software-only solution can do here.
The --soft-backup code depends (heavily) on PyCrypto.
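
The passphrase-to-key step can be sketched with the Python standard library alone. This is not the cryptech_backup code (which uses PyCrypto); the function name `derive_kek`, the choice of SHA-256, the iteration count, and the key length are all assumptions made for illustration. The AES key wrap step itself is omitted because the standard library has no AES.

```python
import hashlib


def derive_kek(passphrase, salt, iterations=100_000, length=32):
    """Derive a symmetric wrapping key from a passphrase with PBKDF2.

    Hash, iteration count, and key length are illustrative assumptions,
    not values taken from cryptech_backup.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        iterations,
        dklen=length,
    )
```

The salt would be generated randomly (for example with `os.urandom(16)`) and stored next to the wrapped KEKEK, since the same salt is needed to derive the key again at restore time.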
|
|
We were XORing the low 32 bits of R[0] instead of the full 64 bits.
Makes no difference for small values of n, so we never detected it.
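
One reading of this bug, sketched below in Python, is that the step counter was truncated to 32 bits before being XORed into the 64-bit register. The function names are hypothetical and this is not the libhal code; it only shows why the two versions agree while the counter fits in 32 bits and diverge once it does not.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def xor_counter_full(register, t):
    """XOR the full 64-bit counter t into an 8-byte big-endian register."""
    value = int.from_bytes(register, "big") ^ (t & MASK64)
    return value.to_bytes(8, "big")


def xor_counter_low32(register, t):
    """Buggy variant: only the low 32 bits of t reach the register."""
    value = int.from_bytes(register, "big") ^ (t & MASK32)
    return value.to_bytes(8, "big")
```

For a counter such as `t = 6` both functions return the same bytes. For `t = 1 << 32` they differ, which only happens for very large inputs.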
|
|
The HSM itself should be detecting carrier drop on its RPC port, but I
haven't figured out where the DCD bit is hiding in the STM32 UART API,
and the muxd has to be involved in this in any case, since only the
muxd knows when an individual client connection has dropped. So, for
the moment, we handle all of this in the muxd.
|
|
Most keystore methods already followed this rule, but hal_ks_*_init()
and hal_ks_*_logout() were confused, in different ways.
|
|
|
|
The internal keystore API has changed enough since the point where the
"logout" branch forked that a plain merge would have no prayer of
compiling, much less running. So this merge goes well beyond manual conflict
resolution: it salvages the useful code from the "logout" branch, with
additional code as needed to reimplement the functionality. Sorry.
|
|
|
|
|
|
Cosmetic cleanup of pkey_slot along the way.
|
|
|
|
|
|
|
|
Need to refactor init sequence slightly (again), this time to humor
the bootloader, which has its own special read-only view of the PIN
block in the token keystore.
|
|
Still not yet expected to compile, much less run, but getting closer.
|
|
|
|
|
|
|
|
|
|
The Novena-era mmap()-based keystore is far enough out of date that
it's not worth maintaining (and we haven't been doing so): if we ever
need one again, it would be easier to rewrite it from scratch.
|
|
|
|
|
|
|
|
Support for variable-length keystore objects significantly complicates
the keystore implementation, including some serious code bloat
and a complex recovery algorithm to deal with crashes or loss of power
at exactly the wrong time. Perhaps we don't really need this?
So this is an experiment to see whether we can replace variable-length
keystore objects with fixed-length, perhaps with a compile time option
to let us make the fixed object length be 8192 bytes instead of 4096
bytes when needed to hold things like large RSA keys.
First pass on this is just throwing away nearly 1,000 lines of
excessively complex code. The result probably won't even compile yet,
but it's already significantly easier to read.
|
|
|
|
|
|
|
|
|
|
Turns out there are a couple of known minor bugs in PyCrypto's ASN.1
decoder, simple dumb things that never could have worked. Debian's
packaging includes a patch for these bugs, but for some reason the
patch is marked as not needing to be sent upstream, dunno why. So
these methods work fine, but only on Debian. Feh.
Simplest approach is to work around the bugs on all platforms,
particularly given that this is just unit test support code.
|
|
|
|
|
|
|
|
|
|
|
|
Consistent user complaints about HSM login taking too long.
Underlying issue has both superficial and fundamental causes.
Superficial: Our PBKDF2 implementation is slow. We could almost
certainly make it faster by taking advantage of partial
pre-calculation (see notes in code) and by reenabling use of FPGA hash
cores when checking passwords (which might require linking the
bootloader against a separate libhal build to avoid chicken-and-egg
problem of needing FPGA to log into console to configure FPGA).
Fundamental: The PBKDF2 iteration counts we used to use (10,000
minimum, 20,000 default) are in line with current NIST
recommendations. The new, faster values (1,000 and 2,000,
respectively) are not, or, rather, they're in line with what NIST
recommended a decade ago. Well, OK, maybe the Cortex-M4 is so slow
that it's living in the past, but still. The fundamental issue is
that anybody who can capture the encoded PIN can mount an offline
dictionary attack on it, so we'd like to make that expensive.
But the users are unhappy with the current behavior, so this change
falls back to the ancient technique of adding a delay (currently five
seconds, configurable at compile time) after a bad PIN, which makes it
painful to use the login function as an oracle but does nothing about
the offline dictionary attack problem.
Feh.
Note that users can still choose a higher iteration count, by setting
the iteration count via the console. It's just not the default out of
the box anymore.
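
The delay-after-failure behavior can be sketched as follows. This is Python rather than the HSM's C, and the names (`check_pin`, `BAD_PIN_DELAY_SECONDS`) and the choice of SHA-256 are assumptions made for illustration. The `sleep` parameter exists only so the delay can be replaced in tests.

```python
import hashlib
import hmac
import time

# Five seconds, per the commit message; the constant name is hypothetical.
BAD_PIN_DELAY_SECONDS = 5


def check_pin(pin, salt, iterations, expected, sleep=time.sleep):
    """Check a PIN against its stored PBKDF2 output.

    A wrong PIN costs the caller a fixed delay, which slows online
    guessing. It does nothing against an attacker who already holds
    the stored value and can guess offline.
    """
    derived = hashlib.pbkdf2_hmac(
        "sha256", pin, salt, iterations, dklen=len(expected)
    )
    # Constant-time comparison of the derived and stored values.
    ok = hmac.compare_digest(derived, expected)
    if not ok:
        sleep(BAD_PIN_DELAY_SECONDS)
    return ok
```

The iteration count is a parameter here for the same reason it is configurable on the console: lowering it speeds up every login, while raising it makes each offline guess more expensive.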
|
|
|