Age | Commit message (Collapse) | Author |
|
cryptech_backup is designed to help the user transfer keys from one
Cryptech HSM to another, but what is is a user who has no second HSM
supposed to do for backup? The --soft-backup option enables a mode in
which cryptech_backup generates its own KEKEK instead of getting one
from the (nonexistent) target HSM. We make a best-effort attempt to
keep this soft KEKEK secure, by wrapping it with a symmetric key
derived from a passphrase, using AESKeyWrapWithPadding and PBKDF2,
but there's a limit to what a software-only solution can do here.
The --soft-backup code depends (heavily) on PyCrypto.
|
|
cryptech_backup is designed to help the user transfer keys from one
Cryptech HSM to another, but what is is a user who has no second HSM
supposed to do for backup? The --soft-backup option enables a mode in
which cryptech_backup generates its own KEKEK instead of getting one
from the (nonexistent) target HSM. We make a best-effort attempt to
keep this soft KEKEK secure, by wrapping it with a symmetric key
derived from a passphrase, using AESKeyWrapWithPadding and PBKDF2,
but there's a limit to what a software-only solution can do here.
The --soft-backup code depends (heavily) on PyCrypto.
|
|
We were XORing the low 32 bits of R[0] instead of the full 64 bits.
Makes no difference for small values of n, so we never detected it.
|
|
The HSM itself should be detecting carrier drop on its RPC port, but I
haven't figured out where the DCD bit is hiding in the STM32 UART API,
and the muxd has to be involved in this in any case, since only the
muxd knows when an individual client connection has dropped. So, for
the moment, we handle all of this in the muxd.
|
|
Most keystore methods already followed this rule, but hal_ks_*_init()
and hal_ks_*_logout() were confused, in different ways.
|
|
|
|
The internal keystore API has changed enough since where the "logout"
branch forked that a plain merge would have no prayer of compiling,
must less running. So this merge goes well beyond manual conflict
resolution: it salvages the useful code from the "logout" branch, with
additional code as needed to reimplement the functionality. Sorry.
|
|
|
|
|
|
Cosmetic cleanup of pkey_slot along the way.
|
|
|
|
|
|
|
|
Need to refactor init sequence slightly (again), this time to humor
the bootloader, which has its own special read-only view of the PIN
block in the token keystore.
|
|
Still not yet expected to compile, much less run, but getting closer.
|
|
|
|
|
|
|
|
|
|
|
|
The Novena-era mmap()-based keystore is far enough out of date that
it's not worth maintaining (and we haven't been doing so): if we ever
need one again, it would be easier to rewrite it from scratch.
|
|
|
|
|
|
|
|
Support for variable-length keystore objects significantly complicates
the keystore implementation, including serious some serious code bloat
and a complex recovery algorithm to deal with crashes or loss of power
at exactly the wrong time. Perhaps we don't really need this?
So this is an experiment to see whether we can replace variable-length
keystore objects with fixed-length, perhaps with a compile time option
to let us make the fixed object length be 8192 bytes instead of 4096
bytes when needed to hold things like large RSA keys.
First pass on this is just throwing away nearly 1,000 lines of
excessively complex code. The result probably won't even compile yet,
but it's already significantly easier to read.
|
|
|
|
|
|
|
|
|
|
Turns out there are a couple of known minor bugs in PyCrypto's ASN.1
decoder, simple dumb things that never could have worked. Debian's
packaging includes a patch for these bugs, but for some reason the
patch is marked as not needing to be sent upstream, dunno why. So
these methods work fine, but only on Debian. Feh.
Simplest approach is to work around the bugs on all platforms,
particularly given that this is just unit test support code.
|
|
|
|
|
|
|
|
|
|
|
|
Consistent user complaints about HSM login taking too long.
Underlying issue has both superficial and fundamental causes.
Superficial: Our PBKDF2 implementation is slow. We could almost
certainly make it faster by taking advantage of partial
pre-calculation (see notes in code) and by reenabling use of FPGA hash
cores when when checking passwords (which mgiht require linking the
bootloader against a separate libhal build to avoid chicken-and-egg
problem of needing FPGA to log into console to configure FPGA).
Fundamental: The PBKDF2 iteration counts we used to use (10,000
minimum, 20,000 default) are in line with current NIST
recommendations. The new, faster values (1,000 and 2,000,
respectively) are not, or, rather, they're in line with what NIST
recommended a decade ago. Well, OK, maybe the Coretex M4 is so slow
that it's living in the past, but still. The fundamental issue is
that anybody who can capture the encoded PIN can mount an offline
dictionary attack on it, so we'd like to make that expensive.
But the users are unhappy with the current behavior, so this change
falls back to the ancient technique of adding a delay (currently five
seconds, configurable at compile time) after a bad PIN, which makes it
painful to use the login function as an oracle but does nothing about
the offline dictionary attack problem.
Feh.
Note that users can still choose a higher iteration count, by setting
the iteration count via the console. It's just not the default out of
the box anymore.
|
|
|
|
|
|
What I get for writing code while build and test environment is tied
up with a multi-day run testing something else.
|
|
|
|
|
|
I doubt this change will have any noticable effect, but it's another
theoretical race condition, might as well eliminate it.
|
|
Static code analysis (Doxygen call graph) detected a low-probability
race condition which could have triggered a deadlock on the keystore
mutex if the mkmif code returns with an error like HAL_ERROR_CORE_BUSY
when we're trying to fetch the KEK.
This is a knock-on effect of the awful kludge of backing up the KEK in
the keystore flash as an alternative to powering the MKM with a
battery as called for in the design. This code path should not exist
at all, but, for now, we avoid the deadlock by making it the caller's
responsibility to grab the keystore mutex before looking up the KEK.
|
|
|
|
|
|
|
|
success, but reduces the failure rate on a busy server.
|
|
|
|
conditions.
This manifested as hal_aes_keyunwrap() returning HAL_ERROR_CORE_BUSY, but
getting reported as HAL_OK, which led to HAL_ERROR_ASN1_PARSE_FAILED
when trying to parse the not-unwrapped der.
|
|
We're using non-blocking I/O in any case, might as well take advantage
of it to keep console output a little smoother.
|