Age | Commit message (Collapse) | Author |
|
The internal keystore API has changed enough since where the "logout"
branch forked that a plain merge would have no prayer of compiling,
must less running. So this merge goes well beyond manual conflict
resolution: it salvages the useful code from the "logout" branch, with
additional code as needed to reimplement the functionality. Sorry.
|
|
Cosmetic cleanup of pkey_slot along the way.
|
|
Need to refactor init sequence slightly (again), this time to humor
the bootloader, which has its own special read-only view of the PIN
block in the token keystore.
|
|
Still not yet expected to compile, much less run, but getting closer.
|
|
|
|
The Novena-era mmap()-based keystore is far enough out of date that
it's not worth maintaining (and we haven't been doing so): if we ever
need one again, it would be easier to rewrite it from scratch.
|
|
Support for variable-length keystore objects significantly complicates
the keystore implementation, including serious some serious code bloat
and a complex recovery algorithm to deal with crashes or loss of power
at exactly the wrong time. Perhaps we don't really need this?
So this is an experiment to see whether we can replace variable-length
keystore objects with fixed-length, perhaps with a compile time option
to let us make the fixed object length be 8192 bytes instead of 4096
bytes when needed to hold things like large RSA keys.
First pass on this is just throwing away nearly 1,000 lines of
excessively complex code. The result probably won't even compile yet,
but it's already significantly easier to read.
|
|
|
|
Consistent user complaints about HSM login taking too long.
Underlying issue has both superficial and fundamental causes.
Superficial: Our PBKDF2 implementation is slow. We could almost
certainly make it faster by taking advantage of partial
pre-calculation (see notes in code) and by reenabling use of FPGA hash
cores when when checking passwords (which mgiht require linking the
bootloader against a separate libhal build to avoid chicken-and-egg
problem of needing FPGA to log into console to configure FPGA).
Fundamental: The PBKDF2 iteration counts we used to use (10,000
minimum, 20,000 default) are in line with current NIST
recommendations. The new, faster values (1,000 and 2,000,
respectively) are not, or, rather, they're in line with what NIST
recommended a decade ago. Well, OK, maybe the Coretex M4 is so slow
that it's living in the past, but still. The fundamental issue is
that anybody who can capture the encoded PIN can mount an offline
dictionary attack on it, so we'd like to make that expensive.
But the users are unhappy with the current behavior, so this change
falls back to the ancient technique of adding a delay (currently five
seconds, configurable at compile time) after a bad PIN, which makes it
painful to use the login function as an oracle but does nothing about
the offline dictionary attack problem.
Feh.
Note that users can still choose a higher iteration count, by setting
the iteration count via the console. It's just not the default out of
the box anymore.
|
|
|
|
What I get for writing code while build and test environment is tied
up with a multi-day run testing something else.
|
|
|
|
|
|
Static code analysis (Doxygen call graph) detected a low-probability
race condition which could have triggered a deadlock on the keystore
mutex if the mkmif code returns with an error like HAL_ERROR_CORE_BUSY
when we're trying to fetch the KEK.
This is a knock-on effect of the awful kludge of backing up the KEK in
the keystore flash as an alternative to powering the MKM with a
battery as called for in the design. This code path should not exist
at all, but, for now, we avoid the deadlock by making it the caller's
responsibility to grab the keystore mutex before looking up the KEK.
|
|
|
|
pkey_open() now looks in both keystores rather than requiring the user
to know. The chance of collision with randomly-generated UUID is low
enough that we really ought to be able to present a single namespace.
So now we do.
pkey_match() now takes a couple of extra arguments which allow a
single search to cover both keystores, as well as matching for
specific key flags. The former interface was pretty much useless for
anything involving flags, and required the user to issue a separate
call for each keystore.
User wheel is now exempt from the per-session key lookup constraints,
Whether this is a good idea or not is an interesting question, but the
whole PKCS #11 derived per-session key thing is weird to begin with,
and having keystore listings on the console deliberately ignore
session keys was just too confusing.
|
|
Now that we use PKCS #8 format for private keys, all key formats we
use include ASN.1 AlgorithmIdentifier field describing the key, so
specifying key type and curve as arguments to hal_rpc_pkey_load() is
neither necessary nor particularly useful.
|
|
Still missing Python script to drive backup process, and need to do
something about setting the EXPORTABLE key flag for this to be useful.
|
|
|
|
|
|
Renamed multiplexer to cryptech_muxd, since it now handles both RPC and CTY.
Added new program cryptech_console to act as client for CTY multiplexer.
Might want to add console logging capability eventually, not today.
Probably want to incorporate UART probing (what cryptech_probe does
now) eventually, also not today.
|
|
|
|
|
|
PKCS #11 supports zero-length attributes (eg, CKA_LABEL) so hack of
using zero length attribute as NIL value won't work, instead we use a
slightly more portable version of the hack PKCS #11 uses (PKCS #11
stuffs -1 into a CK_ULONG, we stuff 0xFFFFFFFF into a uint32_t).
ks_attribute.c code was trying too hard and tripping over its own
socks. Instead of trying to maintain attributes[] in place during
modification, we now perform the minimum necessary change then re-scan
the block. This is (very slightly) slower but more robust, both
because the scan code has better error checking and because it's the
scan code that we want to be sure is happy before committing a change.
Rename hal_rpc_pkey_attribute_t to hal_pkey_attribute_t.
|
|
|
|
Calling hal_rpc_pkey_get_attributes() with attribute_buffer_len = 0
now changes the return behavior so that it reports the lengths of
attributes listed in the query, with a length of zero for attributes
not present at all. This is mostly to support C_GetAttributeValue()
in PKCS #11, but we also use it to make the Python interface a bit
kinder to the user.
|
|
Wiping the keystore flash requires reinitializing the keystore, but we
don't want to allocate new static memory when we do this.
|
|
hal_rpc_pkey_list() was a simplistic solution that worked when the
keystore only supported a handful of keys and we needed a quick
temporary solution in time for a workshop. It doesn't handle large
numbers of keys well, and while we could fix that, all of its
functionality is now available via more robust API functions, so
simplifying the API by deleting it seems best.
Since this change required mucking with dispatch vectors yet again, it
converts them to use C99 "designated initializer" syntax.
|
|
|
|
This check made sense when attribute deletion was a separate
operation, but now that it has been folded into set_attributes(), this
check was worse than useless.
|
|
pkey attribute API is now just set_attributes() and get_attributes().
|
|
Passes minimal unit-testing and the same minimal tests report that it
does deliver the desired performance speed-up. More testing and much
cleanup still needed.
Attribute API not quite stable yet, we're probably going to want to
remove all the singleton attribute operations from the RPC protocol,
and it turns out that ks_delete_attributes() has enough code in common
with ks_set_attributes() that it makes more sense to handle the former
as a special case of the latter.
|
|
This is not yet complete, only the ks_volatile driver supports it,
ks_flash will be a bit more complicated and isn't written yet.
At the moment, this adds a complete duplicate set of
{set,get,delete}_attributes() functions in parallel to the earlier
{set,get,delete}_attribute() functions. We will almost certainly want
to get rid of the duplicates, probably (but not necessarily) the
entire single-attribute suite. At the moment, though, we want both
sets so we can compare execution speeds of the two sets of functions.
|
|
Incidental minor refactoring of hal_rpc_server_dispatch().
|
|
The debugging code was for tracking down what turned out to be a race
condition in the Alpha's flash driver code (see sw/stm32); much of
this was temporary, and will be removed in a (near) future commit, but
some of the techniques were useful and belong in the repository in
case we need to pull them back for something similar in the future.
hal_ks_index_fsck() attempts to diagnose all the things I found wrong
in the ks_flash index after one long series of errors. As presently
written, it doesn't attempt to fix anything, just diagnose errors: the
intent is that we can call this, before and after every modification
if necessary, to poinpoint exactly which calls introduce errors. Once
things stablize a bit, we may want to crank down the number of calls
to this (it's a bit expensive, since it checks the entire index), and
perhaps add the ability to clean up whatever errors it might find; the
latter might be a good candidate for a CLI command.
|
|
In retrospect it's obvious that this never needed to be an
input/output argument, as its value will always be the same as the
last value in the returned array. Doh. So simplify the RPC and call
sequence slightly by removing the unnecessary output value.
|
|
Passes PKCS #11 "make test" but nothing uses the new attribute code yet.
Refactored some of the flash block update code.
Attribute code is annoyingly verbose, might be possible to refactor
some of that.
|
|
Mostly this is another checkpoint (still passes PKCS #11 "make test").
ks_volatile.c now contains support for per-session object visibility;
this may need more work to support things like a CLI view of all
objects regardless of session. Adding this required minor changes to
the keystore and pkey APIs, mostly because sessions are per-client.
ks_volatile.c also contains an untested first cut at attribute
support. Attribute support in ks_flash.c still under construction.
|
|
RPC calls which pass a pkey handle don't need to pass a session
handle, because the session handle is already in the HSM's pkey slot
object; pkey RPC calls which don't pass a pkey argument do need to
pass a session handle.
This change percolates down to the keystore driver, because only the
keystore driver knows whether that particular keystore cares about
session handles.
|
|
This is mostly to archive a commit where PKCS #11 "make test" still
works after converting the ks_volatile code to use SDRAM allocated at
startup instead of (large) static variables.
The attribute code itself is incomplete at this point.
|
|
The main reason for supporting multi-block objects is to allow the
PKCS #11 code to attach more attributes than will fit comfortably in a
single flash block. This may turn out to be unnecessary once we've
fleshed out the attribute storage and retrieval code; if so, we can
simplify the code, but this way the keystore won't impose arbitrary
(and somewhat inscrutable) size limits on PKCS #11 attributes for
large keys.
This snapshot passes light testing (PKCS #11 "make test" runs), but
the tombstone recovery code in ks_init() is a bit involved, and needs
more testing with simulated failures (probably induced under GDB).
|
|
|
|
|
|
|
|
|
|
Whack masterkey code to meet libhal coding standards, such as they
are.
Started layout of new ks_flash data structures but no changes to
functions or flash usage yet.
MKM initialization from flash placed under compile-time conditional
with warning because it's a dangerous kludge that should go away.
Started getting rid of obsolete keystore code; ks_mmap.c kept for now,
until I get around to merging the useful bits into ks_volatile.
|
|
This is an open source C99 CRC-32 implementation generated by pycrc,
see notes in source on copyright status and pycrc options used.
crc32.c contains two different implementations of the CRC-32 algorithm
with the same API, one optimized for speed, the other optimized for
much smaller code space at the expense of speed. We use the fast
implementation by default, but maybe the small implementation will be
useful, eg, in the bootloader. Remove the extra later if this turns
out to have been a waste of time.
|
|
|
|
|
|
Now that key names are UUIDs generated by the HSM, there's no real
need to specify type key type when looking up a key, and removing the
`type` argument allows a few simplifications of both the internal
keystore API and of client code calling the public RPC API.
|