prng change feedback

ringsignature at riseup.net ringsignature at riseup.net
Fri Oct 6 08:19:17 UTC 2017


On 2017-10-06 01:23, Harald Welte wrote:
> Hi RS,
> 
>> After running the above tests and reading the related documentation, my
>> conclusion is that it would be reasonable to use syscall(SYS_getrandom,
>> &buf, bufsize, 0) as a suitable replacement for rand() in all cases
>> without any concrete security or performance concerns. The overhead is
>> also significantly less than importing or otherwise depending on
>> OpenSSL, GnuTLS,  NaCL, or probably even glibc.
> 
> Thanks again for your investigation.

You're welcome. Thanks for writing Free Software! I really appreciate
that your project is open to feedback from complete strangers!

My program is not optimized in any sense. I re-ran it for 0 to -1
iterations with the following outcome:

 time ./getrandom-exhaust
 getrandom status - byte size requested: 512 iteration: -1
 Current entropy available: 2575
 Original entropy estimate: 3338
 
 real	170m44.814s
 user	3m50.836s
 sys	166m52.948s

I think it would be interesting to run this on any supported system and
see whether the output is similar. All of the conclusions I've reached
rest on the notion that any decrease to the entropy pool is negligible.
Furthermore, the entropy pool might even increase during the attempted
exhaustion, which is probably an unrelated system-level side effect. In
either case, the pool will never underflow and getrandom() will always
return the requested number of bytes.

> So the conclusion is probably, for now:
> * use getrandom() with zero flags for TMSIs and other random identifiers
> * use getrandom() with zero flags for RAND challenges
> * use getrandom() with GRND_RANDOM flag for K/OP/OPc/Ki generation
> 

I don't entirely disagree, but the third option creates a more complex
case that I'm not sure is worth the trouble. I would very strongly
resist the urge to believe that the raw device bits are _better_ than
the output of the PRNG. The GRND_RANDOM flag will cause underflows of
the entire system's entropy pool, which, especially on an embedded
system, is a halt-and-catch-fire situation. The complexity around that
failure mode might cause failures that are otherwise avoided entirely.

>> It may make sense to use the platform's libc interface if it is available.
> 
> I would go for that, and I believe the current patch under discussion is doing
> exactly that: use getrandom() if available, and fall back to the raw syscall,
> if not.
> 

I'd suggest someone examine the failure modes of glibc before making
this decision. I expect it to be a completely sane interface that fails
nicely - though, if it doesn't, we need to consider how it fails.
Apparently it was a "long road" [0] [1] to supporting getrandom() in
glibc; 2.25 appears to be the first release to support it. Probably the
most interesting note for that version is: "* The getentropy and
getrandom functions, and the <sys/random.h> header file have been
added." This almost suggests to me that getentropy() with a
maximum buffer of 256 bytes is a reasonable interface to settle on.

As further reading - the people implementing Rust's random [2] have the
following view of the matter:

    Unix-like systems (Linux, Android, Mac OSX): read directly from
    /dev/urandom, or from the getrandom(2) system call if available.
    OpenBSD: calls getentropy(2)
    FreeBSD: uses the kern.arandom sysctl(2) mib
    Windows: calls RtlGenRandom, exported from advapi32.dll as
    SystemFunction036.
    iOS: calls SecRandomCopyBytes as /dev/(u)random is sandboxed.
    PNaCl: calls into the nacl-irt-random-0.1 IRT interface.

> Further falling back on rand() is probably a bad idea, we could simply
> make this a fatal runtime error ending the program and see if anyone
> ever reports that error to the mailing list.  If at all, the fall-back
> should not happen automatically, but should be explicitly requested at
> compile-time ("--enable-unsafe-rand"), or depend on an environment
> variable, or the like.  Basically to make sure the default configuration
> will be safe, and any unsafe operation requires explicit user
> intervention.
> 
>> It may also be worthwhile to try to ensure that buffer is indeed
>> changed.
> 
> good point, I like that.

I'm not entirely sure of the best method for checking but I would expect
something like this to at least fail safely in the event of a kernel
failure:

 - allocate a fixed-size buffer of 256 bytes, which seems to be the
   most portable size among all the OS-provided interfaces
 - memset the buffer with a known value such as
   #define THEMEASUREOFABYTE 0x47
 - memcmp each byte of the buffer with THEMEASUREOFABYTE to ensure
   that each byte compared is the same
 - request that the OS or glibc or libc fill that same buffer with
   random bytes for the size of the buffer
 - ensure that the number of bytes written is equal to the requested
   size
 - memcmp each byte of the buffer with THEMEASUREOFABYTE to ensure
   that each byte compared is *not* the same
 - if anything has gone wrong, return -1 and 0 (is this correct?)
 - in one possible future, hash the buffer in place to conceal the
   raw outputs from the PRNG
 - return the number of bytes written to the buffer, along with the
   buffer full of random bytes
 
That may make a better unit test than an actual wrapper function. It
seems that for portability, it's worth considering such a wrapper, as
it doesn't appear that every API has the same failure mode. There is of
course the unfortunate side effect of removing all 0x47s from your
buffer with such an approach, which leads to a suggestion of generating
a single random byte, and, well, it's 🐢🐢🐢 all the way down... :-)

Such an implementation should be constant time and take into
consideration other things that might leak the random data. I'm not sure
if memcmp is ideal for that reason. I'll do some additional
research/experimentation and share the results.

>> The small program below could also easily be modified to test that the
>> buffer is indeed completely filled with some new data and to
>> additionally hash the buffer before use in any cryptographic
>> application.
> 
> Yes, we could improve on that by using some hashing function on the
> result, rather than using the (cs)prng output directly.  But let's keep
> that as a possible future improvement for now.  Out of curiosity: What
> would you recommend?

For a 512 *bit* buffer, I think hashing with SHA512 would be a fine
transformation. If the attacker can guess the input for an output of
SHA512, there is a serious problem. In which case, the hashing might
mitigate practical, casual exploitation of the problem. It might not, of
course.

For buffers that are 256 or 512 bytes, I think it could be hashed in two
or four chunks but someone smarter than me should think about that
strategy before drawing a final conclusion. There are strategies using
block ciphers rather than hashing functions that might make sense for
larger buffers. Such a block cipher would essentially become a user
space RNG of a similar design to the one from djb that I shared
earlier in the thread. Once we're in block cipher territory, we're using
additional memory, we increase complexity, we may also have concerns
about internal key schedules, side channels, and so on. I suppose this
is another open research question for me. I'm sure someone else knows an
answer off hand but I'm not confident that I know how to do it
correctly.

Happy Hacking,
RS

[0] https://lwn.net/Articles/711013/
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=17252
[2] https://doc.rust-lang.org/rand/rand/os/struct.OsRng.html


