Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NULL Checks and Error Handling #527

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

songlingatpan
Copy link

This PR primarily addresses potential NULL pointer dereference issues, enhances error handling, and resolves possible memory leaks in failure scenarios.

There are no new features or behavior changes introduced in this PR; the modifications are solely focused on improving stability.

@songlingatpan
Copy link
Author

@baentsch This PR is ready for review. Thanks

@@ -139,13 +142,6 @@ static int oqsx_match(const void *keydata1, const void *keydata2,
}

#ifdef NOPUBKEY_IN_PRIVKEY
/* Now this is a "leap of faith" logic: If a public-only PKEY and a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you suggest dropping this comment @songlingatpan ?

@@ -178,15 +174,13 @@ static int oqsx_match(const void *keydata1, const void *keydata2,
(key1->pubkey != NULL && key2->pubkey == NULL) ||
((key1->tls_name != NULL && key2->tls_name != NULL) &&
strcmp(key1->tls_name, key2->tls_name))) {
// special case now: If domain parameter matching
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here -- why remove comment?

int ok = 1;

OQS_KM_PRINTF("OQSKEYMGMT: export called\n");

/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removed?

if (params == NULL) {
ok = 0;
goto err;
if (ok) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not also asserting tmpl!=NULL?

}
// not passing in params is no error; subsequent operations may fail,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why deleted?


err_init_ecx:
return ret;
}

/* Re-create OQSX_KEY from encoding(s): Same end-state as after ken-gen */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please bring back comment as it aids understanding.

Copy link
Member

@baentsch baentsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the substantial amount of work, @songlingatpan ! This clearly improves the code resilience. Due to the size of the PR I had to stop in the middle of the review but would like to ask you to bring back the comments deleted (or explain why you deleted them) as well as provide an explanation what you did to the composite logic so I can better follow all changes there. Thanks!

@songlingatpan
Copy link
Author

Thanks for the substantial amount of work, @songlingatpan ! This clearly improves the code resilience. Due to the size of the PR I had to stop in the middle of the review but would like to ask you to bring back the comments deleted (or explain why you deleted them) as well as provide an explanation what you did to the composite logic so I can better follow all changes there. Thanks!

Sorry @baentsch. I will add comments back.
The oqsx_key_op function was quite large, so I refactored it into smaller, more functions such as allocate_and_copy_keymaterial, oqsx_key_process_composite_keys, and process_private_key. However, the logic remains unchanged.

Thanks a lot again for your incredible work on the OQS provider.

Copy link
Member

@SWilson4 SWilson4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, @songlingatpan! This is a substantial chunk of work.

Before I get too far into the review, I wanted to ask something. There seem to be a great number of NULL checks added. Are all of these necessary? From a code hygiene perspective, I feel that it's better to have clearly documented guarantees about NULL handling (which functions MUST be null-checked, which functions handle NULL gracefully) and then rely on those guarantees instead of including redundant checks. IMO, this makes it easier to reason about code correctness and safety.

It would also be good to hear @baentsch's perspective on this as the primary developer and maintainer of the codebase.

if (pkemctx == NULL || vkem == NULL || !oqsx_key_up_ref(vkem))
return 0;
oqsx_key_free(pkemctx->kem);
if (pkemctx->kem != NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing technically wrong with putting this guard here, but it's not necessary—oqsx_key_free handles null pointers gracefully, as (in my opinion) any "good" freeing function should. I would vote to remove the guard and add a comment above the oqsx_key_free function stating that it must handle NULL.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments.
We see quite some crashes due to dereference NULL pointer in our load testing. That is why we aggressively add NULL checks.
I will remove unnecessary NULL checks.

OPENSSL_free(pkemctx);
if (pkemctx != NULL) {
oqsx_key_free(pkemctx->kem);
OPENSSL_free(pkemctx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as below: OPENSSL_free handles NULL gracefully, and imo it's cleaner to not perform the check.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NULL check is for prevent from NULL pointer for oqsx_key_free(pkemctx->kem), not for OPENSSL_free/free().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I should have been more clear: I meant to write that the OPENSSL_free call could be moved outside the conditional block, not that the check itself could be removed entirely. Certainly it's still necessary before the dereference.

@baentsch
Copy link
Member

this makes it easier to reason about code correctness and safety.

It would also be good to hear @baentsch's perspective on this as the primary developer and maintainer of the codebase.

I agree -- the ...free.... functions typically handle NULL objects gracefully. And I also agree, too many safety checks make the code hard to read.

But then again, there's lots of other code that's harder to read/understand/maintain than these added checks.

Maybe using macros might be a compromise?

Signed-off-by: Songling Han <[email protected]>
@songlingatpan
Copy link
Author

this makes it easier to reason about code correctness and safety.

It would also be good to hear @baentsch's perspective on this as the primary developer and maintainer of the codebase.

I agree -- the ...free.... functions typically handle NULL objects gracefully. And I also agree, too many safety checks make the code hard to read.

But then again, there's lots of other code that's harder to read/understand/maintain than these added checks.

Maybe using macros might be a compromise?

Thanks for the comments.
We see quite some crashes due to dereference NULL pointer in our load testing. That is why we aggressively add NULL checks.
I will remove unnecessary NULL checks.

@songlingatpan
Copy link
Author

@baentsch @SWilson4
When we encountered crashes, we reviewed the related functions and added NULL checks. I agree that, in some cases, we may have been overly cautious with these checks. The reason for this approach is that we were consistently seeing crashes in various places, so we aimed to go through the files one by one and add protective code.

I will go ahead and remove the unnecessary NULL checks.

Thank you again for the excellent work you've done on building the OQS code.

`#0 0x00007ffff67c20f2 in evp_md_init_internal (ctx=0x0, type=0xd4661daf00, params=0x0, impl=0x0) at crypto/evp/digest.c:166

#1 0x00007ffff67c26ee in EVP_DigestInit_ex (ctx=, type=, impl=) at crypto/evp/digest.c:385

#2 0x00007ffff1a9ac63 in do_hash (md=0xd4661daf00, inplen=64, input=0x7fffdfd8cde0 "s\315~jQ\261\224 \032\025\346\243\201\354\340\240\004:*7\020o`e\177\326d\263Z\215\251\253\225E\002\361\340L\305\b\334\nD;\365d5\004\005\066\302\060\264\024\060<\n\331", output=0x7fffdfd8ce20 "") at ../src/common/sha3/ossl_sha3.c:22

#3 OQS_SHA3_sha3_512 (output=0x7fffdfd8ce20 "", input=0x7fffdfd8cde0 "s\315~jQ\261\224 \032\025\346\243\201\354\340\240\004:*7\020o`e\177\326d\263Z\215\251\253\225E\002\361\340L\305\b\334\nD;\365d5\004\005\066\302\060\264\024\060<\n\331", inplen=64) at ../src/common/sha3/ossl_sha3.c:110

#4 0x00007ffff1936a32 in pqcrystals_kyber1024_ref_dec (ss=0xd46bc1fa40 "", ct=0xd46970715a "\362,\325\005\240\317_NTC\267F3\233\325\234k\020\034\255\253\277\032\355\244\302^F\v*\312\334\315B\030\232\263\327j-\270\233~\236\352\304\260G'\016\252\217\274/}\275\r\ad\321!e5\305\356/\304\345oB \v\271\206\324\355/\346\211\302\236Z\337\311\230\243\334\370\314\246\354zD4\327{\373\063\315\366\372\347\233\233 \232\031\215<w\321\267/\031<4\021\226\065\360\242\226\240\347\063\267E\241I\301R\213\201\224\364/\354+\027\017\033\227\224Q"\327\311\233_\300\217\315H\343\370\273\346N\337\226=\r\305c\274\347t\332U\036\331\200\370l\324MvAi\365\356E\005\243\025mLK\277x4W\371i\235t4\211"..., sk=0xd46630c400 "l%G\342

{eD\232\062\363dh@\003V\213\311\244\375\303N4D\236/[\\020)\212\240\310\363\200=\222V\a\304su\003\206\237|s^)\313-\n\315\334v5Q\262\060\200\273i\322\227}
r`

`#0 OQS_SHA3_shake128_inc_squeeze (output=output@entry=0x7fffe1cb4590 "", outlen=outlen@entry=504, state=state@entry=0x7fffe1cb4588) at ../src/common/sha3/ossl_sha3.c:219

#1 0x00007ffff19362d1 in pqcrystals_kyber1024_ref_gen_matrix (a=a@entry=0x7fffe1cb5ff0, seed=seed@entry=0x7fffe1cb7ff0 "3nQ\371\341\231\063\271\067\305\370\240\264\306yTz\373\20

6O\353Z\251\030\001\t\006\255\376\210v!", transposed=transposed@entry=0) at ../src/kem/kyber/pqcrystals-kyber_kyber1024_ref/indcpa.c:179

#2 0x00007ffff1936454 in pqcrystals_kyber1024_ref_indcpa_keypair (pk=pk@entry=0xd46821a200 "", sk=sk@entry=0xd468509d00 "") at ../src/kem/kyber/pqcrystals-kyber_kyber1024_ref/in

dcpa.c:220

#3 0x00007ffff1936885 in pqcrystals_kyber1024_ref_keypair (pk=0xd46821a200 "", sk=0xd468509d00 "") at ../src/kem/kyber/pqcrystals-kyber_kyber1024_ref/kem.c:27

#4 0x00007ffff1f0afa5 in oqsx_key_gen_oqs (gen_kem=1, key=0xd036b634c0) at ../oqsprov/oqsprov_keys.c:1735

#5 oqsx_key_gen_oqs (gen_kem=1, key=0xd036b634c0) at ../oqsprov/oqsprov_keys.c:1732

#6 oqsx_key_gen (key=key@entry=0xd036b634c0) at ../oqsprov/oqsprov_keys.c:1858

#7 0x00007ffff1f0e85d in oqsx_genkey (gctx=0xd467fcc9c0) at ../oqsprov/oqs_kmgmt.c:613

`

@baentsch
Copy link
Member

baentsch commented Oct 2, 2024

@songlingatpan Thanks again for the diligent addition of safeguards to the code. However, I'm not sure I understand your latest comment: What should we do with these line items? Please also re-base to latest main. #524 changed a bit too much impacting other PRs :-(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants