Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEK rotation improvements. #2584

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

corrideat
Copy link
Member

@corrideat corrideat commented Feb 1, 2025

Closes #2570.

Testing instructions:

  1. Create three users (alice, bob, carol)
  2. alice creates a group G1
  3. bob creates a group G2
  4. carol joins G1
  5. carol leaves G1. There are no warning or errors for /rotateKeys. The pek for carol (in _vm.authorizedKeys) has been rotated.
  6. carol joins G2
  7. bob observes that the username for carol appears in G2.

@corrideat corrideat requested a review from taoeffect February 1, 2025 22:22
@corrideat corrideat self-assigned this Feb 1, 2025
backend/push.js Outdated
@@ -239,6 +239,7 @@ export const postEvent = async (subscription: Object, event: ?string): Promise<v
})

if (!req.ok) {
console.debug('Error sending push notification', subscription.id, req.status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be console.debug or something higher like log or info?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I thought debug is appropriate since it's by definition a user-defined endpoint. Nothing stops a user from reporting an endpoint like https://www.bankofamerica.com/, and since it's a third-party endpoint, we also can't control the SLA.

Copy link

cypress bot commented Feb 2, 2025

group-income    Run #3884

Run Properties:  status check passed Passed #3884  •  git commit 8103a06b0e ℹ️: Merge bdf61237a1d8c4b388141391892b0e98c51be000 into 5b89d7022aca2200db3d2fb01f94...
Project group-income
Branch Review 2570-failure-to-show-username-for-contract-upon-join
Run status status check passed Passed #3884
Run duration 12m 13s
Commit git commit 8103a06b0e ℹ️: Merge bdf61237a1d8c4b388141391892b0e98c51be000 into 5b89d7022aca2200db3d2fb01f94...
Committer Ricardo Iván Vieitez Parra
View all properties for this run ↗︎

Test results
Tests that failed  Failures 0
Tests that were flaky  Flaky 0
Tests that did not run due to a developer annotating a test with .skip  Pending 10
Tests that did not run due to a failure in a mocha hook  Skipped 0
Tests that passed  Passing 115
View all changes introduced in this branch ↗︎

@@ -947,6 +947,8 @@ export default ({
}
sbp('gi.actions/identity/kv/setChatRoomReadUntil', {
contractID: chatRoomID, messageHash, createdHeight
}).catch(e => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugfix

@@ -537,6 +538,20 @@ export default (sbp('sbp/selectors/register', {
const keyId = findSuitableSecretKeyId(contractIDOrState, permissions, purposes, ringLevel, allowedActions)
return keyId
},
'chelonia/contract/setPendingKeyRevocation': function (contractID: string, names: string[]) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improvement as per contract TODOs.

@@ -1693,9 +1708,17 @@ async function outEncryptedOrUnencryptedAction (
if (opType === GIMessage.OP_ACTION_ENCRYPTED && !params.encryptionKeyId) {
throw new Error('OP_ACTION_ENCRYPTED requires an encryption key ID be given')
}
if (params.encryptionKey) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new parameter is needed to designate a raw key to use for encryption in OP_ATOMIC. The current implementation of GIMessage / encryptedData / signedData / chelonia isn't advanced enough to automatically switch to the new key on outgoing messages. The workaround is designating the key as a raw key.

@@ -108,7 +108,7 @@ const decryptData = function (height: number, data: any, additionalKeys: Object,
const key = additionalKeys[eKeyId]

if (!key) {
throw new ChelErrorDecryptionKeyNotFound(`Key ${eKeyId} not found`)
throw new ChelErrorDecryptionKeyNotFound(`Key ${eKeyId} not found`, { cause: eKeyId })
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the cause here helps identify the key that failed.

@@ -113,7 +113,7 @@ Cypress.Commands.add('getByDT', (element, otherSelector = '') => {
cySbpCheckCommand('giNoPendingGroupKeyShares', (sbp) => {
const state = sbp('state/vuex/state')
const pending = Object.keys(state.contracts)
.filter(contractID => state[contractID]._vm.type === 'gi.contracts/group')
.filter(contractID => state[contractID]._vm?.type === 'gi.contracts/group')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason, sometimes _vm was undefined in the tests, causing this to fail.

@@ -2091,6 +2116,15 @@ const handleEvent = {
}
this.config.reactiveSet(state.contracts[contractID], 'HEAD', hash)
this.config.reactiveSet(state.contracts[contractID], 'height', height)
if (missingDecryptionKeyId) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a log of missed decryption keys, which can be used to more accurately determine when a contract needs to be re-synced.

@@ -2046,17 +2066,22 @@ const handleEvent = {
return acc
}

const actionsOpV = ((msg: any): GIOpAtomic).reduce(reducer, [])
const actionsOpV = (Number.isFinite(errorIndex)
// $FlowFixMe[incompatible-call]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The side-effects sometimes get called if there was an error (i.e., ChelErrorWarning has this behaviour). This only executes side-effects up to the point where process succeeded in OP_ATOMIC.

Copy link
Member

@taoeffect taoeffect Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so based on the call here's how we've agreed conceptually this code should work:

There are two types of "errors" that we need to consider:

  1. "Ignoring" errors
  2. "Failure" errors

Example: OP_KEY_ADD

  1. IGNORING: an error is thrown because we wanted to add a key but the key we wanted to add is already there. This is not a hard error, it's an ignoring error. We don't care that the operation failed in this case because the intent was accomplished.
  2. FAILURE: an error is thrown because we wanted to add a key that doesn't exist.

Example: OP_ACTION_ENCRYPTED

  1. IGNORING: An error is thrown because we don't have the key to decrypt the action. We ignore it.
  2. FAILURE: An error is thrown by the process function during processing.

Handling these in OP_ATOMIC

  • ALL errors of class "IGNORING" should be ignored. They should not impact our ability to process the rest of the operations in the OP_ATOMIC. No matter how many of these are thrown, it doesn't affect the rest of the operations.
  • ANY error of class "FAILURE" will call the rest of the operations to fail and the state to be reverted to prior to the OP_ATOMIC. No side-effects should be run. Because an intention failed.

Feel free to copy this 👆 as a comment in the code.

}
await opFns[u[0]](u[1])
} catch (e) {
e[errorIndexSymbol] = i
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point where the error occurred.

return
}
const attributes = state[contractID].attributes
const attributes = state[contractID]?.attributes
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the ProfileCard not showing if the username isn't known.

@@ -49,6 +49,16 @@ console.info('CONTRACTS_VERSION:', process.env.CONTRACTS_VERSION)
console.info('LIGHTWEIGHT_CLIENT:', process.env.LIGHTWEIGHT_CLIENT)
console.info('NODE_ENV:', process.env.NODE_ENV)

if (process.env.CI) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add logging information on CI to get a better stack trace and identify missing error handling.

@@ -191,8 +187,9 @@ sbp('sbp/selectors/register', {
contractID,
contractName,
data: [
...keyShares.map((data) => ['chelonia/out/keyShare', { data: encryptedOutgoingData(contractID, CEKid, data) }]),
['chelonia/out/keyUpdate', { data: updatedKeys }]
...(keyShares[0] ?? []),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function updated to support arbitrary operations. This also makes this function generic and no longer GI-specific.

return [
undefined, // Nothing before OP_KEY_UPDATE
[
// Re-encrypt attributes with the new PEK
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 'the fix' for this issue, i.e., re-encrypting attributes.

Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary review based on call ready!

@@ -2046,17 +2066,22 @@ const handleEvent = {
return acc
}

const actionsOpV = ((msg: any): GIOpAtomic).reduce(reducer, [])
const actionsOpV = (Number.isFinite(errorIndex)
// $FlowFixMe[incompatible-call]
Copy link
Member

@taoeffect taoeffect Feb 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so based on the call here's how we've agreed conceptually this code should work:

There are two types of "errors" that we need to consider:

  1. "Ignoring" errors
  2. "Failure" errors

Example: OP_KEY_ADD

  1. IGNORING: an error is thrown because we wanted to add a key but the key we wanted to add is already there. This is not a hard error, it's an ignoring error. We don't care that the operation failed in this case because the intent was accomplished.
  2. FAILURE: an error is thrown because we wanted to add a key that doesn't exist.

Example: OP_ACTION_ENCRYPTED

  1. IGNORING: An error is thrown because we don't have the key to decrypt the action. We ignore it.
  2. FAILURE: An error is thrown by the process function during processing.

Handling these in OP_ATOMIC

  • ALL errors of class "IGNORING" should be ignored. They should not impact our ability to process the rest of the operations in the OP_ATOMIC. No matter how many of these are thrown, it doesn't affect the rest of the operations.
  • ANY error of class "FAILURE" will call the rest of the operations to fail and the state to be reverted to prior to the OP_ATOMIC. No side-effects should be run. Because an intention failed.

Feel free to copy this 👆 as a comment in the code.

Comment on lines +193 to +196
if (!state.chatRooms) {
// When creating a DM, we may not have the `.chatRooms` property
state.chatRooms = Object.create(null)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we not have the .chatRooms property? It's created by the constructor

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That requires the constructor being called. It won't be if the decryption key isn't available (for example, because it's been rotated).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment here explaining this? I think most people reading this code will have no idea about this.

}
await opFns[u[0]](u[1])
} catch (e) {
if (e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there an if (e) thing here...? In what situation can this not be defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw undefined, throw null, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it still make sense then to do something here, like log a console.error in an else { .. } branch?

@@ -801,7 +845,9 @@ export default (sbp('sbp/selectors/register', {
key: deserializeKey(decrypted),
transient
}]))
if (
if (missingDecryptionKeyIds?.includes(key.id)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here explaining what this is doing?

Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My review is ready!

Claude's review of internals.js

After reviewing the diff and the full file, here are a few potential issues and observations:

  1. In the new code for handling errors in OP_ATOMIC, there's a potential issue with the missingDecryptionKeyIds handling. The code checks if missingDecryptionKeyIds is a Set, but later in the file, it's treated as an array. This inconsistency could lead to runtime errors.
// In OP_ATOMIC:
const missingDecryptionKeyIds = missingDecryptionKeyIdsMap.get(message)
if (missingDecryptionKeyIds) {
  missingDecryptionKeyIds.add(e.cause)
} else {
  missingDecryptionKeyIdsMap.set(message, new Set([e.cause]))
}

// Later in applyProcessResult:
if (Array.isArray(missingDecryptionKeyIdsForMessage)) { // This expects an array, not a Set
  // ...
}
  1. The error handling in OP_ATOMIC might have an issue with the nested conditions:
if (e) {
  if (e.name === 'ChelErrorDecryptionKeyNotFound') {
    // ...
  } else {
    // ...
  }
  // ...
  if (e.name === 'ChelErrorWarning') continue
}
throw e

The throw e is outside the first if (e) block, which means it will always throw the error even if e is falsy. It should probably be inside the first if (e) block.

  1. There's a potential memory leak with the WeakMap usage. While WeakMaps are good for garbage collection, the code should ensure that all message objects are properly cleaned up when they're no longer needed. The code does include cleanup in the finally block, but it might be worth adding additional cleanup in error cases.

  2. In the new code that handles missing decryption key IDs, there's a duplicate height update:

if (Array.isArray(missingDecryptionKeyIdsForMessage)) {
  // ...
  missingDecryptionKeyIdsForMessage.forEach(keyId => {
    if (missingDecryptionKeyIds.includes(keyId)) return
    missingDecryptionKeyIds.push(keyId)
  })
  this.config.reactiveSet(state.contracts[contractID], 'height', height) // This is redundant
}

This height update is redundant because it's already being set earlier in the function:

this.config.reactiveSet(state.contracts[contractID], 'height', height)

These issues should be addressed to improve the robustness of the code.

Gemini 2.0 Flash Thinking's Review of internals.js

Screenshot 2025-02-05 at 2 10 57 PM

Comment on lines +318 to +329
it('user3 can see the username for userBot', () => {
cy.giLogin(`user3-${userId}`, { bypassUI: true })
assertMembersCount(3)

cy.getByDT('groupMembers').contains('userbot-')
cy.window().its('sbp').then(sbp => {
const rootState = sbp('state/vuex/state')
const userbotId = rootState.namespaceLookups[`userbot-${userId}`]
const PEKs = Object.values(rootState[userbotId]._vm.authorizedKeys).filter(key => key.name === 'pek')
if (PEKs.length < 2) throw new RangeError('Expected the PEK to have been rotated')
})
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure to show username for contract upon join
2 participants