Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure decrypted internal keys never swapped #185

Draft
wants to merge 2 commits into
base: TDE_REL_17_STABLE
Choose a base branch
from

Conversation

dAdAbird
Copy link
Member

@dAdAbird dAdAbird commented Apr 2, 2025

We have already had a locked memory for internal keys. But we read and created keys into usually palloced memory before copying them to the locked memory.

This commit changes the approach. Now, the user must allocate (acquire) a locked memory first and use it as a buffer for decryption/creating the key. This locked memory is kinda generic, so the user requests an amount of bytes (rather than objects). Although currently, during the key search, we assume that everything the memory contains only RelKeyCacheRec objects. With two exceptions: 1) server creates its own locked page to store the WAL write key;
2) pg_tde_perform_rotate_key cheekily allocates a space there for the keys re-encryption but releases this memory when it's done.

In future, if we switch to the full _map files encryption, we can gulp (mmap) the whole file into that memory.

Fixes PG-1445

@dAdAbird dAdAbird requested a review from dutow as a code owner April 2, 2025 15:05
@dAdAbird dAdAbird requested review from jeltz and dutow and removed request for dutow April 2, 2025 15:05
@dAdAbird dAdAbird force-pushed the no_keys_swapped branch 2 times, most recently from d115b2e to 023928c Compare April 2, 2025 16:00
Copy link
Collaborator

@jeltz jeltz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not the biggest fan of this approach of heavy interaction between the cache and storage. Especially if we want to change from linear cache to hash map for example. But it is not like I have a better suggestion which does not involve writing our own allocator.

Plus I think the keys in TDESMgrRelationData still can be swapped to disk.

*/
if (key)
{
long pageSize = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to have a default value if we are always assigning it a couple of lines below.

pageSize = getpagesize();
#else
pageSize = sysconf(_SC_PAGESIZE);
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL itself always calls sysconf(_SC_PAGESIZE); so it is apparently guaranteed to be supported on all platofrms except Windows. Do we need the fallback to getpagesize();?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, we should drop getpagesize(). I was playing safe but it is a legacy call and was removed in POSIX.1-2001. Doesn't make sense to have it

@@ -222,15 +241,15 @@ tdeheap_xlog_seg_write(int fd, const void *buf, size_t count, off_t offset,
*
* This func called with WALWriteLock held, so no need in any extra sync.
*/
if (EncryptionKey.rel_type & TDE_KEY_TYPE_GLOBAL &&
if (EncryptionKey && EncryptionKey->rel_type & TDE_KEY_TYPE_GLOBAL &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change can't we remove the check for rel_type & TDE_KEY_TYPE_GLOBAL since all keys will have that flag. The reaosn we needed it was to handle if EncryptionKey had not been set.

DECLARE idx integer;
begin
for idx in 0..700 loop
EXECUTE format('CREATE TABLE t%s (c1 int) USING tde_heap', idx);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it matters in tests but in real production code I would have done.

EXECUTE format('CREATE TABLE %I (c1 int) USING tde_heap', 't' || idx);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should I change it? I took it from the cache_alloc regression test. If to change, that it make sense to fix the regression test as well

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you. I just wanted to share how real production code looks like.


LWLockAcquire(tde_lwlock_enc_keys(), LW_SHARED);

principal_key = GetPrincipalKey(newrlocator->dbOid, LW_SHARED);
if (principal_key == NULL)
{
ereport(ERROR,
(errmsg("principal key not configured"),
(errmsg("principal key is not configured"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this should be fixed it should be fixed in all places and as a separate commit.

}

tde_release_safe_mem(sizeof(InternalKey));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is quite dangerous, maybe we should actually pass the pointer to assert that we actually release from the end.

MemoryContext oldCtx;
int needPages = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to have a default here.

MemoryContext oldCtx;
int needPages = 0;
unsigned char *newPages = NULL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same hjere.


tde_rel_key_cache.len = 0;
tde_rel_key_cache.cap = (pageSize - 1) / sizeof(RelKeyCacheRec);
goto RETURN;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL uses lowercase for labels.

We have already had a locked memory for internal keys. But we read and
created keys into usually palloced memory before copying them to the
locked memory.

This commit changes the approach. Now, the user must allocate (acquire)
a locked memory first and use it as a buffer for decryption/creating
the key. This locked memory is kinda generic, so the user requests an
amount of bytes (rather than objects). Although currently, during the
key search, we assume that everything the memory contains only
`RelKeyCacheRec` objects. With two exceptions: 1) server creates its
own locked page to store the WAL write key;
2) `pg_tde_perform_rotate_key` cheekily allocates a space there for
the keys re-encryption but releases this memory when it's done.

In future, if we switch to the full _map files encryption, we can gulp
(mmap) the whole file into that memory.

Fixes PG-1445
@dAdAbird dAdAbird requested a review from jeltz April 7, 2025 14:56
@dAdAbird dAdAbird marked this pull request as draft April 8, 2025 11:59
Smgr has its own cache, TDESMgrRelation, which stores the state and other. We use it to store the Internal key. Although we have to store a pointer to the key in the secure memory rather than a copy of the key. But when secure memory grows, it moves the object, therefore, the smgr pointer becomes invalid. Moreover, pointers become invalid even w/o growth. The simplest scenario:
```
CREATE TEMP TABLE articles (
    id int CONSTRAINT articles_pkey PRIMARY KEY,
    keywords text,
    title text UNIQUE NOT NULL,
    body text UNIQUE,
    created date
);

SELECT id, keywords, title, body, created FROM articles GROUP BY id;
ERROR:  invalid page in block 0 of relation base/16384/t0_32940
```

This commit removes the internal key pointer from smgr and replaces it with a GetSMGRRelationKey call. By the time of this call, the key would be in a cache. Though this will have performance drawbacks and should be benchmarked and re-evaluated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants