
Commit 1d3c6f9

Google KMS support and audit_callback
1 parent 40d13fb commit 1d3c6f9

19 files changed: +465 -87 lines changed

CHANGELOG.md

+3
@@ -1,6 +1,9 @@
11
Changelog
22
=========
33

4+
## [v0.0.24] - 2024-05-15
5+
- Google KMS support and audit_callback
6+
47
## [v0.0.23] - 2024-04-26
58
- Google Firestore support
69

README.md

+35-9
@@ -52,6 +52,7 @@ For optional [client side](#client-side-encryption) field level envelope encrypt
5252
```
5353
pip install 'abnosql[aws-kms]'
5454
pip install 'abnosql[azure-kms]'
55+
pip install 'abnosql[gcp-kms]'
5556
```
5657

5758
By default, abnosql does not include database dependencies. This is to facilitate packaging
@@ -195,17 +196,27 @@ This works for AWS DyanmoDB & Firestore, however Azure Cosmos has a limitation w
195196

196197
## Audit
197198

198-
`put_item()` and `put_items()` take an optional `audit_user` kwarg. If supplied, absnosql will add the following to the item:
199+
Setting the table config attribute `audit_user` adds the following to the item being written to the database:
199200

200201
- `createdBy` - value of `audit_user`, added if does not exist in item supplied to put_item()
201202
- `createdDate` - UTC ISO timestamp string, added if does not exist
202203
- `modifiedBy` - value of `audit_user` always added
203204
- `modifiedDate` - UTC ISO timestamp string, always added
204205

205-
You can also specify `audit_user` as config attribute to table. If you prefer snake_case over CamelCase, you can set env var `ABNOSQL_CAMELCASE` = `FALSE`
206+
If snake_case is preferred over CamelCase, set the env var `ABNOSQL_CAMELCASE` = `FALSE`
206207

207208
NOTE: created* will only be added if `update` is not True in a `put_item()` operation
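
The audit rules above can be sketched as a small standalone function. This is only an illustration of the documented behaviour, not abnosql's implementation: `add_audit` and its `now` parameter are hypothetical names, and the exact timestamp format abnosql writes is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def add_audit(item: dict, audit_user: str, update: bool = False,
              now: Optional[datetime] = None) -> dict:
    """Hypothetical helper mirroring the audit rules described above."""
    ts = (now or datetime.now(timezone.utc)).isoformat()
    audited = dict(item)
    if not update:
        # created* are only added on create, and only if not already supplied
        audited.setdefault('createdBy', audit_user)
        audited.setdefault('createdDate', ts)
    # modified* are always (over)written
    audited['modifiedBy'] = audit_user
    audited['modifiedDate'] = ts
    return audited


created = add_audit({'hk': '1', 'rk': 'a'}, 'alice')
updated = add_audit({'hk': '1', 'rk': 'a'}, 'bob', update=True)
print(sorted(created))  # includes createdBy and createdDate
print(sorted(updated))  # only modifiedBy and modifiedDate are added
```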
208209

210+
The table config attribute `audit_callback`, set to a callback function, can be used to hook into additional audit stores.
211+
212+
Callback function must accept the following positional args:
213+
214+
- `table_name` - table name
215+
- `dt_iso` - ISO date timestamp
216+
- `operation` - `create`, `update`, `get` or `delete`
217+
- `key` - key of item serialised in <key>=<value>; format
218+
- `audit_user` - user performing the operation
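
A minimal sketch of a callback with the positional signature listed above. The in-memory `AUDIT_LOG` list stands in for a real audit store, and the sample `key` string `hk=1;rk=a` is an assumption about the serialised key format. abnosql would invoke the callback itself; it is called directly here only to show the shape of the arguments.

```python
import json

AUDIT_LOG = []  # stand-in for an external audit store (assumption)


def audit_callback(table_name, dt_iso, operation, key, audit_user):
    """Record one audit event; arguments arrive positionally."""
    AUDIT_LOG.append(json.dumps({
        'table': table_name,
        'date': dt_iso,
        'operation': operation,
        'key': key,
        'user': audit_user,
    }))


# table config as described in the README text above
config = {'audit_user': 'alice', 'audit_callback': audit_callback}

# direct call for illustration only; the sample values are made up
config['audit_callback'](
    'mytable', '2024-05-15T00:00:00+00:00', 'create', 'hk=1;rk=a', 'alice'
)
print(AUDIT_LOG[0])
```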
219+
209220
## Change Feed / Stream Support
210221

211222
**AWS DynamoDB** [Streams](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html) allow Lambda functions to be triggered upon create, update and delete table operations. The event sent to the lambda (see [aws docs](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.Tutorial2.html)) contains `eventName` and `eventSourceARN`, where:
@@ -240,14 +251,15 @@ To write an Azure Function / AWS Lambda that is able to process both DynamoDB an
240251

241252
## Client Side Encryption
242253

243-
If configured in table config with `kms` attribute, abnosql will perform client side encryption using AWS KMS or Azure KeyVault
254+
If configured in table config with `kms` attribute, abnosql will perform client side encryption using AWS KMS, Azure KeyVault or Google KMS
244255

245256
Each attribute value defined in the config is encrypted with a 256-bit AES-GCM data key generated for each attribute value:
246257

247258
- `aws` uses [AWS Encryption SDK for Python](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/python.html)
248-
- `azure` uses [python cryptography](https://cryptography.io/en/latest/hazmat/primitives/aead/#cryptography.hazmat.primitives.ciphers.aead.AESGCM.generate_key) to generate AES-GCM data key, encrypt the attribute value and then uses an RSA CMK in Azure Keyvault to wrap/unwrap (envelope encryption) the AES-GCM data key. The module uses the [azure-keyvaults-keys](https://learn.microsoft.com/en-us/python/api/overview/azure/keyvault-keys-readme?view=azure-python) python SDK for wrap/unrap functionality of the generated data key (Azure doesnt support generate data key as AWS does)
259+
- `azure` uses [python cryptography](https://cryptography.io/en/latest/hazmat/primitives/aead/#cryptography.hazmat.primitives.ciphers.aead.AESGCM.generate_key) to generate an AES-GCM data key, encrypts the attribute value and then uses an RSA CMK in Azure Key Vault to wrap/unwrap (envelope encryption) the AES-GCM data key. The plugin uses the [azure-keyvault-keys](https://learn.microsoft.com/en-us/python/api/overview/azure/keyvault-keys-readme?view=azure-python) python SDK for wrap/unwrap functionality of the generated data key (Azure doesn't support generating a data key as AWS does - see also [tink issue](https://github.com/tink-crypto/tink/issues/158#issuecomment-1382589658))
260+
- `gcp` uses [Google Tink](https://developers.google.com/tink/client-side-encryption)
249261

250-
Both providers use a [256-bit AES-GCM](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/supported-algorithms.html) generated data key with AAD/encryption context (Azure provider uses a 96-nonce). AES-GCM is an Authenticated symmetric encryption scheme used by both AWS and Azure (and [Hashicorp Vault](https://developer.hashicorp.com/vault/docs/secrets/transit#aes256-gcm96))
262+
All providers use a [256-bit AES-GCM](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/supported-algorithms.html) generated data key with AAD/encryption context (Azure provider uses a 96-bit nonce). AES-GCM is an authenticated symmetric encryption scheme used by AWS, Azure & Google (and [Hashicorp Vault](https://developer.hashicorp.com/vault/docs/secrets/transit#aes256-gcm96))
251263

252264
See also [AWS Encryption Best Practices](https://docs.aws.amazon.com/prescriptive-guidance/latest/encryption-best-practices/welcome.html)
253265

@@ -256,15 +268,23 @@ Example config:
256268
```
257269
{
258270
'kms': {
271+
# Azure example
259272
'key_ids': ['https://foo.vault.azure.net/keys/bar/45e36a1024a04062bd489db0d9004d09'],
273+
274+
# AWS example
275+
# 'key_ids': ['arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab'],
276+
277+
# Google Example
278+
# 'key_ids': ['gcp-kms://projects/p1/locations/global/keyRings/kr1/cryptoKeys/ck1'],
279+
260280
'key_attrs': ['hk', 'rk'],
261281
'attrs': ['obj', 'str']
262282
}
263283
}
264284
```
265285

266286
Where:
267-
- `key_ids`: list of AWS KMS Key ARNs or Azure KeyVault identifier (URL to RSA CMK). This is picked up via `ABNOSQL_KMS_KEYS` env var as a comma separated list (*NOTE: env var recommended to avoid provider specific code*)
287+
- `key_ids`: list of AWS KMS Key ARNs, Azure KeyVault identifier (URL to RSA CMK) or Google KMS URI. This is picked up via `ABNOSQL_KMS_KEYS` env var as a comma separated list (*NOTE: env var recommended to avoid provider specific code*)
268288
- `key_attrs`: list of key attributes in the item from which the AAD/encryption context is set. Taken from `ABNOSQL_KEY_ATTRS` env var or table `key_attrs` if defined there
269289
- `attrs`: list of attributes keys to encrypt
270290
- `key_bytes`: optional for azure, use your own AESGCM key if specified, otherwise generate one
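
As a rough illustration of the env var route for `key_ids`, the sketch below parses `ABNOSQL_KMS_KEYS` as a comma separated list and falls back to it when the config has no `key_ids`. `keys_from_env` is a hypothetical helper written for this example; the plugins in this commit import `get_keys` from `abnosql.kms`, whose body is not shown here.

```python
import os
from typing import List, Optional


def keys_from_env(name: str = 'ABNOSQL_KMS_KEYS') -> Optional[List[str]]:
    """Hypothetical parser: comma separated env var -> list of key ids."""
    raw = os.environ.get(name, '')
    keys = [k.strip() for k in raw.split(',') if k.strip()]
    return keys or None


# provider-specific key id lives in the environment, not in code
os.environ['ABNOSQL_KMS_KEYS'] = (
    'arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab'
)

# config stays provider neutral; key_ids falls back to the env var
kms_config = {'key_attrs': ['hk', 'rk'], 'attrs': ['obj', 'str']}
key_ids = kms_config.get('key_ids', keys_from_env())
print(key_ids)
```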
@@ -383,6 +403,7 @@ abnosql uses pluggy and registers in the `abnosql.table` namespace
383403
The following hooks are available
384404

385405
- `set_config` - set config
406+
- `get_item_pre`
386407
- `get_item_post` - called after `get_item()`, can return modified data
387408
- `put_item_pre`
388409
- `put_item_post`
@@ -434,20 +455,25 @@ More examples in [tests/test_cosmos.py](./tests/test_cosmos.py)
434455

435456
## Google Firestore
436457

437-
Use [python-mock-firestore](https://github.com/mdowds/python-mock-firestore) and pass `MockFirestore()` to table config as `client` attribute
458+
Use [python-mock-firestore](https://github.com/mdowds/python-mock-firestore) and pass `MockFirestore()` to table config as `client` attribute, or patch get_client()
438459

439460
Example:
440461

441462
```
463+
from unittest.mock import patch
442464
from mockfirestore import MockFirestore
465+
from abnosql.plugins.table.firestore import Table as FirestoreTable
443466
444467
468+
@patch.object(FirestoreTable, 'get_client', MockFirestore)
445469
def test_something():
446-
tb = table('mytable', {'client': MockFirestore()})
470+
tb = table('mytable', {})
447471
item = tb.get_item(foo='bar')
448472
449473
```
450474

475+
More examples in [tests/test_firestore.py](./tests/test_firestore.py)
476+
451477
# CLI
452478

453479
Small abnosql CLI installed with few of the commands above
@@ -489,7 +515,7 @@ p2 p2.2 5 {'foo': 'bar', 'num': 5, 'list': [1, 2, 3]} [1, 2, 3]
489515
- [x] client side encryption
490516
- [x] test pagination & exception handling
491517
- [x] [Google Firestore](https://cloud.google.com/python/docs/reference/firestore/latest) support, ideally in the core library (though could be added outside via use of the plugin system). Would need something like [FireSQL](https://firebaseopensource.com/projects/jsayol/firesql/) implemented for python, maybe via sqlglot
492-
- [ ] [Google Vault](https://cloud.google.com/python/docs/reference/cloudkms/latest/) KMS support
518+
- [x] [Google Vault](https://cloud.google.com/python/docs/reference/cloudkms/latest/) KMS support
493519
- [ ] [Hashicorp Vault](https://github.com/hashicorp/vault-examples/blob/main/examples/_quick-start/python/example.py) KMS support
494520
- [ ] Simple caching (maybe) using globals (used for AWS Lambda / Azure Functions)
495521
- [ ] PostgresSQL support using JSONB column (see [here](https://medium.com/geekculture/json-and-postgresql-using-json-to-mimic-nosqls-storage-benefits-1564c69f61fc) for example). Would be nice to avoid an ORM and having to define a model for each table...

abnosql/plugins/kms/aws.py

+6-5
@@ -32,14 +32,14 @@ def wrapper(*args, **kwargs):
3232
except ClientError as e:
3333
code = e.response['Error']['Code']
3434
if raise_not_found and code in ['ResourceNotFoundException']:
35-
raise ex.NotFoundException(e) from None
35+
raise ex.NotFoundException(detail=e) from None
3636
elif code == 'UnrecognizedClientException':
37-
raise ex.ConfigException(e) from None
38-
raise ex.ValidationException(e) from None
37+
raise ex.ConfigException(detail=e) from None
38+
raise ex.ValidationException(detail=e) from None
3939
except NoCredentialsError as e:
40-
raise ex.ConfigException(e) from None
40+
raise ex.ConfigException(detail=e) from None
4141
except Exception as e:
42-
raise ex.PluginException(e)
42+
raise ex.PluginException(detail=e)
4343
return wrapper
4444
return decorator
4545

@@ -51,6 +51,7 @@ def __init__(
5151
) -> None:
5252
self.pm = pm
5353
self.config = config or {}
54+
self.provider = 'aws'
5455
self.session = self.config.get(
5556
'session', Session()
5657
)
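
The switch to `detail=e` in the handler above can be illustrated with a self-contained sketch of the same decorator pattern. The exception classes below are hypothetical stand-ins for `abnosql.exceptions` (their real constructor signature is not visible in this diff; a leading `title` positional argument is assumed), and `KeyError` plays the role of the provider's not-found error.

```python
import functools


class PluginException(Exception):
    """Hypothetical stand-in; assumes the first positional arg is a title."""

    def __init__(self, title=None, detail=None):
        self.title = title
        self.detail = detail
        super().__init__(str(detail) if detail is not None else title)


class NotFoundException(PluginException):
    pass


def ex_handler(raise_not_found=True):
    """Translate low-level errors into plugin exceptions."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except KeyError as e:
                if raise_not_found:
                    # keyword arg keeps the error out of the title slot
                    raise NotFoundException(detail=e) from None
                return None
            except Exception as e:
                raise PluginException(detail=e)
        return wrapper
    return decorator


@ex_handler()
def lookup(data, key):
    return data[key]


try:
    lookup({}, 'missing')
except NotFoundException as e:
    print(type(e.detail).__name__)
```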

abnosql/plugins/kms/azure.py

+41-25
@@ -32,10 +32,10 @@ def wrapper(*args, **kwargs):
3232
return func(*args, **kwargs)
3333
except azex.ResourceNotFoundError as e:
3434
if raise_not_found:
35-
raise ex.NotFoundException(e.message) from None
35+
raise ex.NotFoundException(detail=e.message) from None
3636
return None
3737
except Exception as e:
38-
raise ex.PluginException(e)
38+
raise ex.PluginException(detail=e)
3939
return wrapper
4040
return decorator
4141

@@ -47,6 +47,7 @@ def __init__(
4747
) -> None:
4848
self.pm = pm
4949
self.config = config or {}
50+
self.provider = 'azure'
5051
key_ids = self.config.get('key_ids', get_keys())
5152
if not isinstance(key_ids, list) or len(key_ids) == 0:
5253
raise ex.ConfigException('kms key_ids required')
@@ -56,6 +57,9 @@ def __init__(
5657
'credential', DefaultAzureCredential()
5758
)
5859
)
60+
self.pack_bytes_maxlen = self.config.get(
61+
'pack_bytes_maxlen', 10000
62+
)
5963

6064
@kms_ex_handler()
6165
def encrypt(
@@ -64,46 +68,58 @@ def encrypt(
6468
# azure doesn't have a GenerateDataKey equivalent
6569
# as AWS does, and its encrypt/decrypt APIs
6670
# are only for use against CMKs not data keys
67-
# so we must do our own AESGCM key to encrypt/decrypt
68-
# the plaintext and then use azure to wrap/unwrap
69-
# this with CMK.
70-
# This follows similar pattern to aws-encryption-sdk
71+
# so generate own AESGCM DEK to encrypt/decrypt
72+
# the plaintext and then use azure to wrap/unwrap this with CMK.
73+
# This follows similar pattern to aws-encryption-sdk and google tink
7174
# and the wrapped/encrypted AES key lives with the data
75+
# see https://developers.google.com/tink/client-side-encryption
76+
# https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/concepts.html # noqa
7277
context = dict(sorted(context.items()))
7378
aad = json.dumps(context).encode()
79+
80+
# 1) generate random Data Encryption Key (DEK)
7481
# 256-bit AES-GCM key; NOTE os.urandom() takes a byte count, so
# urandom(96) below yields 96 bytes (a 96-bit nonce would be urandom(12))
7582
nonce = os.urandom(96)
76-
key = key or AESGCM.generate_key(bit_length=256)
77-
aesgcm = AESGCM(key)
78-
# encrypt the key using Azure Key Vault CMK
79-
enc_key = self.crypto_client.wrap_key(
80-
KeyWrapAlgorithm.rsa_oaep_256, key
83+
dek = key or AESGCM.generate_key(bit_length=256)
84+
dek_aesgcm = AESGCM(dek)
85+
86+
# 2) The DEK is encrypted by a Key Encryption Key (KEK)
87+
# that is stored in a cloud KMS (Azure Key Vault CMK)
88+
enc_dek = self.crypto_client.wrap_key(
89+
KeyWrapAlgorithm.rsa_oaep_256, dek
8190
).encrypted_key
82-
# delete unencrypted key from memory asap
83-
del key
84-
# encrypt
85-
ct = aesgcm.encrypt(nonce, plaintext.encode(), aad)
86-
del aesgcm
87-
# byte packing is smaller than json and what aws-encryption-sdk does
91+
del dek # delete unencrypted DEK from memory asap
92+
93+
# 3) Data is encrypted using the DEK by the client.
94+
ct = dek_aesgcm.encrypt(nonce, plaintext.encode(), aad)
95+
del dek_aesgcm
96+
97+
# 4) Concatenates the KEK-encrypted encryption DEK with the encrypted
98+
# data (byte packing is what aws-encryption-sdk and google tink do)
8899
serialized = b64encode(
89-
pack_bytes([ct, nonce, enc_key], 10000)
100+
pack_bytes([ct, nonce, enc_dek], self.pack_bytes_maxlen)
90101
).decode()
91102
return serialized
92103

93104
@kms_ex_handler()
94105
def decrypt(self, serialized: str, context: t.Dict) -> str:
95106
context = dict(sorted(context.items()))
96107
aad = json.dumps(context).encode()
108+
109+
# 1) Extracts the KEK-encrypted DEK key.
97110
unpacked = unpack_bytes(b64decode(serialized.encode()))
98111
if len(unpacked) != 3:
99112
raise ValueError('invalid serialization')
100-
(ct, nonce, enc_key) = unpacked
113+
(ct, nonce, enc_dek) = unpacked
114+
115+
# 2) Makes a request to your KMS to decrypt the KEK-encrypted DEK.
101116
# decrypt the key using Azure Key Vault CMK
102-
key = self.crypto_client.unwrap_key(
103-
KeyWrapAlgorithm.rsa_oaep_256, enc_key # obj['key']
117+
dek = self.crypto_client.unwrap_key(
118+
KeyWrapAlgorithm.rsa_oaep_256, enc_dek
104119
).key
105-
aesgcm = AESGCM(key)
106-
del key
107-
plaintext = aesgcm.decrypt(nonce, ct, aad).decode()
108-
# plaintext = aesgcm.decrypt(obj['nonce'], obj['ct'], aad).decode()
120+
121+
# 3) Decrypts the ciphertext locally using the DEK.
122+
dek_aesgcm = AESGCM(dek)
123+
del dek
124+
plaintext = dek_aesgcm.decrypt(nonce, ct, aad).decode()
109125
return plaintext
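
Step 4 of `encrypt()` and step 1 of `decrypt()` rely on `pack_bytes` / `unpack_bytes`, which are not part of this diff. The sketch below shows one way such length-prefixed packing could work, assuming a 4-byte big-endian length before each part and a per-part `maxlen` limit. It is not abnosql's actual wire format.

```python
from base64 import b64decode, b64encode
import struct
from typing import List


def pack_bytes(parts: List[bytes], maxlen: int = 10000) -> bytes:
    """Concatenate parts, each prefixed with its 4-byte big-endian length."""
    out = bytearray()
    for part in parts:
        if len(part) > maxlen:
            raise ValueError('part exceeds maxlen')
        out += struct.pack('>I', len(part)) + part
    return bytes(out)


def unpack_bytes(packed: bytes) -> List[bytes]:
    """Inverse of pack_bytes; raises ValueError on truncated input."""
    parts, offset = [], 0
    while offset < len(packed):
        if offset + 4 > len(packed):
            raise ValueError('truncated length prefix')
        (length,) = struct.unpack_from('>I', packed, offset)
        offset += 4
        if offset + length > len(packed):
            raise ValueError('truncated part')
        parts.append(packed[offset:offset + length])
        offset += length
    return parts


# placeholder byte strings standing in for ciphertext, nonce and wrapped DEK
serialized = b64encode(pack_bytes([b'ct', b'nonce', b'enc_dek'])).decode()
ct, nonce, enc_dek = unpack_bytes(b64decode(serialized.encode()))
print(ct, nonce, enc_dek)
```

Keeping the wrapped DEK and nonce next to the ciphertext means each stored attribute value carries everything except the KEK needed to decrypt it.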

abnosql/plugins/kms/gcp.py

+85
@@ -0,0 +1,85 @@
1+
from base64 import b64decode
2+
from base64 import b64encode
3+
import functools
4+
import json
5+
import os
6+
import typing as t
7+
8+
import abnosql.exceptions as ex
9+
from abnosql.kms import get_keys
10+
from abnosql.kms import KmsBase
11+
from abnosql.plugin import PM
12+
13+
try:
14+
from tink import aead # type: ignore
15+
from tink import core # type: ignore
16+
from tink.integration import gcpkms # type: ignore
17+
from tink import new_keyset_handle # type: ignore
18+
except ImportError:
19+
MISSING_DEPS = True
20+
21+
22+
def kms_ex_handler(raise_not_found: t.Optional[bool] = True):
23+
def decorator(func):
24+
@functools.wraps(func)
25+
def wrapper(*args, **kwargs):
26+
try:
27+
return func(*args, **kwargs)
28+
except core.TinkError as e:
29+
raise ex.ConfigException(detail=str(e))
30+
except Exception as e:
31+
raise ex.PluginException(detail=e)
32+
return wrapper
33+
return decorator
34+
35+
36+
def mock_remote_aead(*args, **kwargs):
37+
# used for patching during tests
38+
# see https://github.com/tink-crypto/tink-py/blob/main/tink/aead/_kms_envelope_aead_test.py # noqa
39+
keyset_handle = new_keyset_handle(aead.aead_key_templates.AES256_GCM)
40+
return keyset_handle.primitive(aead.Aead)
41+
42+
43+
class Kms(KmsBase):
44+
45+
@kms_ex_handler()
46+
def __init__(
47+
self, pm: PM, config: t.Optional[dict] = None
48+
) -> None:
49+
self.pm = pm
50+
self.config = config or {}
51+
self.provider = 'gcp'
52+
53+
self.key_ids = self.config.get('key_ids', get_keys())
54+
if not isinstance(self.key_ids, list) or len(self.key_ids) == 0:
55+
raise ex.ConfigException('kms key_ids required')
56+
self.kek_uri = self.key_ids[0]
57+
self.credentials = self.config.get(
58+
'credentials', os.environ.get('GOOGLE_APPLICATION_CREDENTIALS')
59+
)
60+
# see https://developers.google.com/tink/client-side-encryption
61+
aead.register()
62+
self.client = gcpkms.GcpKmsClient(
63+
self.kek_uri,
64+
self.credentials
65+
)
66+
remote_aead = self.client.get_aead(self.kek_uri)
67+
self.env_aead = aead.KmsEnvelopeAead(
68+
aead.aead_key_templates.AES256_GCM, remote_aead
69+
)
70+
71+
@kms_ex_handler()
72+
def encrypt(
73+
self, plaintext: str, context: t.Dict, key: t.Optional[bytes] = None
74+
) -> str:
75+
ciphertext = self.env_aead.encrypt(
76+
plaintext.encode(), json.dumps(context).encode()
77+
)
78+
return b64encode(ciphertext).decode()
79+
80+
@kms_ex_handler()
81+
def decrypt(self, serialized: str, context: t.Dict) -> str:
82+
plaintext = self.env_aead.decrypt(
83+
b64decode(serialized), json.dumps(context).encode()
84+
)
85+
return plaintext.decode()
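
Both plugins pass the JSON-encoded context as AAD, and AEAD decryption only succeeds if the AAD bytes match those used at encryption time. The Azure plugin sorts the context before serialising; the `gcp` code above serialises it as given. Whether callers always build the context in the same key order is not visible in this diff, so the sketch below only illustrates why ordering matters. `aad_sorted` and `aad_as_given` are hypothetical names.

```python
import json


def aad_sorted(context: dict) -> bytes:
    """AAD with keys sorted first, as in the Azure plugin."""
    return json.dumps(dict(sorted(context.items()))).encode()


def aad_as_given(context: dict) -> bytes:
    """AAD serialised in the dict's insertion order."""
    return json.dumps(context).encode()


a = {'hk': '1', 'rk': '2'}
b = {'rk': '2', 'hk': '1'}  # same content, different insertion order

print(aad_sorted(a) == aad_sorted(b))      # sorted form is order independent
print(aad_as_given(a) == aad_as_given(b))  # unsorted form depends on order
```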
