- Escape special characters in rename operation (#297)
- No breaking changes.
- Fix Rust build (#275)
- Update user agent for DCP (#276)
- Address Rust security issue (#279)
- Refactor(benchmarks): Overhaul Lightning Checkpointing, DCP, dataset scenarios; add DynamoDB writes and results exploitation notebook (#274, #280, #285, #286)
- Add single rank PyTorch checkpoint benchmark (#289)
- Update torch version restriction (<2.5.0) and bind torchdata to last version with DataPipes (#283)
- No breaking changes.
- Add support for PyTorch distributed checkpoints (#269)
- Extend benchmark framework to support distributed checkpoints (#269)
- Add support for distributed training to S3IterableDataset (#269)
- No breaking changes.
- Add support for CRT retries (awslabs/mountpoint-s3#1069).
- Add support for `CopyObject` API (#242).
- No breaking changes.
- Add support for PyTorch Lightning checkpoints to the benchmark suite (#226).
- Fix potential race condition while instantiating the `S3Client` (#237).
- No breaking changes.
- Enhanced error logging.
- Support `tell` for `S3Writer`.
- Path-style addressing support.
- Update crates and Mountpoint dependencies.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- No breaking changes.
- Update `S3ClientConfig` to pass in the configuration for allowing unsigned requests, under boolean flag `unsigned`.
- Improve the performance of `S3Reader` when utilized with `pytorch.load` by incorporating support for the `readinto` method.
- [Experimental] Add support for passing an optional custom endpoint to `S3LightningCheckpoint` constructor method.
- No breaking changes.
- Expose a new class, `S3ClientConfig`, with `throughput_target_gbps` and `part_size` parameters of the inner S3 client.
- No breaking changes.
- Completely separate Rust logs from Python logs. Logs from Rust components used for debugging purposes are configured through the following environment variables: `S3_TORCH_CONNECTOR_DEBUG_LOGS`, `S3_TORCH_CONNECTOR_LOGS_DIR_PATH`.
- Add PyTorch Lightning checkpoints support
- Fix deadlock when enabling CRT debug logs. Removed former experimental method _enable_debug_logging().
- Refactor User-Agent setup for extensibility.
- Update lightning User-Agent prefix to `s3torchconnector/{__version__} (lightning; {lightning.__version__}`.
- No breaking changes.
- Support for Python 3.12.
- Additional logging when constructing Datasets, and when making requests to S3.
- Provide tooling for running benchmarks for S3 Connector for PyTorch.
- Update crates and Mountpoint dependencies.
- [Experimental] Allow passing in the S3 endpoint URL to Dataset constructors.
- HeadObject is no longer called when constructing datasets with `from_prefix` and seeking relative to end of file.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- Expose a logging method for enabling debug logs of the inner dependencies.
- No breaking changes.
- Update crates and Mountpoint dependencies.
- Avoid excessive memory consumption when utilizing `S3MapDataset`. Issue #89.
- Run all tests against S3 and S3 Express.
- No breaking changes.
- The Amazon S3 Connector for PyTorch now supports S3 Express One Zone directory buckets.
- No breaking changes.
- The Amazon S3 Connector for PyTorch delivers high throughput for PyTorch training jobs that access and store data in Amazon S3.
- `S3IterableDataset` and `S3MapDataset`, which allow building either an iterable-style or map-style dataset, using your S3 stored data, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is in.
- Support for multiprocess data loading for the above datasets.
- `S3Checkpoint`, an interface for saving and loading model checkpoints directly to and from an S3 bucket.
- No breaking changes.