This repository has been archived by the owner on Jul 18, 2024. It is now read-only.
Highlights
This release introduces three new capabilities: RecDP-AutoFE, RecDP-LLM, and DeltaTuner.
- RecDP-AutoFE provides automatic feature engineering that generates new features for any tabular dataset. It has been shown to achieve accuracy competitive with, and in some cases better than, a data scientist's solution.
- RecDP-LLM is a one-stop solution for LLM data preparation. It provides a parallel data pipeline, built on Ray and Spark, for pretraining data cleaning; RAG text extraction, splitting, and indexing; and fine-tuning data quality evaluation and enhancement.
- DeltaTuner is an extension for Peft that improves LLM fine-tuning speed through multiple optimizations. These include using the compact model constructor denas to construct or modify the compact delta layers in a hardware-aware, train-free manner, and adding new delta-tuning algorithms.
This release provides the following major features:
Papers and Blogs
Versions and Components
- PyTorch >= 1.13.1
- Python 3.10
- Peft 0.4.0
- PySpark 3.4.1
- Ray 2.7.1
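
To compare a local environment against the component versions listed above, a small check along the following lines may help. This is a minimal sketch, not part of the release: the distribution names (`torch`, `peft`, `pyspark`, `ray`) are the usual PyPI names, the helper names are hypothetical, and treating every entry without `>=` as an exact match is an assumption, since the notes do not say whether those versions are strict pins or simply the tested versions.

```python
import sys
from importlib import metadata

# Versions taken from the list above. Reading entries without ">=" as exact
# matches is an assumption; the release notes do not state this explicitly.
REQUIRED = {
    "torch": ("1.13.1", ">="),
    "peft": ("0.4.0", "=="),
    "pyspark": ("3.4.1", "=="),
    "ray": ("2.7.1", "=="),
}


def parse(version):
    """Turn a string such as '2.0.1+cu118' into (2, 0, 1).

    Local build tags and pre-release suffixes are ignored, and the result
    is padded with zeros to at least three components.
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check(installed_version, wanted, op):
    """Return 'ok', 'mismatch', or 'missing' for one component."""
    if installed_version is None:
        return "missing"
    have, want = parse(installed_version), parse(wanted)
    matches = have >= want if op == ">=" else have == want
    return "ok" if matches else "mismatch"


def installed(name):
    """Look up the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(lookup=installed):
    """Check every listed component, plus the Python 3.10 interpreter."""
    results = {
        name: check(lookup(name), wanted, op)
        for name, (wanted, op) in REQUIRED.items()
    }
    results["python"] = "ok" if sys.version_info[:2] == (3, 10) else "mismatch"
    return results


if __name__ == "__main__":
    for name, status in report().items():
        print(f"{name}: {status}")
```

A `mismatch` here only means the installed version differs from the one listed for this release; it does not by itself indicate that the packages will fail to work.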
Links
- https://github.com/intel/e2eAIOK
- https://pypi.org/project/e2eAIOK-deltatuner/1.2.0/
- https://pypi.org/project/e2eAIOK-recdp/1.2.0/
Full Changelog: https://github.com/intel/e2eAIOK/commits/v1.2