What's Changed
- Add ref_input parameter to support separate inputs for reference model by @xingyaoww in #467
- Revert "Add ref_input parameter to support separate inputs for reference model" by @ByronHsu in #469
- Add dynamic dependency management for CUDA and ROCm by @hebiao064 in #460
- [CI] runtime pip install using uv by @ByronHsu in #471
- modify ref_input in chunked_loss base class and fix tests by @shivam15s in #470
- Add more post training in readme by @ByronHsu in #472
- align post training loss at the center by @ByronHsu in #473
- [Transformer] fix ORPO loss for MOE models by @kashif in #479
- fix: correct typos in docstrings by @shivam15s in #482
- fix chosen_nll_loss in chunked losses by @kashif in #486
- Revert "fix chosen_nll_loss in chunked losses (#486)" by @shivam15s in #489
- fix dpo tests: reduce tolerance and change default compute_nll_loss false by @shivam15s in #490
- CPO & SimPO add label_smoothing by @Mecoli1219 in #493
- Fix Preference Loss and Refactor for Readability by @austin362667 in #484
- annotate tl constexpr values by @winglian in #497
- Fix Rope Compatibility with Cos/Sin Position Embedding for Batch Size > 1 by @wizyoung in #477
- Move the checkstyle to Ruff by @shivam15s in #483
- Fix/liger fused linear cross entropy function does not support reduction=none by @ryankert01 in #496
- Fix Dtype Mismatch in torch.addmm within ops/fused_linear_cross_entropy.py in AMP training. by @DandinPower in #502
- Add weight support for LigerCrossEntropy by @Tcc0403 in #420
- Refactor Temperature Scaling in Distillation Loss by @austin362667 in #444
- Fix All chunked_loss Benchmark Scripts by @austin362667 in #438
- Set z_loss_1d=None when return_z_loss=False in cross_entropy_loss to avoid tl.store fail when triton_interpret=1 (for tl.device_print etc.) by @wa008 in #508
- Add aux_outputs for CPO and SimPO by @Mecoli1219 in #492
- Add average_log_prob args for cpo by @Mecoli1219 in #510
- Refactor CrossEntropy and FusedLinearCrossEntropy by @Tcc0403 in #511
- [ORPO] add nll_target for orpo nll loss by @kashif in #503
- Format Benchmark Scripts with Ruff by @austin362667 in #516
- [Tiny] Add QVQ to readme by @tyler-romero in #522
- Add argument return_z_loss to flce by @Tcc0403 in #530
- Remove extra print by @apaz-cli in #531
- Fix HF transformers Breaking Changes by @austin362667 in #526
- Handle cache_position for transformers 4.47.0 and later (#528) by @BenasdTW in #529
- Create Docs for Liger-Kernel by @ParagEkbote in #485
- Add Mkdocs related dependencies to setup.py by @hebiao064 in #534
- Add KTO Loss by @hebiao064 in #475
- [tests] use a valid hexadecimal string instead of a placeholder by @faaany in #535
- [tests] skip failed tests for xpu by @faaany in #498
- Format files by @austin362667 in #541
- Fix Broken Links by @ParagEkbote in #547
- [Fix] Fix the type hint of test_utils::concatenated_forward by @hongpeng-guo in #549
- Add JSD Loss for Distillation by @austin362667 in #425
- [DPO] add reference log-prob outputs in DPO by @kashif in #521
- Fix DPO unit test fail and refactor by @Tcc0403 in #554
New Contributors
- @xingyaoww made their first contribution in #467
- @kashif made their first contribution in #479
- @Mecoli1219 made their first contribution in #493
- @winglian made their first contribution in #497
- @DandinPower made their first contribution in #502
- @wa008 made their first contribution in #508
- @apaz-cli made their first contribution in #531
- @BenasdTW made their first contribution in #529
- @ParagEkbote made their first contribution in #485
Full Changelog: v0.5.2...v0.5.3