Release v0.5.3: Minor fixes for post-training losses and support for KTO Loss · linkedin/Liger-Kernel

What's Changed

Add ref_input parameter to support separate inputs for reference model by @xingyaoww in #467
Revert "Add ref_input parameter to support separate inputs for reference model" by @ByronHsu in #469
Add dynamic dependency management for CUDA and ROCm by @hebiao064 in #460
[CI] runtime pip install using uv by @ByronHsu in #471
modify ref_input in chunked_loss base class and fix tests by @shivam15s in #470
Add more post training in readme by @ByronHsu in #472
align post training loss at the center by @ByronHsu in #473
[Transformer] fix ORPO loss for MOE models by @kashif in #479
fix: correct typos in docstrings by @shivam15s in #482
fix chosen_nll_loss in chunked losses by @kashif in #486
Revert "fix chosen_nll_loss in chunked losses (#486)" by @shivam15s in #489
fix dpo tests: reduce tolerance and change default compute_nll_loss false by @shivam15s in #490
CPO & SimPO add label_smoothing by @Mecoli1219 in #493
Fix Preference Loss and Refactor for Readability by @austin362667 in #484
annotate tl constexpr values by @winglian in #497
Fix Rope Compatibility with Cos/Sin Position Embedding for Batch Size > 1 by @wizyoung in #477
Move the checkstyle to Ruff by @shivam15s in #483
Fix/liger fused linear cross entropy function does not support reduction=none by @ryankert01 in #496
Fix Dtype Mismatch in torch.addmm within ops/fused_linear_cross_entropy.py in AMP training by @DandinPower in #502
Add weight support for LigerCrossEntropy by @Tcc0403 in #420
Refactor Temperature Scaling in Distillation Loss by @austin362667 in #444
Fix All chunked_loss Benchmark Scripts by @austin362667 in #438
Set z_loss_1d=None when return_z_loss=False in cross_entropy_loss to avoid tl.store fail when triton_interpret=1 (for tl.device_print etc.) by @wa008 in #508
Add aux_outputs for CPO and SimPO by @Mecoli1219 in #492
Add average_log_prob args for cpo by @Mecoli1219 in #510
Refactor CrossEntropy and FusedLinearCrossEntropy by @Tcc0403 in #511
[ORPO] add nll_target for orpo nll loss by @kashif in #503
Format Benchmark Scripts with Ruff by @austin362667 in #516
[Tiny] Add QVQ to readme by @tyler-romero in #522
Add argument return_z_loss to flce by @Tcc0403 in #530
Remove extra print by @apaz-cli in #531
Fix HF transformers Breaking Changes by @austin362667 in #526
Handle cache_position for transformers 4.47.0 and later (#528) by @BenasdTW in #529
Create Docs for Liger-Kernel by @ParagEkbote in #485
Add Mkdocs related dependencies to setup.py by @hebiao064 in #534
Add KTO Loss by @hebiao064 in #475
[tests] use a valid hexadecimal string instead of a placeholder by @faaany in #535
[tests] skip failed tests for xpu by @faaany in #498
Format files by @austin362667 in #541
Fix Broken Links by @ParagEkbote in #547
[Fix] Fix the type hint of test_utils::concatenated_forward by @hongpeng-guo in #549
Add JSD Loss for Distillation by @austin362667 in #425
[DPO] add reference log-prob outputs in DPO by @kashif in #521
Fix DPO unit test fail and refactor by @Tcc0403 in #554

New Contributors

@xingyaoww made their first contribution in #467
@kashif made their first contribution in #479
@Mecoli1219 made their first contribution in #493
@winglian made their first contribution in #497
@DandinPower made their first contribution in #502
@wa008 made their first contribution in #508
@apaz-cli made their first contribution in #531
@BenasdTW made their first contribution in #529
@ParagEkbote made their first contribution in #485

Full Changelog: v0.5.2...v0.5.3
