Releases: SciSharp/LLamaSharp
0.11.1 - LLaVA support
🎏 Major Changes
- LLaVA Support by @SignalRT in #556, #563, #609
- Chat session state management by @eublefar in #560
- Classifier Free Guidance by @martindevans in #536
- March Binary Update by @martindevans in #565
- `SetDllImportResolver` based loading by @martindevans in #603
📖 Documentation
- The documentation has been improved and is no longer outdated; see the LLamaSharp Documentation.
🔧 Bug Fixes
- Added conditional compilation code to progress_callback (in LlamaModelParams struct) by @clovisribeiro in #593
- Memory Disposal Tests by @martindevans in #551
- Fixed Publish File paths by @martindevans in #561
- llama_decode lock by @martindevans in #595
- BatchedExecutor Fixed Forking by @martindevans in #621
- Fixed off by one error in LLamaBatch sampling position by @martindevans in #626
- [LLama.KernelMemory] Fixed `System.ArgumentException: EmbeddingMode must be true` (#617) by @ChengYen-Tang in #615
- fix: missing `llava_shared` library. by @AsakusaRinne in #633
📌 Other Changes
- Removed `llama_eval()` by @martindevans in #553
- ChatSession: improve exception message by @swharden in #523
- Improve "embeddings" example by @swharden in #525
- Add path to find llama.dll for MAUI by @evolcano in #631
- LLama.Examples: improve model path prompt by @swharden in #526
- NativeLibraryConfig.WithLogs() overload to set log level by @swharden in #529
- LLamaSharp.Examples: Document Q&A with local storage by @swharden in #532
- Used `AnsiConsole` in a few more places by @martindevans in #534
- `ReadOnlySpan<float>` in ISamplingPipeline by @martindevans in #538
- KernelMemory update, adding the use of an already loaded model by @zsogitbe in #630
- Add Link To Blazor Demo by @alexhiggins732 in #539
- Removed Obsolete SamplingApi by @martindevans in #552
- update readme.md backends by @warquys in #587
- docs: update the example in readme. by @AsakusaRinne in #604
- Update Semantic Kernel & Kernel Memory Package by @xbotter in #612
- `BatchedExecutor.Create()` method by @martindevans in #613
- LLamaBatch Logit Tracking by @martindevans in #624
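Several items above concern native library configuration. As a sketch of the `WithLogs()` overload from #529, which accepts a severity threshold for native llama.cpp log output (the `LLamaLogLevel.Info` member shown is an assumption about the enum; check your installed version):

```csharp
using LLama.Native;

// Configure native logging before the first llama.cpp call loads the library.
// The #529 overload takes a log level rather than just toggling logs on.
NativeLibraryConfig.Instance.WithLogs(LLamaLogLevel.Info);
```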
🙌 New Contributors
- @swharden made their first contribution in #523
- @alexhiggins732 made their first contribution in #539
- @clovisribeiro made their first contribution in #593
- @warquys made their first contribution in #587
- @eublefar made their first contribution in #560
- @ChengYen-Tang made their first contribution in #615
- @evolcano made their first contribution in #631
Full Changelog: v0.10.0...0.11.0
0.10.0 - Phi2
Major Changes
- Update binaries feb 2024 by @martindevans in #479
- Add CLBLAST native library to native libraries build by @jasoncouture in #468
- Introduced a new `BatchedExecutor` by @martindevans in #503
- Swapped `StatelessExecutor` to use `llama_decode`! by @martindevans in #445
- LLamaToken Struct by @martindevans in #404
Bug Fixes
- KernelMemory EmbeddingMode bug correction by @zsogitbe in #485
- Normalize Embeddings by @martindevans in #507
- StreamingTextDecoder Fix & Tests by @martindevans in #428
- Tokenizer Fixes For Issue 430 by @martindevans in #433
Other Changes
- Use llama instead of libllama in `[DllImport]` by @jasoncouture in #465
- Updated Examples by @vikramvee in #502
- Added new file types to quantisation by @martindevans in #495
- Smaller Unit Test Model by @martindevans in #496
- Using `AddRange` in `LLamaEmbedder` by @martindevans in #499
- Small KV Cache Handling Improvements by @martindevans in #500
- Added increment and decrement operators to `LLamaPos` by @martindevans in #501
- Swapped `GetEmbeddings` to `llama_decode` by @martindevans in #474
- kv_cache_instance_methods by @martindevans in #454
- Removed `IModelParams` and `IContextParams` setters. by @martindevans in #472
- Managed `LLamaBatch` by @martindevans in #442
- Check Model Path Exists by @martindevans in #437
- Model Metadata Loading Cleanup by @martindevans in #438
- Added a check for EOS token in LLamaStatelessExecutor by @martindevans in #434
- Update README.md by @Oceania2018 in #427
- Gpu layer count change by @Kaotic3 in #424
- Improved exceptions in IModelParams for unknown KV override types. by @martindevans in #416
New Contributors
- @Kaotic3 made their first contribution in #424
- @Oceania2018 made their first contribution in #427
- @jasoncouture made their first contribution in #465
- @zsogitbe made their first contribution in #485
- @vikramvee made their first contribution in #502
Full Changelog: 0.9.1...v0.10.0
0.9.1 - Mixtral!
Major Changes
- Rebuilt ChatSession class by @philippjbauer in #344
- Custom Sampling Pipelines by @martindevans in #348
- Updated Binaries December 2023 by @martindevans in #361
- Added `LLamaWeights.Metadata` property by @martindevans in #380
Bug Fixes
- Fix documentation to reflect changes in ChatSession API by @asmirnov82 in #366
- Added missing field to LLamaModelQuantizeParams by @martindevans in #367
- Fix broken references in docs by @asmirnov82 in #378
- Updated & Fixed WebAPI by @scotmcc in #377
- Fixed loading of very large metadata values by @martindevans in #384
- Update compile.yml to fix not building for windows by @edgett in #386
- Metadata Fixes by @martindevans in #385
- Fix typos in SemanticKernel README file by @asmirnov82 in #408
Other Changes
- Context Set Seed by @martindevans in #368
- Update README.md by @martindevans in #335
- ci: fix error in auto-release. by @AsakusaRinne in #334
- Update README.md by @markvantilburg in #339
- 🔧 Refactor Semantic Kernel chat completion implementation by @xbotter in #341
- build(deps): bump xunit.runner.visualstudio from 2.5.4 to 2.5.5 by @dependabot in #353
- build(deps): bump xunit from 2.6.2 to 2.6.3 by @dependabot in #352
- Added AVX and AVX2 to MacOS x86_64 builds by @martindevans in #360
- Upgrade unittest target framework to .NET 8.0 by @xbotter in #358
- Clone Grammar by @martindevans in #370
- Renamed `llama_sample_temperature` to `llama_sample_temp` by @martindevans in #369
- Reset Custom Sampling Pipeline by @martindevans in #372
- Improved support for AVX512 by @martindevans in #373
- bump sk to 1.0.1 & km to 0.18 by @xbotter in #356
- build(deps): bump xunit from 2.6.3 to 2.6.4 by @dependabot in #389
- build(deps): bump xunit.runner.visualstudio from 2.5.5 to 2.5.6 by @dependabot in #391
- build(deps): bump Swashbuckle.AspNetCore from 6.4.0 to 6.5.0 by @dependabot in #388
- build(deps): bump Microsoft.KernelMemory.Abstractions from 0.18.231209.1-preview to 0.24.231228.5 by @dependabot in #397
- build(deps): bump Microsoft.KernelMemory.Core and Microsoft.KernelMemory.Abstractions by @dependabot in #396
- Code cleanup driven by R# suggestions by @martindevans in #400
- Removed some unnecessary uses of `unsafe` by @martindevans in #401
- Safer Model Handle Creation by @martindevans in #402
- Extra ModelParams Checking by @martindevans in #403
New Contributors
- @markvantilburg made their first contribution in #339
- @asmirnov82 made their first contribution in #366
- @scotmcc made their first contribution in #377
- @edgett made their first contribution in #386
Thank you so much for all the contributions! 😻
Full Changelog: v0.8.1...0.9.1
v0.8.1 - Major BUG fix and better feature detection
Breaking changes
- Changed `NativeLibraryConfig.Default` to `NativeLibraryConfig.Instance`.
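For anyone upgrading, only the singleton accessor is renamed; a minimal before/after sketch (the `WithCuda` call is illustrative and assumes that configuration method is available in your version):

```csharp
using LLama.Native;

// Before v0.8.1 the configuration singleton was accessed via `Default`:
// NativeLibraryConfig.Default.WithCuda(true);

// From v0.8.1 onwards, use `Instance` instead:
NativeLibraryConfig.Instance.WithCuda(true);
```

As before, configuration generally needs to happen before the first call that loads the native library.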
Major features and fixes
- MinP Sampler by @martindevans in #277
- CPU Feature Detection 2 by @martindevans in #281
- AntipromptProcessor access by @saddam213 in #288
- StreamingTextDecoder in LLamaExecutorBase by @martindevans in #293
- November Binary Update by @martindevans in #316
- Update KernelMemory Package by @xbotter in #325
- fix: Chinese encoding error with gb2312. by @AsakusaRinne in #326
- feat: allow customized search path for native library loading. by @AsakusaRinne in #333
Other changes
- Add targets in Web project by @SignalRT in #286
- Update examples by @xbotter in #295
- dotnet8.0 by @martindevans in #292
- progress_callback in `LLamaModelParams` by @martindevans in #303
- Added Obsolete markings to all `Eval` overloads by @martindevans in #304
- Improved test coverage. by @martindevans in #311
- Removed Obsolete ModelParams Constructor by @martindevans in #312
- Better TensorSplitsCollection Initialisation by @martindevans in #310
- Add DefaultInferenceParams to Kernel Memory by @xbotter in #307
- Added a converter similar to the Open AI one by @futzy314 in #315
New Contributors
Thank you so much for all the contributions!
Full Changelog: v0.8.0...v0.8.1
v0.8.0: performance improvement, cuda feature detection and kernel-memory integration
What's Changed
- fix: binary not copied on MAC platform. by @AsakusaRinne in #238
- docs: add related repos. by @AsakusaRinne in #240
- docs: add example models for v0.7.0. by @AsakusaRinne in #243
- Adapts to SK Kernel Memory by @xbotter in #226
- CodeQL Pointer Arithmetic by @martindevans in #246
- build(deps): bump xunit from 2.5.0 to 2.6.1 by @dependabot in #233
- build(deps): bump xunit.runner.visualstudio from 2.5.0 to 2.5.3 by @dependabot in #234
- build(deps): bump Swashbuckle.AspNetCore from 6.2.3 to 6.5.0 by @dependabot in #235
- build(deps): bump Microsoft.SemanticKernel from 1.0.0-beta1 to 1.0.0-beta4 by @dependabot in #231
- feat(kernel-memory): avoid loading model twice. by @AsakusaRinne in #248
- GitHub Action Pipeline Improvements by @martindevans in #245
- Update README.md by @hswlab in #252
- Removed some CI targets by @martindevans in #253
- Removed Old Targets From CI matrix by @martindevans in #254
- Align with llama.cpp b1488 by @SignalRT in #249
- Enhance framework compatibility by @Uralstech in #259
- Update LLama.Examples using Spectre.Console by @xbotter in #255
- Context Size Autodetect by @martindevans in #263
- Prevent duplication of user prompts / chat history in ChatSession. by @philippjbauer in #266
- build: add package for kernel-memory integration. by @AsakusaRinne in #244
- Exposed YaRN scaling parameters in IContextParams by @martindevans in #257
- Update ToLLamaSharpChatHistory extension method to be public and support semantic-kernel author roles by @kidkych in #274
- Runtime detection MacOS by @SignalRT in #258
- feat: cuda feature detection. by @AsakusaRinne in #275
New Contributors
- @dependabot made their first contribution in #233
- @hswlab made their first contribution in #252
- @Uralstech made their first contribution in #259
- @philippjbauer made their first contribution in #266
- @kidkych made their first contribution in #274
Full Changelog: v0.7.0...v0.8.0
v0.7.0 - improve performance
This release fixes the performance problem in v0.6.0, so upgrading to this version is strongly recommended. Many thanks to @lexxsoft for catching this problem and to @martindevans for the fix!
What's Changed
- RoundTrip Tokenization Errors by @martindevans in #205
- Fixed Broken Text Decoding by @martindevans in #219
- Multi GPU by @martindevans in #202
- New Binaries & Improved Sampling API by @martindevans in #223
Full Changelog: v0.6.0...v0.7.0
v0.6.0 - follow major llama.cpp changes
What's Changed
- Better Antiprompt Testing by @martindevans in #150
- Simplified `LLamaInteractExecutor` antiprompt matching by @martindevans in #152
- Changed `OpenOrCreate` to `Create` by @martindevans in #153
- Beam Search by @martindevans in #155
- ILogger implementation by @saddam213 in #158
- Removed `GenerateResult` by @martindevans in #159
- `GetState()` fix by @martindevans in #160
- llama_get_kv_cache_token_count by @martindevans in #164
- better_instruct_antiprompt_checking by @martindevans in #165
- skip_empty_tokenization by @martindevans in #167
- SemanticKernel API Update by @drasticactions in #169
- Removed unused properties of `InferenceParams` & `ModelParams` by @martindevans in #149
- Coding assistant example by @Regenhardt in #172
- Remove non-async by @martindevans in #173
- MacOS default build now is metal llama.cpp #2901 by @SignalRT in #163
- CPU Feature Detection by @martindevans in #65
- make InferenceParams a record so we can use `with` by @redthing1 in #175
- fix opaque GetState (fixes #176) by @redthing1 in #177
- Extensions Method Unit Tests by @martindevans in #179
- Async Stateless Executor by @martindevans in #182
- Fixed GitHub Action by @martindevans in #190
- GrammarRule Tests by @martindevans in #192
- More Tests by @martindevans in #194
- Support SemanticKernel 1.0.0-beta1 by @DVaughan in #193
- Major llama.cpp API Change by @martindevans in #185
- Cleanup by @martindevans in #196
- Update WebUI inline with v5.0.x by @saddam213 in #197
- More Logging by @martindevans in #198
- chore: Update LLama.Examples and LLama.SemanticKernel by @xbotter in #201
- ci: add auto release workflow. by @AsakusaRinne in #204
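The record change in #175 enables C# `with` expressions for non-destructive copies of parameter objects. A self-contained sketch using a hypothetical record standing in for a params type such as `InferenceParams` (the property names here are illustrative, not LLamaSharp's exact API):

```csharp
// Hypothetical record; declaring it as a record is what enables `with`.
public record SamplingParams(float Temperature = 0.8f, int MaxTokens = 256);

public static class Demo
{
    public static void Main()
    {
        var baseline = new SamplingParams();

        // Copy with one property changed, leaving the original untouched.
        var greedy = baseline with { Temperature = 0.0f };

        System.Console.WriteLine(baseline.Temperature); // prints 0.8
        System.Console.WriteLine(greedy.Temperature);   // prints 0
    }
}
```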
New Contributors
- @Regenhardt made their first contribution in #172
- @redthing1 made their first contribution in #175
- @DVaughan made their first contribution in #193
Full Changelog: v0.5.1...v0.6.0
v0.5.1 - GGUF, grammar and semantic-kernel integration
What's Changed
- Remove native libraries from LLama.csproj and replace it with a targets file. by @drasticactions in #32
- Update libllama.dylib by @SignalRT in #36
- update webapi example by @xbotter in #39
- MacOS metal support by @SignalRT in #47
- Basic ASP.NET Core website example by @saddam213 in #48
- fix breaking change in llama.cpp; bind to latest version llama.cpp to… by @fwaris in #51
- Documentation Spelling/Grammar by @martindevans in #52
- XML docs fixes by @martindevans in #53
- Cleaned up unnecessary extension methods by @martindevans in #55
- Memory Mapped LoadState/SaveState by @martindevans in #56
- Larger states by @martindevans in #57
- Instruct & Stateless web example implemented by @saddam213 in #59
- Fixed Multiple Enumeration by @martindevans in #54
- Fixed More Multiple Enumeration by @martindevans in #63
- Low level new loading system by @martindevans in #64
- Fixed Memory pinning in Sampling API by @martindevans in #68
- Fixed Spelling Mirostate -> Mirostat by @martindevans in #69
- Fixed Mirostate Sampling by @martindevans in #72
- GitHub actions by @martindevans in #74
- Update llama.cpp binaries to 5f631c2 and align the LlamaContext by @SignalRT in #77
- Expose some native classes by @saddam213 in #80
- feat: update the llama backends. by @AsakusaRinne in #78
- ModelParams & InferenceParams abstractions by @saddam213 in #79
- Cleaned up multiple enumeration in FixedSizeQueue by @martindevans in #83
- Improved Tensor Splits by @martindevans in #81
- fix: antiprompt does not work in stateless executor. by @AsakusaRinne in #84
- Access to IModelParamsExtensions by @saddam213 in #86
- Utils Cleanup by @martindevans in #82
- Fixed `ToLlamaContextParams` using the wrong parameter for `use_mmap` by @martindevans in #89
- Fix serialization error due to NaN by @martindevans in #88
- Add native logging output by @saddam213 in #95
- Minor quantizer improvements by @martindevans in #96
- Improved `NativeApi` file a bit by @martindevans in #99
- Logger Comments by @martindevans in #100
- llama_sample_classifier_free_guidance by @martindevans in #101
- Potential fix for .Net Framework issues by @zombieguy98 in #103
- Add missing semi-colon to README sample code by @zerosoup in #104
- Multi Context by @martindevans in #90
- Updated Demos by @martindevans in #105
- renamed some arguments in ModelParams constructor so that class can be serialized easily by @erinloy in #108
- Stateless Executor Fix by @martindevans in #107
- Grammar basics by @martindevans in #102
- Re-renaming some arguments to allow for easy deserialization from appsettings.json. by @erinloy in #111
- Added native symbol for CFG by @martindevans in #112
- Minor Code Cleanup by @martindevans in #114
- Changed type conversion by @zombieguy98 in #116
- OldVersion obsoletion notices by @martindevans in #117
- Embedder Test by @martindevans in #97
- Improved Cloning by @martindevans in #119
- ModelsParams record class by @martindevans in #115
- ReSharper code warnings cleanup by @martindevans in #120
- Two small improvements to the native sampling API by @martindevans in #124
- Removed unnecessary parameters from some low level sampler methods by @martindevans in #125
- Dependency Building In Github Action by @martindevans in #126
- Fixed paths by @martindevans in #127
- Fixed cuda paths again by @martindevans in #130
- Linux cublas by @martindevans in #131
- Fixed linux cublas filenames by @martindevans in #132
- fixed linux cublas paths in final step by @martindevans in #133
- Fixed the cublas linux paths again by @martindevans in #134
- Fixed those cublas paths again by @martindevans in #135
- Translating the grammar parser by @Mihaiii in #136
- Higher Level Grammar System by @martindevans in #137
- Enable Semantic kernel support by @drasticactions in #138
- grammar_exception_types by @martindevans in #140
- GGUF by @martindevans in #122
- docs: update the docs to follow new version. by @AsakusaRinne in #141
- Update MacOS Binaries by @SignalRT in #143
- Remove LLamaNewlineTokens from InteractiveExecutorState by @martindevans in #144
- refactor: remove old version files. by @AsakusaRinne in #142
- Disable test parallelism by @martindevans in #145
- Removed duplicate `llama_sample_classifier_free_guidance` method by @martindevans in #146
- Swapped to llama-7b-chat by @martindevans in #147
New Contributors
- @drasticactions made their first contribution in #32
- @xbotter made their first contribution in #39
- @saddam213 made their first contribution in #48
- @fwaris made their first contribution in #51
- @martindevans made their first contribution in #52
- @zombieguy98 made their first contribution in #103
- @zerosoup made their first contribution in #104
- @erinloy made their first contribution in #108
- @Mihaiii made their first contribution in #136
Full Changelog: v0.4.0...v0.5.0
v0.4.2-preview: new backends
What's Changed
- update webapi example by @xbotter in #39
- MacOS metal support by @SignalRT in #47
- Basic ASP.NET Core website example by @saddam213 in #48
- fix breaking change in llama.cpp; bind to latest version llama.cpp to… by @fwaris in #51
- Documentation Spelling/Grammar by @martindevans in #52
- XML docs fixes by @martindevans in #53
- Cleaned up unnecessary extension methods by @martindevans in #55
- Memory Mapped LoadState/SaveState by @martindevans in #56
- Larger states by @martindevans in #57
- Instruct & Stateless web example implemented by @saddam213 in #59
- Fixed Multiple Enumeration by @martindevans in #54
- Fixed More Multiple Enumeration by @martindevans in #63
- Low level new loading system by @martindevans in #64
- Fixed Memory pinning in Sampling API by @martindevans in #68
- Fixed Spelling Mirostate -> Mirostat by @martindevans in #69
- Fixed Mirostate Sampling by @martindevans in #72
- GitHub actions by @martindevans in #74
- Update llama.cpp binaries to 5f631c2 and align the LlamaContext by @SignalRT in #77
- Expose some native classes by @saddam213 in #80
- feat: update the llama backends. by @AsakusaRinne in #78
New Contributors
- @xbotter made their first contribution in #39
- @saddam213 made their first contribution in #48
- @fwaris made their first contribution in #51
- @martindevans made their first contribution in #52
Full Changelog: v0.4.1-preview...v0.4.2-preview
v0.4.1-preview - follow up llama.cpp latest commit
This is a preview version which follows up the latest modifications of llama.cpp.
For some reason the CUDA backend hasn't been working correctly; we'll release v0.4.1 after dealing with that.