Martin Evans
5da2a2f64b
- Removed one of the constructors of `SafeLLamaHandleBase`, which implicitly states that memory is owned. Better to be explicit about this kind of thing!
- Also fixed `ToString()` in `SafeLLamaHandleBase`
2 years ago
Jason Couture
ec59c5bf9e
Fix missing library name prefix for cuda
2 years ago
Jason Couture
443ce4fff4
While the dllimport changes work, manual path searching needed to be updated
2 years ago
Jason Couture
db7e1e88f8
Use llama instead of libllama in `[DllImport]`
This means Windows users no longer need to rename the DLL, so native llama builds can be dropped in, even on Windows.
I also took the time to update the documentation, removing references to renaming the files, since the names now match.
Fixes #463
2 years ago
Martin Evans
92b9bbe779
Added methods to `SafeLLamaContextHandle` for KV cache manipulation
2 years ago
Martin Evans
96c26c25f5
Merge pull request #445 from martindevans/stateless_executor_llama_decode
Swapped `StatelessExecutor` to use `llama_decode`!
2 years ago
Martin Evans
9fe878ae1f
- Fixed example
- Growing more than double, if necessary
2 years ago
Martin Evans
9ede1bedc2
Automatically growing batch n_seq_max when exceeded. This means no parameters need to be picked when the batch is created.
2 years ago
Martin Evans
a2e29d393c
Swapped `StatelessExecutor` to use `llama_decode`!
- Added `logits_i` argument to `Context.ApplyPenalty`
- Added a new exception type for `llama_decode` return code
2 years ago
Martin Evans
99969e538e
- Removed some unused `eval` methods.
- Added a `DecodeAsync` overload which runs the work in a task
- Replaced some `NativeHandle` usage in `BatchedDecoding` with higher level equivalents.
- Made the `LLamaBatch` grow when token capacity is exceeded, removing the need to manage token capacity externally.
2 years ago
Martin Evans
36a9335588
Removed `LLamaBatchSafeHandle` (using unmanaged memory, created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use new managed batch.
2 years ago
Martin Evans
1472704e12
Added a test with examples of troublesome strings from 0.9.1
2 years ago
Martin Evans
73172bbaba
Merge pull request #438 from martindevans/cleanup_model_unnecessary_unsafe
Model Metadata Loading Cleanup
2 years ago
Martin Evans
ce1d302e7e
Moved some native methods into `SafeLlamaModelHandle`. These methods are all wrapped in safer accessors with no extra cost, so there is no need to expose them.
2 years ago
Martin Evans
1e86755071
- Removed unnecessary `unsafe` block in model metadata loading
- Clarified comments on native metadata loading methods
2 years ago
Martin Evans
de2b20aae5
- Added a specific exception for failing to load model weights.
- Checking if model is readable
2 years ago
Martin Evans
096e0e75f8
Check that the model file actually exists immediately before loading it. Improve #395
2 years ago
Martin Evans
2ea2048b78
- Added a test for tokenizing just a new line (reproduce issue https://github.com/SciSharp/LLamaSharp/issues/430 )
- Properly displaying `LLamaToken`
- Removed all tokenisation code in `SafeLLamaContextHandle` - just pass it all through to the `SafeLlamaModelHandle`
- Improved `SafeLlamaModelHandle` tokenisation:
  - Renting an array, for one less allocation
  - Not using `&tokens[0]` to take a pointer to an array, this is redundant and doesn't work on empty arrays
2 years ago
Martin Evans
98635a0d5a
Fixed decoding of large tokens (over 16 bytes) in streaming text decoder
2 years ago
Martin Evans
402a110a3a
Merge pull request #404 from martindevans/switched_to_LLamaToken_struct
LLamaToken Struct
2 years ago
Martin Evans
1e69e265b6
Moved some native methods to do with creating/destroying resources into their respective handles. There is **no** safe way to call most of these methods, everything must be done through handles.
2 years ago
Martin Evans
82727c4414
Removed collection expressions from test
2 years ago
Martin Evans
2eb52b1630
made casts to/from int explicit, fixed places affected
2 years ago
Martin Evans
42be9b136d
Switched from using raw integers to a `LLamaToken` struct
2 years ago
Martin Evans
4e5e994dda
- directly returning a SafeLlamaModelHandle, instead of an IntPtr which is wrapped in a handle.
- made `llama_backend_init` private. This is automatically called, there is no way it can correctly be used externally.
- made `llama_token_to_piece` safe (Span instead of pointer)
2 years ago
Martin Evans
bac3e43498
Fixed handling of empty spans
2 years ago
Martin Evans
c002642268
- Removed some `unsafe` where it wasn't necessary
- Wrapped some native functions which take (pointer, length) in function which take a `span` instead.
2 years ago
Martin Evans
f860f88c36
Code cleanup driven by R# suggestions:
- Made `NativeApi` into a `static class` (it's not intended to be instantiated)
- Moved `LLamaTokenType` enum out into a separate file
- Made `LLamaSeqId` and `LLamaPos` into `record struct`, convenient to have equality etc
2 years ago
Martin Evans
2cded1b296
Fixed alignment of value fields in `LLamaModelMetadataOverride`
2 years ago
Martin Evans
6be3f62321
Fixed loading of very large metadata values (over 1kb)
2 years ago
Martin Evans
fb606c2488
Fixed incorrect values
2 years ago
Martin Evans
47e4fcef2a
Fixed GetString on netstandard2
2 years ago
Martin Evans
2a1e1b6183
Removed unused imports
2 years ago
Martin Evans
a2bae178fa
Added a `Metadata` property to `LLamaWeights`
2 years ago
Martin Evans
1b13f7c717
Improved support for AVX512:
- Enabled more features in build process (VBMI and VNNI)
- Added runtime checking for these features
- Improved runtime checking to no longer require dotnet8.0
2 years ago
Martin Evans
c298ab828a
Merge pull request #368 from martindevans/context_set_seed
Context Set Seed
2 years ago
Martin Evans
a3177ab140
Merge pull request #369 from martindevans/rename_llama_sample_temperature
Renamed `llama_sample_temperature` to `llama_sample_temp`
2 years ago
Martin Evans
db7ecf5a43
Added a method to create a clone of a grammar instance
2 years ago
Martin Evans
ea523d2e2a
Renamed `llama_sample_temperature` to `llama_sample_temp`, mirroring the same change made in llama.cpp
2 years ago
Martin Evans
2df3e7617e
Added a method to set the RNG seed on the context
2 years ago
Martin Evans
cedef5e45a
Added the `pure` field to `LLamaModelQuantizeParams` (it's been added to llama.cpp)
2 years ago
Martin Evans
b868b056f7
Added metadata overrides to `IModelParams`
2 years ago
Martin Evans
b22d8b7495
- Added `GroupDisposable` to dispose a collection of items all together
- Renamed `LLamaModelKvOverride` to `LLamaModelMetadataOverride`
2 years ago
Martin Evans
5ad2cd1d3c
Added a comment on the type itself
2 years ago
Martin Evans
b0270b5788
Added comments on GGMLType
2 years ago
Martin Evans
b3e576608b
fixed safe handle
2 years ago
Martin Evans
bab6b65b61
Added a safe handle for LLamaKvCacheView
2 years ago
Martin Evans
439d14a061
Updated binaries:
- build run: https://github.com/SciSharp/LLamaSharp/actions/runs/7196891440
- commit: 9fb13f9584
2 years ago
Martin Evans
835958398c
- Removed the object wrappers and configurable pipeline, they can be better written in code.
- Added `BaseSamplingPipeline` which provides a base impl of `ISamplingPipeline`
- Added `DefaultSamplingPipeline` which mimics normal llama.cpp sampling
2 years ago
Martin Evans
33358124db
Initial pass at a new sampling pipeline
2 years ago