LLamaSharp

Commit Graph

Author	SHA1	Message	Date
Scott W Harden	4c3077d0f0	ChatSession: improve exception message The original message contained the word "preceeded" which should be spelled as "preceded"	2 years ago
Martin Evans	c7d0dc915a	Assorted small changes to clean up some code warnings	2 years ago
Martin Evans	174f21a385	0.10.0	2 years ago
Martin Evans	d03c1a9201	Merge pull request #503 from martindevans/batched_executor_again Introduced a new `BatchedExecutor`	2 years ago
Martin Evans	d47b6afe4d	Normalizing embeddings in `LLamaEmbedder`. As is done in llama.cpp: `2891c8aa9a/examples/embedding/embedding.cpp (L92)`	2 years ago
Martin Evans	e9d9042576	Added `Divide` to `KvAccessor`	2 years ago
Martin Evans	1cc463b9b7	Added a finalizer to `BatchedExecutor`	2 years ago
Martin Evans	0c2cff0e1c	Added a Finalizer for `Conversation` in case it is not correctly disposed.	2 years ago
Martin Evans	949861a581	- Added a `Modify` method to `Conversation`. This grants temporary access to directly modify the KV cache. - Re-implemented `Rewind` as an extension method using `Modify` internally - Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling. - Starting batch at epoch 1, this ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.	2 years ago
Martin Evans	b0acecf080	Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix). Conversations can be "forked", to create a copy of a conversation at a given point. This allows e.g. prompting a conversation with a system prefix just once and then forking it again and again for each individual conversation. Conversations can also be "rewound" to an earlier state. Added two new examples, demonstrating forking and rewinding.	2 years ago
Martin Evans	90915c5a99	Added increment and decrement operators to `LLamaPos`	2 years ago
Martin Evans	82c471eac4	Merge pull request #500 from martindevans/improved_kv_cache_methods Small KV Cache Handling Improvements	2 years ago
Martin Evans	c5146bac23	- Exposed KV debug view through `SafeLLamaContextHandle` - Added `KvCacheSequenceDivide` - Moved count tokens/cells methods to `SafeLLamaContextHandle`	2 years ago
Martin Evans	744758f110	Using `AddRange` in `LLamaEmbedder`	2 years ago
Martin Evans	c7103e86e4	Added new file types to quantisation	2 years ago
Martin Evans	17385e12b6	Merge pull request #479 from martindevans/update_binaries_feb_2024 Update binaries feb 2024	2 years ago
Martin Evans	bac40a3b7a	Added new binaries, from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7792319886	2 years ago
Jason Couture	c963b051e2	Add nuspec for OpenCL (CLBLAST)	2 years ago
Martin Evans	765c697f77	Fixed number type	2 years ago
Martin Evans	b2e815d51e	Updated all binaries (from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7746303349 )	2 years ago
Martin Evans	15a98b36d8	Updated everything to work with llama.cpp `ce32060198`	2 years ago
Martin Evans	c9c8cd0d62	- Swapped embeddings generator to use `llama_decode` - Modified `GetEmbeddings` method to be async	2 years ago
Martin Evans	22aba9a671	Merge pull request #473 from martindevans/base_handle_removed_constructor Removed `SafeLLamaHandleBase` Constructor	2 years ago
Martin Evans	5da2a2f64b	- Removed one of the constructors of `SafeLLamaHandleBase`, which implicitly states that memory is owned. Better to be explicit about this kind of thing! - Also fixed `ToString()` in `SafeLLamaHandleBase`	2 years ago
Martin Evans	9b995510d6	Removed all setters in `IModelParams` and `IContextParams`, allowing implementations to be immutable.	2 years ago
Jason Couture	ec59c5bf9e	Fix missing library name prefix for cuda	2 years ago
Jason Couture	443ce4fff4	While the dllimport changes work, manual path searching needed to be updated	2 years ago
Jason Couture	db7e1e88f8	Use llama instead of libllama in `[DllImport]` This results in windows users not needing to rename the DLL. This allows native llama builds to be dropped in, even on windows. I also took the time to update the documentation, removing references to renaming the files, since the names now match. Fixes #463	2 years ago
dependabot[bot]	d8eb817bf5	build(deps): bump System.Text.Json from 8.0.0 to 8.0.1 Bumps [System.Text.Json](https://github.com/dotnet/runtime) from 8.0.0 to 8.0.1. - [Release notes](https://github.com/dotnet/runtime/releases) - [Commits](https://github.com/dotnet/runtime/compare/v8.0.0...v8.0.1) --- updated-dependencies: - dependency-name: System.Text.Json dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2 years ago
Martin Evans	92b9bbe779	Added methods to `SafeLLamaContextHandle` for KV cache manipulation	2 years ago
Martin Evans	a690db5d3e	Fixed build error caused by extra unnecessary parameter	2 years ago
Martin Evans	96c26c25f5	Merge pull request #445 from martindevans/stateless_executor_llama_decode Swapped `StatelessExecutor` to use `llama_decode`!	2 years ago
Martin Evans	9fe878ae1f	- Fixed example - Growing more than double, if necessary	2 years ago
Martin Evans	9ede1bedc2	Automatically growing batch n_seq_max when exceeded. This means no parameters need to be picked when the batch is created.	2 years ago
Martin Evans	a2e29d393c	Swapped `StatelessExecutor` to use `llama_decode`! - Added `logits_i` argument to `Context.ApplyPenalty` - Added a new exception type for `llama_decode` return code	2 years ago
Martin Evans	5b6e82a594	Improved the BatchedDecoding demo: - using less `NativeHandle` - Using `StreamingTokenDecoder` instead of obsolete detokenize method	2 years ago
Martin Evans	99969e538e	- Removed some unused `eval` methods. - Added a `DecodeAsync` overload which runs the work in a task - Replaced some `NativeHandle` usage in `BatchedDecoding` with higher level equivalents. - Made the `LLamaBatch` grow when token capacity is exceeded, removing the need to manage token capacity externally.	2 years ago
Martin Evans	36a9335588	Removed `LLamaBatchSafeHandle` (using unmanaged memory, created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use new managed batch.	2 years ago
Martin Evans	1472704e12	Added a test with examples of troublesome strings from 0.9.1	2 years ago
Martin Evans	73172bbaba	Merge pull request #438 from martindevans/cleanup_model_unnecessary_unsafe Model Metadata Loading Cleanup	2 years ago
Martin Evans	ce1d302e7e	Moved some native methods into `SafeLlamaModelHandle`, these methods are all wrapped in safer accessors with no extra costs so there is no need to expose them.	2 years ago
Martin Evans	1e86755071	- Removed unnecessary `unsafe` block in model metadata loading - Clarified comments on native metadata loading methods	2 years ago
Martin Evans	de2b20aae5	- Added a specific exception for failing to load model weights. - Checking if model is readable	2 years ago
Martin Evans	096e0e75f8	Check that the model file actually exists immediately before loading it. Improve #395	2 years ago
Martin Evans	3c6af909dd	Merge pull request #434 from martindevans/stateless_eos_check Added a check for EOS token in LLamaStatelessExecutor	2 years ago
Martin Evans	f160fbd6d1	Added a check for EOS token in LLamaStatelessExecutor	2 years ago
Martin Evans	2ea2048b78	- Added a test for tokenizing just a new line (reproduce issue https://github.com/SciSharp/LLamaSharp/issues/430 ) - Properly displaying `LLamaToken` - Removed all tokenisation code in `SafeLLamaContextHandle` - just pass it all through to the `SafeLlamaModelHandle` - Improved `SafeLlamaModelHandle` tokenisation: - Renting an array, for one less allocation - Not using `&tokens[0]` to take a pointer to an array, this is redundant and doesn't work on empty arrays	2 years ago
Martin Evans	98635a0d5a	Fixed decoding of large tokens (over 16 bytes) in streaming text decoder	2 years ago
Martin Evans	402a110a3a	Merge pull request #404 from martindevans/switched_to_LLamaToken_struct LLamaToken Struct	2 years ago
Steven Kennedy	988f2fa302	Reverted Net8.0	2 years ago

497 Commits (3d7bf4287c120f0c2f207ce58f997b4ee6411e97)