LLamaSharp

333 MB

Tree: 7b309d7bf6

Author	SHA1	Message	Date
Zoli Somogyi	7b309d7bf6	KernelMemory bug fix - cleanup nullable refs	2 years ago
Zoli Somogyi	e304481226	KernelMemory bug fix. While using WithLLamaSharpDefaults it was not possible to dispose GPU memory because the embedding was not defined well. This PR fixes that issue and also fixes some small problems with not setting all important model parameters.	2 years ago
Lyrcaxis	f01c13ee54	Made special tokens included in prompts tokenize as intended (#677)	2 years ago
Martin Evans	c325ac9127	April 2024 Binary Update (#662) * Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`. - Added all new functions. - Moved some functions (e.g. `SafeLlamaModelHandle` specific functions) into `SafeLlamaModelHandle.cs` - Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future they can be added here. - Changed all token properties to return nullable tokens, to handle some models not having some tokens. - Fixed `DefaultSamplingPipeline` to handle no newline token in some models. * Moved native methods to more specific locations. - Context specific things have been moved into `SafeLLamaContextHandle.cs` and made private - they're exposed through C# properties and methods already. - Checking that GPU layer count is zero if GPU offload is not supported. - Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into relevant structs. * Removed exception if `GpuLayerCount > 0` when GPU is not supported. * - Added low level wrapper methods for new per-sequence state load/save in `SafeLLamaContextHandle` - Added high level wrapper methods (save/load with `State` object or memory mapped file) in `LLamaContext` - Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle` * Added update and defrag methods for KV cache in `SafeLLamaContextHandle` * Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7` * Passing the sequence ID when saving a single sequence state	2 years ago
Kenneth Tang	9e4109f774	Unable to load the model onto multiple GPUs (#617)	2 years ago
Kenneth Tang	3fda708eaa	Fix System.ArgumentException: EmbeddingMode must be true	2 years ago
Martin Evans	c9c8cd0d62	Swapped embeddings generator to use `llama_decode`; modified `GetEmbeddings` method to be async	2 years ago
xbotter	211ce12bf5	LLamaEmbedder exposes the Context	2 years ago
xbotter	13a312b4ec	update sk to 1.0.0-rc3 & km to 0.18	2 years ago

9 Commits (7b309d7bf63fa0e124d8914d1410cfd12973753e)