Martin Evans
db8f3980ea
New binaries from this commit: 207b51900e
Should fix the extreme speed loss.
2 years ago
Martin Evans
b6d242193e
Debugging slowdown by removing some things:
- Removed all `record struct` uses in native code
- Removed usage of `readonly` in native structs
Minor fix:
- Added sequential layout to `LLamaModelQuantizeParams`
2 years ago
Martin Evans
529b06b35b
- Fixed rope frequency/base to default to the values stored in the model, instead of always overriding them.
2 years ago
Martin Evans
dcc82e582e
Fixed `Eval` on platforms < dotnet 5
2 years ago
Martin Evans
51c292ebd8
Added a safe method for `llama_get_logits_ith`
2 years ago
Martin Evans
7e3cde4c13
Moved helper methods into `LLamaBatchSafeHandle`
2 years ago
Martin Evans
ccb8afae46
Cleaned up stateless executor as preparation for changing it to use the new batched decoding system.
2 years ago
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2 years ago
Martin Evans
c7fdb9712c
Added binaries, built from `6961c4bd0b`
2 years ago
Martin Evans
e81b3023d5
Rewritten sampling API to be accessed through the `LLamaTokenDataArray` object
2 years ago
Martin Evans
3c5547b2b7
Reduced some uses of `NativeApi` in `BatchedDecoding` by adding some helper methods
2 years ago
Martin Evans
b38e3f6fe2
binaries (avx512)
2 years ago
Martin Evans
a024d2242e
It works!
had to update binary to `b1426`
2 years ago
Martin Evans
8cd81251b4
initial setup
2 years ago
Martin Evans
321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
Multi GPU
2 years ago
Martin Evans
f6a472ae86
Setting the default seed to `0xFFFFFFFF` (no seed, randomised)
2 years ago
Martin Evans
36c71abcfb
Fixed `LLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLama` spam in all executors except Stateless.
2 years ago
Martin Evans
5b6408b072
Merge pull request #205 from martindevans/roundtrip_tokenization_investigation
RoundTrip Tokenization Errors
2 years ago
Martin Evans
a03fe003de
Fixed decoding of text "accumulating" over time (never properly clearing buffer)
2 years ago
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2 years ago
Martin Evans
efdf3d630c
- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
- Built a new (hacky) `Detokenize` method which handles this
2 years ago
Rinne
231efe06f2
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2 years ago
Rinne
ecf852c4e2
Update LLama/runtimes/build/LLamaSharp.Backend.MacMetal.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2 years ago
Rinne
95669c2ea3
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda12.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2 years ago
Rinne
5eaebd68ba
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda11.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2 years ago
Rinne
6724b39713
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2 years ago
Martin Evans
1d0620e634
Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters
2 years ago
Yaohui Liu
b7a7dc00b6
ci: fix typos.
2 years ago
Yaohui Liu
252992ec6e
ci: fix icon and typos.
2 years ago
Yaohui Liu
53eedf1428
ci: fix error.
2 years ago
Yaohui Liu
f9a98c6e23
ci: add auto release workflow.
2 years ago
Martin Evans
f621ec67e8
Fixed serialization
2 years ago
Martin Evans
768747c652
spelling
2 years ago
Martin Evans
b4e7f64e76
Added System.Text.Json serialization for `TensorSplitsCollectionConverter`
2 years ago
Martin Evans
281e58f059
Fixed default value
2 years ago
Martin Evans
04acbf8c42
Improved doc comment on `tensor_split`
2 years ago
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2 years ago
Martin Evans
15db194c17
Added multi GPU support
2 years ago
Martin Evans
328022b13d
Fixed merge conflicts
2 years ago
Martin Evans
7ec318aab5
Added logging to embedder too
2 years ago
Martin Evans
f1e5a8f995
- Passing the `ILogger` through to every call of `CreateContext`
- Passing `ILogger` into executors
2 years ago
sa_ddam213
4ec9aed47a
Revert LLamaSharp project changes
2 years ago
sa_ddam213
b4b4000342
Merge branch 'master' into upstream_master
# Conflicts:
# LLama.Web/Common/ModelOptions.cs
# LLama.Web/Services/ConnectionSessionService.cs
# LLama/LLamaStatelessExecutor.cs
# LLama/LLamaWeights.cs
2 years ago
Martin Evans
e89ca5cc17
Fixed a few minor warnings
2 years ago
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2 years ago
Martin Evans
d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
Major llama.cpp API Change
2 years ago
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )
2 years ago
Martin Evans
efb0664df0
- Added new binaries
- Fixed stateless executor out-of-context handling
- Fixed token tests
2 years ago
Martin Evans
b8f0eff080
- Added `GetCharCountImpl` tests, fixed handling of empty strings
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2 years ago
Martin Evans
45118520fa
- Improved coverage of `GBNFGrammarParser` up to 96%
- Covered text transforms
- Removed unnecessary non-async transforms
2 years ago