10 Commits (54dab273cdb671e01da21e2b1d374bba767dc244)

Author SHA1 Message Date
  Martin Evans c325ac9127 April 2024 Binary Update (#662) 1 year ago
  Martin Evans f0b0bbcbb7 Mutable Logits (#586) 1 year ago
  Martin Evans 91a7967869 `ReadOnlySpan<float>` in ISamplingPipeline (#538) 1 year ago
  Martin Evans b0acecf080 Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix). 2 years ago
  Martin Evans 2eb52b1630 made casts to/from int explicit, fixed places affected 2 years ago
  Martin Evans 42be9b136d Switched from using raw integers, to a `LLamaToken` struct 2 years ago
  Martin Evans 835958398c - Removed the object wrappers and configurable pipeline, they can be better written in code. 2 years ago
  Martin Evans 3afc007499 - Added "protected" logits, instead of the awkward save/load mechanism 2 years ago
  Martin Evans b34f72a883 - Added `SamplingPipeline` to inference params which overrides all other options with an entirely custom pipeline. 2 years ago
  Martin Evans 33358124db Initial pass at a new sampling pipeline 2 years ago