dora-rs/dora - dora - Open-Source Collaborative Cloud Brain Ecosystem Support System

Commit Graph

Author	SHA1	Message	Date
Philipp Oppermann	ec4f229094	Forward more daemon log messages to CLI	11 months ago
haixuantao	3b6678bd73	Remove free disk space on ubuntu	10 months ago
Haixuan Xavier Tao	e19b534544	Bump pyo3 to 0.23 (#798) This PR makes it possible to bump a couple of dependencies: - Arrow 0.54 - Pyo3 0.23 - ros2-client 0.8 The biggest blocker was pyo3, whose stricter Python class requirements now demand that classes be `Sync`.	10 months ago
haixuantao	c238ebe92e	Remove typo in kokoro	10 months ago
haixuantao	eb43245c66	Fix misconfigured cuda package	10 months ago
haixuantao	efe0cffe9d	Remove chrono pinning	10 months ago
Philipp Oppermann	ee15517dce	Make `ShmemChannel` `Sync` as required by pyo3 0.23	10 months ago
haixuantao	d3a48d7b17	Bump arrow version to latest version	10 months ago
Haixuan Xavier Tao	269d23e592	Pick place demo (#793) Demo of reachy doing a pick and place exercise.	10 months ago
haixuantao	2fe970fe63	Update README with latest development	10 months ago
haixuantao	b867785223	Improve readme documentation	10 months ago
Haixuan Xavier Tao	1ee216a2de	Add kokoro tts (#794 ) Kokoro TTS is a very fast TTS and a good replacement for outlets with less than 1sec inference on macOS. It is apache 2.0 TODO: - [x] Interruption of the audio.	10 months ago
haixuanTao	7d8186dacb	Adding a stop to avoid confusion between prompt	11 months ago
haixuanTao	47b24a3582	Minor refactoring	11 months ago
haixuanTao	18ef24e9e3	Minor refactoring	11 months ago
haixuanTao	4174fd7362	Add list of string as metadata	11 months ago
haixuantao	469c4ce77c	Add interruption within llm audio demo	10 months ago
haixuantao	2d12aa0f94	Fix kokoro demo with gguf based inference of qwen2.5 0.5B	10 months ago
haixuantao	6983c79f42	Adding speech-to-speech example	11 months ago
haixuantao	2c3595b6cf	Add kokoro-tts	11 months ago
Haixuan Xavier Tao	ece8853478	Pin chrono version (#797 ) Pin chrono version to fix CI/CD bug related to: https://github.com/apache/arrow-rs/issues/7196#issuecomment-2688702335	10 months ago
haixuantao	0edbcf2505	Pin chrono version	10 months ago
Philipp Oppermann	f8581de1f8	Use zenoh for inter-daemon communication (#779 ) * Use `zenoh` for inter-daemon messaging * Report error context from coordinator to CLI * Assign unique ID to each daemon The previous machine ID is still used, but optional. Users don't need to ensure that the chosen machine IDs are unique anymore because they are augmented with a UUID. * Rework default daemon selection * Subscribe to required outputs of remote nodes when spawning dataflow * Don't check dataflow in coordinator * Log when sending AllNodesReady message * Add more logging in daemon * Improve log output * Fix node stopping when error occurs during spawn * Fix: Record node result when spawn fails * Trim additional newlines at end of `stderr` * Fix typos: deamon -> daemon * Log publish and subscribe topics * Fix: Calculate mapping info first, then set up subscriptions We need to loop over the nodes twice because the subscription setup requires full mapping info. Otherwise some subscriptions might be missing when we process the target node before the source node. * Fix `multiple-daemons` example * CI: Use `paths-ignore` instead of negated `paths` (#781) Use `paths-ignore` instead of negated `paths` The GitHub workflow docs specify: 'If you define a path with the ! character, you must also define at least one path without the ! character. If you only want to exclude paths, use paths-ignore instead.' * Add README with run instructions to multiple-daemons example * Add distributed deployment instructions to README for multiple-daemons example * Add config option to publish all outputs to zenoh This can be useful for debugging * Load zenoh config from `ZENOH_CONFIG` env variable if set * Remove commented-out check * Exit daemon on ctrl-c after all dataflows are stopped Before we only stopped all dataflows, but then kept on running. * Properly disconnect daemons that exited in coordinator Send an exit message from the daemon to the coordinator on exit. 
This enables the coordinator to disconnect the daemon properly instead of waiting for a missed heartbeat signal. * Don't error if local listen port is already in use * Error on daemon registration if machine ID is not unique * Set default log level to INFO for coordinator and daemon but keep zenoh log level at WARN level * Debug: Print error kind if daemon listener spawning fails	11 months ago
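The two-pass ordering described in the commit above (calculate the full mapping first, then set up subscriptions) can be illustrated with a minimal Python sketch. The node layout, the `source_node/output_id` input format, and the function name `plan_subscriptions` are assumptions made for illustration only; they are not dora's actual data structures or API.

```python
from collections import defaultdict


def plan_subscriptions(nodes, local_machine):
    """Return the remote outputs that the local daemon must subscribe to.

    `nodes` maps a node ID to a dict with two hypothetical keys:
    "machine" (where the node runs) and "inputs" (a list of
    "source_node/output_id" strings).
    """
    # Pass 1: build the full mapping from each output to its receivers.
    receivers = defaultdict(set)
    for node_id, node in nodes.items():
        for source in node["inputs"]:
            source_node, output_id = source.split("/", 1)
            receivers[(source_node, output_id)].add(node_id)

    # Pass 2: with the complete mapping available, pick the outputs that are
    # produced on another machine but consumed by a local node.
    subscriptions = set()
    for (source_node, output_id), targets in receivers.items():
        source_is_remote = nodes[source_node]["machine"] != local_machine
        has_local_target = any(
            nodes[target]["machine"] == local_machine for target in targets
        )
        if source_is_remote and has_local_target:
            subscriptions.add((source_node, output_id))
    return sorted(subscriptions)


# Hypothetical dataflow spread over two machines, "A" and "B".
nodes = {
    "camera": {"machine": "A", "inputs": []},
    "detector": {"machine": "B", "inputs": ["camera/image"]},
    "plot": {"machine": "A", "inputs": ["camera/image", "detector/bbox"]},
}
print(plan_subscriptions(nodes, "A"))
print(plan_subscriptions(nodes, "B"))
```

Because the second pass only starts once every node has been visited, the result does not depend on whether a target node appears before or after its source node in the dataflow description.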
Philipp Oppermann	68d09a1c0c	Merge branch 'main' into zenoh-daemon	11 months ago
Philipp Oppermann	39519c5ed6	Debug: Print error kind if daemon listener spawning fails	11 months ago
Haixuan Xavier Tao	b9b8ced620	Adding a test for checking the latency when timeout and queue are used at the same time (#783) This PR is a follow-up to issue https://github.com/dora-rs/dora/issues/774 and indeed it seems that the latency is higher than what we would expect.	11 months ago
Haixuan Xavier Tao	33ba7a37f6	Add uv flag within start cli command (#788 ) This allows using up and start with uv: ## How to use: ```bash cd examples/python-dataflow dora up dora start dataflow.yaml --uv ``` Fixes: https://github.com/dora-rs/dora/issues/780 This would help the CI as well	11 months ago
haixuantao	3fca6f9e2c	Add time limit for test of 10s	11 months ago
Haixuan Xavier Tao	b4021d0c9a	Limit pip release ci to strict minimum (#791) In order to accelerate our workflow, this PR reduces the cross-platform pip release CI check to the work that corresponds to Python-dependent changes.	11 months ago
Haixuan Xavier Tao	d0cefcbe20	Adding float for env variable and metadata parameters (#786 ) This PR adds support for float64, Listfloat metadata parameters as well as float64 environment variables.	11 months ago
haixuantao	f33ee1e811	Fix scheduler not being used when called through node timeout	11 months ago
Haixuan Xavier Tao	be82fc9407	Fix bounding box for rerun viewer and clear the viewer if no bounding box is detected (#787) This PR fixes an issue where bounding boxes are not cleared when nothing is detected.	11 months ago
Haixuan Xavier Tao	cf88bcd8fa	Update dependency transformers to >=4.48.0,<=4.48.0 [SECURITY] - abandoned (#778 ) This PR contains the following updates: \| Package \| Change \| Age \| Adoption \| Passing \| Confidence \| \|---\|---\|---\|---\|---\|---\| \| [transformers](https://redirect.github.com/huggingface/transformers) \| `>=4.43.0,<=4.43.3` -> `>=4.48.0,<=4.48.0` \| [![age](https://developer.mend.io/api/mc/badges/age/pypi/transformers/4.48.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) \| [![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/transformers/4.48.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) \| [![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/transformers/4.43.0/4.48.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) \| [![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/transformers/4.43.0/4.48.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) \| ### GitHub Vulnerability Alerts #### [CVE-2024-11392](https://nvd.nist.gov/vuln/detail/CVE-2024-11392) Hugging Face Transformers MobileViTV2 Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file. The specific flaw exists within the handling of configuration files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-24322. #### [CVE-2024-11394](https://nvd.nist.gov/vuln/detail/CVE-2024-11394) Hugging Face Transformers Trax Model Deserialization of Untrusted Data Remote Code Execution Vulnerability. 
This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file. The specific flaw exists within the handling of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-25012. #### [CVE-2024-11393](https://nvd.nist.gov/vuln/detail/CVE-2024-11393) Hugging Face Transformers MaskFormer Model Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file. The specific flaw exists within the parsing of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-25191. 
--- ### Release Notes <details> <summary>huggingface/transformers (transformers)</summary> ### [`v4.48.0`](https://redirect.github.com/huggingface/transformers/releases/tag/v4.48.0): ModernBERT, Aria, TimmWrapper, ColPali, Falcon3, Bamba, VitPose, DinoV2 w/ Registers, Emu3, Cohere v2, TextNet, DiffLlama, PixtralLarge, Moonshine [Compare Source](https://redirect.github.com/huggingface/transformers/compare/v4.47.1...v4.48.0) #### New models ##### ModernBERT The ModernBERT model was proposed in [Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference](https://arxiv.org/abs/2412.13663) by Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Nathan Cooper, Griffin Adams, Jeremy Howard and Iacopo Poli. It is a refresh of the traditional encoder architecture, as used in previous models such as [BERT](https://huggingface.co/docs/transformers/en/model_doc/bert) and [RoBERTa](https://huggingface.co/docs/transformers/en/model_doc/roberta). It builds on BERT and implements many modern architectural improvements which have been developed since its original release, such as: - [Rotary Positional Embeddings](https://huggingface.co/blog/designing-positional-encoding) to support sequences of up to 8192 tokens. - [Unpadding](https://arxiv.org/abs/2208.08124) to ensure no compute is wasted on padding tokens, speeding up processing time for batches with mixed-length sequences. - [GeGLU](https://arxiv.org/abs/2002.05202) Replacing the original MLP layers with GeGLU layers, shown to improve performance. - [Alternating Attention](https://arxiv.org/abs/2004.05150v2) where most attention layers employ a sliding window of 128 tokens, with Global Attention only used every 3 layers. - [Flash Attention](https://redirect.github.com/Dao-AILab/flash-attention) to speed up processing. 
- A model designed following recent [The Case for Co-Designing Model Architectures with Hardware](https://arxiv.org/abs/2401.14489), ensuring maximum efficiency across inference GPUs. - Modern training data scales (2 trillion tokens) and mixtures (including code and math data) ![image](https://redirect.github.com/user-attachments/assets/4256c0b1-9b40-4d71-ac42-fc94827d5e9d) - Add ModernBERT to Transformers by [@warner-benjamin](https://redirect.github.com/warner-benjamin) in [#35158](https://redirect.github.com/huggingface/transformers/issues/35158) ##### Aria The Aria model was proposed in [Aria: An Open Multimodal Native Mixture-of-Experts Model](https://huggingface.co/papers/2410.05993) by Li et al. from the Rhymes.AI team. Aria is an open multimodal-native model with best-in-class performance across a wide range of multimodal, language, and coding tasks. It has a Mixture-of-Experts architecture, with respectively 3.9B and 3.5B activated parameters per visual token and text token. - Add Aria by [@aymeric-roucher](https://redirect.github.com/aymeric-roucher) in [#34157](https://redirect.github.com/huggingface/transformers/issues/34157) ![image](https://redirect.github.com/user-attachments/assets/ef41fcc9-2c5f-4a75-ab1a-438f73d3d7e2) ##### TimmWrapper We add a `TimmWrapper` set of classes such that timm models can be loaded in as transformer models into the library. 
Here's a general usage example: ```py import torch from urllib.request import urlopen from PIL import Image from transformers import AutoConfig, AutoModelForImageClassification, AutoImageProcessor checkpoint = "timm/resnet50.a1_in1k" img = Image.open(urlopen( 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png' )) image_processor = AutoImageProcessor.from_pretrained(checkpoint) inputs = image_processor(img, return_tensors="pt") model = AutoModelForImageClassification.from_pretrained(checkpoint) with torch.no_grad(): logits = model(**inputs).logits top5_probabilities, top5_class_indices = torch.topk(logits.softmax(dim=1) * 100, k=5) ``` Thanks to this, timm models now have access to pipelines, as well as `Trainer`, accelerate device maps, quantization, etc: ```py import torch from urllib.request import urlopen from PIL import Image from transformers import pipeline img = Image.open(urlopen( 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png' )) pipe = pipeline("image-classification", model="timm/resnet18.a1_in1k") print(pipe(img)) ``` - Add TimmWrapper by [@qubvel](https://redirect.github.com/qubvel) and [@amyeroberts](https://redirect.github.com/amyeroberts) in [#34564](https://redirect.github.com/huggingface/transformers/issues/34564) ##### Pixtral-Large Pixtral modeling and checkpoint conversion code has been updated to support the new [Pixtral-Large](https://huggingface.co/mistralai/Pixtral-Large-Instruct-2411) model. - Update Pixtral conversion script to support large format! 
by [@arthurzucker](https://redirect.github.com/arthurzucker) in [#34801](https://redirect.github.com/huggingface/transformers/issues/34801) ##### ColPali The ColPali model was proposed in [ColPali: Efficient Document Retrieval with Vision Language Models](https://doi.org/10.48550/arXiv.2407.01449) by Manuel Faysse\*, Hugues Sibille\*, Tony Wu\*, Bilel Omrani, Gautier Viaud, Céline Hudelot, Pierre Colombo (\* denotes equal contribution). Work led by ILLUIN Technology. In the proposed ColPali approach, the authors leverage VLMs to construct efficient multi-vector embeddings directly from document images (“screenshots”) for document retrieval. They train the model to maximize the similarity between these document embeddings and the corresponding query embeddings, using the late interaction method introduced in ColBERT. ![colpali_architecture](https://redirect.github.com/user-attachments/assets/545ed1d7-ea82-4d0d-80c1-4fcbb1c828cd) - Add ColPali to 🤗 transformers by [@tonywu71](https://redirect.github.com/tonywu71) and [@yonigozlan](https://redirect.github.com/yonigozlan) in [#33736](https://redirect.github.com/huggingface/transformers/issues/33736) ##### Falcon3 Falcon3 represents a natural evolution from previous releases, emphasizing expanding the models’ science, math, and code capabilities. This iteration includes five base models: Falcon3-1B-Base, Falcon3-3B-Base, Falcon3-Mamba-7B-Base, Falcon3-7B-Base, and Falcon3-10B-Base. In developing these models, the authors incorporated several key innovations aimed at improving the models’ performances while reducing training costs: One pre-training: They conducted a single large-scale pretraining run on the 7B model, using 2048 H100 GPU chips, leveraging 14 trillion tokens featuring web, code, STEM, and curated high-quality and multilingual data. 
Depth up-scaling for improved reasoning: Building on recent studies on the effects of model depth, they upscaled the 7B model to a 10B parameters model by duplicating the redundant layers and continuing pre-training with 2TT of high-quality data. This yielded Falcon3-10B-Base which achieves state-of-the-art zero-shot and few-shot performance for models under 13B parameters. Knowledge distillation for better tiny models: To provide compact and efficient alternatives, we developed Falcon3-1B-Base and Falcon3-3B-Base by leveraging pruning and knowledge distillation techniques, using less than 100GT of curated high-quality data, thereby redefining pre-training efficiency. - Add Falcon3 documentation by [@mokeddembillel](https://redirect.github.com/mokeddembillel) in [#35307](https://redirect.github.com/huggingface/transformers/issues/35307) ##### Bamba Bamba-9B is a decoder-only language model based on the [Mamba-2](https://redirect.github.com/state-spaces/mamba) architecture and is designed to handle a wide range of text generation tasks. It is trained from scratch using a two-stage training approach. In the first stage, the model is trained on 2 trillion tokens from the Dolma v1.7 dataset. In the second stage, it undergoes additional training on 200 billion tokens, leveraging a carefully curated blend of high-quality data to further refine its performance and enhance output quality. Checkout all Bamba-9B model checkpoints [here](https://redirect.github.com/foundation-model-stack/bamba). - Add the Bamba Model by [@fabianlim](https://redirect.github.com/fabianlim) in [#34982](https://redirect.github.com/huggingface/transformers/issues/34982) ##### VitPose ViTPose is a state-of-the-art vision transformer-based model for human pose estimation, introduced by Yufei Xu, Jing Zhang, Qiming Zhang, and Dacheng Tao in ["ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation”](https://arxiv.org/abs/2204.12484). 
The model leverages the capabilities of vision transformers to accurately predict 2D human keypoints. Adopting a top-down approach, ViTPose estimates keypoints locations for each detected person, allowing it to be easily used with any object detection model. ![vitpose](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/model_doc/vitpose-architecture.png) - Add VitPose by [@SangbumChoi](https://redirect.github.com/SangbumChoi) and [@NielsRogge](https://redirect.github.com/NielsRogge) in [#30530](https://redirect.github.com/huggingface/transformers/issues/30530) ##### DINOv2 with registers The DINOv2 with Registers model was proposed in [Vision Transformers Need Registers](https://arxiv.org/abs/2309.16588) by Timothée Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski. The [Vision Transformer](https://huggingface.co/docs/transformers/main/en/model_doc/vit) (ViT) is a transformer encoder model (BERT-like) originally introduced to do supervised image classification on ImageNet. Next, people figured out ways to make ViT work really well on self-supervised image feature extraction (i.e. learning meaningful features, also called embeddings) on images without requiring any labels. Some example papers here include [DINOv2](https://huggingface.co/docs/transformers/main/en/model_doc/dinov2) and [MAE](https://huggingface.co/docs/transformers/main/en/model_doc/vit_mae). The authors of DINOv2 noticed that ViTs have artifacts in attention maps. It’s due to the model using some image patches as “registers”. The authors propose a fix: just add some new tokens (called “register” tokens), which you only use during pre-training (and throw away afterwards). This results in: - no artifacts - interpretable attention maps - and improved performances. 
<!----> - Add DINOv2 with registers by [@NielsRogge](https://redirect.github.com/NielsRogge) in [#35348](https://redirect.github.com/huggingface/transformers/issues/35348) ##### Emu3 The Emu3 model was proposed in [Emu3: Next-Token Prediction is All You Need](https://arxiv.org/abs/2409.18869) by Xinlong Wang, Xiaosong Zhang, Zhengxiong Luo, Quan Sun, Yufeng Cui, Jinsheng Wang, Fan Zhang, Yueze Wang, Zhen Li, Qiying Yu, Yingli Zhao, Yulong Ao, Xuebin Min, Tao Li, Boya Wu, Bo Zhao, Bowen Zhang, Liangdong Wang, Guang Liu, Zheqi He, Xi Yang, Jingjing Liu, Yonghua Lin, Tiejun Huang, Zhongyuan Wang. Emu3 sets a new standard in multimodal AI by using next-token prediction to handle images, text, and videos. It simplifies multimodal modeling by tokenizing all data into a unified format and training a single transformer. Visual data is tokenized using vector quantization methods based on [VQ-VAE](https://arxiv.org/abs/1711.00937) model. Discretized visual tokens are later fused with text token ids for image and text generation. Emu3 outperforms leading models like SDXL and LLaVA-1.6 in both generation and perception tasks, without relying on diffusion or compositional methods.. - Add Emu3 by [@zucchini-nlp](https://redirect.github.com/zucchini-nlp) in [#33770](https://redirect.github.com/huggingface/transformers/issues/33770) ##### Cohere2 A new Cohere update was added through a new "Cohere2" set of classes. - Add Cohere2 model by [@alexrs-cohere](https://redirect.github.com/alexrs-cohere) in [#35224](https://redirect.github.com/huggingface/transformers/issues/35224) ##### TextNet [TextNet](https://arxiv.org/abs/2111.02394) is a lightweight and efficient architecture designed specifically for text detection, offering superior performance compared to traditional models like MobileNetV3. With variants TextNet-T, TextNet-S, and TextNet-B (6.8M, 8.0M, and 8.9M parameters respectively), it achieves an excellent balance between accuracy and inference speed. 
- Add TextNet by [@jadechoghari](https://redirect.github.com/jadechoghari) in [#34979](https://redirect.github.com/huggingface/transformers/issues/34979) ##### DiffLlama DiffLlama combines the Llama architecture with the attention mechanism of the [Differential Transformer](https://arxiv.org/abs/2410.05258). - Add DiffLllama by [@weak-kajuma](https://redirect.github.com/weak-kajuma) in [#34083](https://redirect.github.com/huggingface/transformers/issues/34083) ##### PixtralLarge The conversion script needed a few updates, while the modeling code was barely changed! - \[PixtralLarge] Update Pixtral conversion script to support large format! ([#34801](https://redirect.github.com/huggingface/transformers/issues/34801)) ##### Moonshine Moonshine is an autoregressive speech recognition encoder-decoder model that improves upon Whisper's architecture. Namely, it replaces absolute position embeddings with Rotary Position Embeddings (RoPE). This allows Moonshine to handle audio inputs of any length, unlike Whisper, which is restricted to fixed 30-second windows. It was introduced by Nat Jeffries, Evan King, Manjunath Kudlur, Guy Nicholson, James Wang, and Pete Warden in [Moonshine: Speech Recognition for Live Transcription and Voice Commands](https://arxiv.org/abs/2410.15608). - Add Moonshine by [@eustlb](https://redirect.github.com/eustlb) in [#34784](https://redirect.github.com/huggingface/transformers/issues/34784) #### Quantization methods ##### VPTQ Quantization From the VPTQ contributors: > VPTQ is a novel Post-Training Quantization method that leverages Vector Quantization to achieve high accuracy on LLMs at an extremely low bit-width (<2-bit). VPTQ can compress 70B, even the 405B model, to 1-2 bits without retraining and maintain high accuracy. 
More details here: https://github.com/microsoft/vptq - FEAT : Adding VPTQ quantization method to HFQuantizer by [@wejoncy](https://redirect.github.com/wejoncy) in [#34770](https://redirect.github.com/huggingface/transformers/issues/34770) ##### HIGGS Quantization From the contributors: > HIGGS is a new 0-shot quantization algorithm that combines Hadamard preprocessing with MSE-Optimal quantization grids to achieve lower quantization error and SOTA performance. You can find more information in the [paper](https://arxiv.org/abs/2411.17525). > > Runtime support for HIGGS is implemented through [FLUTE](https://arxiv.org/abs/2407.10960), and its [library](https://redirect.github.com/HanGuo97/flute?tab=readme-ov-file). > > This PR adds support for HIGGS+FLUTE into transformers allowing for low-error 0-shot quantization and fast LLM inference. - HIGGS Quantization Support by [@BlackSamorez](https://redirect.github.com/BlackSamorez) in [#34997](https://redirect.github.com/huggingface/transformers/issues/34997) #### Cleanup We merged a cleanup for vision language models, to make sure all models are standardized. - VLMs: major clean up 🧼 ([#34502](https://redirect.github.com/huggingface/transformers/issues/34502)) #### Breaking changes ##### Conversion scripts Many models in Transformers include scripts to convert the original model checkpoints into a Transformers-compatible format. These scripts can be found in the repo using the glob pattern `models/*/convert_*.py`. They were a recurring source of vulnerability reports and CVEs because many models were originally released using insecure formats like older PyTorch `.bin` weights or `pickle` files. The conversion scripts had to open these formats, and this meant that they were vulnerable to maliciously crafted inputs. In practice, we do not see this as a serious vulnerability. 
The conversion scripts are never imported or called by the rest of the library; each script is standalone, and so the only way to exploit the vulnerability is to create a malicious checkpoint, induce a user to download it, and then also induce them to manually call a specific conversion script on it. However, even if there is little practical risk of an exploit, we are aware that open vulnerability reports create a compliance problem for users, and so beginning with this release we will be excluding these conversion scripts from release branches and wheels. They will remain accessible to developers on the `main` branch. - 🚨🚨🚨 Delete conversion scripts when making release wheels by [@Rocketknight1](https://redirect.github.com/Rocketknight1) in [#35296](https://redirect.github.com/huggingface/transformers/issues/35296) ##### Backtracking in Nougat A regular expression used within the Nougat code has been modified to ensure it does not hang. The method should output the same results but we cannot guarantee it; we recommend upgrading to the latest transformers if you use this model to ensure your code is performance-optimized. - 🚨🚨🚨 Limit backtracking in Nougat regexp by [@qubvel](https://redirect.github.com/qubvel) in [#35264](https://redirect.github.com/huggingface/transformers/issues/35264) ##### Whisper decoding This PR finalizes work that aims to enable short-form (< 30 secs) and long-form generation using temperature fallback. It is a significant improvement to the whisper codebase, but it does result in the following breaking changes: ➡️ Previously:\ • Short-form: Returned a `ModelOutput` or `torch.LongTensor`, including decoder input IDs and the EOS token ID.\ • Long-form: Returned a `Dict` or `torch.LongTensor`, excluding decoder input IDs and the EOS token ID. ➡️ From now on:\ Short-form and long-form generation are now treated identically, meaning output differentiation based on these modes is no longer applicable. 
Decoder input IDs and EOS token IDs are never returned, except in two specific cases: when `return_dict_in_generate=True` and (`return_timestamps=False` or `force_unique_generate_call=True`). In this case, the output will be a `ModelOutput`, which is the result of the underlying call to GenerationMixin’s generate. Indeed, `return_timestamps=False` ensures no seeking occurs; only a single call to generate is made. Therefore, this output includes both decoder input IDs and the EOS token ID. - \[Whisper] 🚨 Fix whisper decoding 🚨 by [@eustlb](https://redirect.github.com/eustlb) in [#34135](https://redirect.github.com/huggingface/transformers/issues/34135) ##### Attention refactor In order to have a cleaner, isolated, future-proof code for the attention layers, they have been refactored so as to keep the model attention code within their files; but attention definitions relating to SDPA, Flash Attention, and other types of attention have been moved to a common file. - 🚨All attention refactor🚨 by [@ArthurZucker](https://redirect.github.com/ArthurZucker) in [#35235](https://redirect.github.com/huggingface/transformers/issues/35235) #### Bugfixes and improvements - \[tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer ([#35593](https://redirect.github.com/huggingface/transformers/issues/35593)) - Setup loss_type in config at model init time ([#34616](https://redirect.github.com/huggingface/transformers/issues/34616)) - \[docs] Update Python version in translations by [@jla524](https://redirect.github.com/jla524) in [#35096](https://redirect.github.com/huggingface/transformers/issues/35096) - \[docs] top_p, top_k, temperature docstrings by [@stevhliu](https://redirect.github.com/stevhliu) in [#35065](https://redirect.github.com/huggingface/transformers/issues/35065) - Fix private forked repo. 
CI by [@ydshieh](https://redirect.github.com/ydshieh) in [#35114](https://redirect.github.com/huggingface/transformers/issues/35114) - Add feature dim attributes to BitLinear for easier PEFT integration by [@agostinv](https://redirect.github.com/agostinv) in [#34946](https://redirect.github.com/huggingface/transformers/issues/34946) - Update I-JEPA checkpoints path by [@qubvel](https://redirect.github.com/qubvel) in [#35120](https://redirect.github.com/huggingface/transformers/issues/35120) - Fix GA loss bugs and add unit test by [@techkang](https://redirect.github.com/techkang) in [#35121](https://redirect.github.com/huggingface/transformers/issues/35121) - \[I-JEPA] Update docs by [@NielsRogge](https://redirect.github.com/NielsRogge) in [#35148](https://redirect.github.com/huggingface/transformers/issues/35148) - Corrected typo in agent system prompts by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35143](https://redirect.github.com/huggingface/transformers/issues/35143) - Option to set 'non_blocking' for to(device) in BatchEncoding and BatchFeature by [@daniel-bogdoll](https://redirect.github.com/daniel-bogdoll) in [#34883](https://redirect.github.com/huggingface/transformers/issues/34883) - Fix typo in EETQ Tests by [@MekkCyber](https://redirect.github.com/MekkCyber) in [#35160](https://redirect.github.com/huggingface/transformers/issues/35160) - Cleanup: continue the init refactor by [@LysandreJik](https://redirect.github.com/LysandreJik) in [#35167](https://redirect.github.com/huggingface/transformers/issues/35167) - Super tiny fix logging message by [@fzyzcjy](https://redirect.github.com/fzyzcjy) in [#35132](https://redirect.github.com/huggingface/transformers/issues/35132) - Fixed typo of 'avilable' in prompts.py by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35145](https://redirect.github.com/huggingface/transformers/issues/35145) - \[CI] Fix bnb quantization tests with accelerate>=1.2.0 by 
[@matthewdouglas](https://redirect.github.com/matthewdouglas) in [#35172](https://redirect.github.com/huggingface/transformers/issues/35172) - Fix `num_items_in_batch` not being an integer by [@xspirus](https://redirect.github.com/xspirus) in [#35115](https://redirect.github.com/huggingface/transformers/issues/35115) - Assisted decoding multi-gpu by [@zucchini-nlp](https://redirect.github.com/zucchini-nlp) in [#35116](https://redirect.github.com/huggingface/transformers/issues/35116) - Fix file path for shard_num 1 with mllama converter by [@strangiato](https://redirect.github.com/strangiato) in [#35053](https://redirect.github.com/huggingface/transformers/issues/35053) - Support BatchNorm in Hubert pos_conv_emb as in fairseq by [@gallilmaimon](https://redirect.github.com/gallilmaimon) in [#34389](https://redirect.github.com/huggingface/transformers/issues/34389) - Remove unnecessary masked_fill in deberta models by [@xadupre](https://redirect.github.com/xadupre) in [#35182](https://redirect.github.com/huggingface/transformers/issues/35182) - Fix DBRX LayerNorm init method by [@hgt312](https://redirect.github.com/hgt312) in [#35177](https://redirect.github.com/huggingface/transformers/issues/35177) - Fixing GGUF support for StableLm by [@MekkCyber](https://redirect.github.com/MekkCyber) in [#35060](https://redirect.github.com/huggingface/transformers/issues/35060) - \[i18n-ar] Translated file : `docs/source/ar/community.md` into Arabic by [@AhmedAlmaghz](https://redirect.github.com/AhmedAlmaghz) in [#33027](https://redirect.github.com/huggingface/transformers/issues/33027) - Multiple typo fixes in NLP, Audio docs by [@henryhmko](https://redirect.github.com/henryhmko) in [#35181](https://redirect.github.com/huggingface/transformers/issues/35181) - Only import torch.distributed if it is available by [@GaetanLepage](https://redirect.github.com/GaetanLepage) in [#35133](https://redirect.github.com/huggingface/transformers/issues/35133) - \[i18n-<languageCode>] 
Translating Benchmarks.md to Chinese by [@asdkfjsd](https://redirect.github.com/asdkfjsd) in [#35137](https://redirect.github.com/huggingface/transformers/issues/35137) - \[docs] Fix FlashAttention link by [@stevhliu](https://redirect.github.com/stevhliu) in [#35171](https://redirect.github.com/huggingface/transformers/issues/35171) - Update data collator docstrings to accurately reference Nvidia tensor core compute capability version by [@johngrahamreynolds](https://redirect.github.com/johngrahamreynolds) in [#35188](https://redirect.github.com/huggingface/transformers/issues/35188) - \[i18n-<languageCode>] Translating agents.md to Chinese by [@HMJ0628](https://redirect.github.com/HMJ0628) in [#35139](https://redirect.github.com/huggingface/transformers/issues/35139) - BLIP: enable device map by [@zucchini-nlp](https://redirect.github.com/zucchini-nlp) in [#34850](https://redirect.github.com/huggingface/transformers/issues/34850) - 🧹 Remove deprecated RotaryEmbedding parts in the Attention layers by [@Cyrilvallez](https://redirect.github.com/Cyrilvallez) in [#34858](https://redirect.github.com/huggingface/transformers/issues/34858) - \[PEFT] Better Trainer error when prompt learning with loading best model at the end by [@BenjaminBossan](https://redirect.github.com/BenjaminBossan) in [#35087](https://redirect.github.com/huggingface/transformers/issues/35087) - Cleanup: continue the init refactor by [@LysandreJik](https://redirect.github.com/LysandreJik) in [#35170](https://redirect.github.com/huggingface/transformers/issues/35170) - Fix CI by [@Cyrilvallez](https://redirect.github.com/Cyrilvallez) in [#35208](https://redirect.github.com/huggingface/transformers/issues/35208) - Fix seamless TTS generate by [@ylacombe](https://redirect.github.com/ylacombe) in [#34968](https://redirect.github.com/huggingface/transformers/issues/34968) - docs: clarify initializer_range parameter description in Idefics3VisionConfig by 
[@h3110Fr13nd](https://redirect.github.com/h3110Fr13nd) in [#35215](https://redirect.github.com/huggingface/transformers/issues/35215) - Fixed typo of 'indentifier' in audio_utils.py by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35226](https://redirect.github.com/huggingface/transformers/issues/35226) - Fix type hints for apply_chat_template by [@Rocketknight1](https://redirect.github.com/Rocketknight1) in [#35216](https://redirect.github.com/huggingface/transformers/issues/35216) - Support Python 3.10+ Union style in chat template type hints parsing by [@RezaRahemtola](https://redirect.github.com/RezaRahemtola) in [#35103](https://redirect.github.com/huggingface/transformers/issues/35103) - Refactoring `AssistedCandidateGenerator` for Improved Modularity and Reusability by [@keyboardAnt](https://redirect.github.com/keyboardAnt) and [@jmamou](https://redirect.github.com/jmamou) in [#35009](https://redirect.github.com/huggingface/transformers/issues/35009) - Change back to `Thread` for SF conversion by [@ydshieh](https://redirect.github.com/ydshieh) in [#35236](https://redirect.github.com/huggingface/transformers/issues/35236) - \[Init refactor] Modular changes by [@LysandreJik](https://redirect.github.com/LysandreJik) in [#35240](https://redirect.github.com/huggingface/transformers/issues/35240) - Fix typo in chat template example by [@EricWinsorDSIT](https://redirect.github.com/EricWinsorDSIT) in [#35250](https://redirect.github.com/huggingface/transformers/issues/35250) - Run model as compressed/uncompressed mode by [@horheynm](https://redirect.github.com/horheynm) in [#34719](https://redirect.github.com/huggingface/transformers/issues/34719) - skip Fuyu from test_generate by [@nhamanasu](https://redirect.github.com/nhamanasu) in [#35246](https://redirect.github.com/huggingface/transformers/issues/35246) - \[tests] fix "Tester object has no attribute '\_testMethodName'" by [@faaany](https://redirect.github.com/faaany) in 
[#34910](https://redirect.github.com/huggingface/transformers/issues/34910) - Use `rsfE` with `pytest` by [@ydshieh](https://redirect.github.com/ydshieh) in [#35119](https://redirect.github.com/huggingface/transformers/issues/35119) - Update AMD docker image (rocm 6.1) by [@ivarflakstad](https://redirect.github.com/ivarflakstad) in [#35259](https://redirect.github.com/huggingface/transformers/issues/35259) - Fixed typos in Audio Classification Documentation by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35263](https://redirect.github.com/huggingface/transformers/issues/35263) - Translating agents_advanced.md to Chinese by [@HMJ0628](https://redirect.github.com/HMJ0628) in [#35231](https://redirect.github.com/huggingface/transformers/issues/35231) - Fix FSDP no longer working by [@muellerzr](https://redirect.github.com/muellerzr) in [#35212](https://redirect.github.com/huggingface/transformers/issues/35212) - don't use no_sync when deepspeed doesn't support it for certain zero stages by [@winglian](https://redirect.github.com/winglian) in [#35157](https://redirect.github.com/huggingface/transformers/issues/35157) - \[i18n-Chinese] Translating perf_train_cpu.md to Chinese by [@asdkfjsd](https://redirect.github.com/asdkfjsd) in [#35242](https://redirect.github.com/huggingface/transformers/issues/35242) - Fall back to slow image processor in ImageProcessingAuto when no fast processor available by [@yonigozlan](https://redirect.github.com/yonigozlan) in [#34785](https://redirect.github.com/huggingface/transformers/issues/34785) - Aggeregate test summary files in CircleCI workflow runs by [@ydshieh](https://redirect.github.com/ydshieh) in [#34989](https://redirect.github.com/huggingface/transformers/issues/34989) - Blip: fix offloading and MP tests by [@zucchini-nlp](https://redirect.github.com/zucchini-nlp) in [#35239](https://redirect.github.com/huggingface/transformers/issues/35239) - Fix : model used to test ggml conversion of Falcon-7b is incorrect by 
[@MekkCyber](https://redirect.github.com/MekkCyber) in [#35083](https://redirect.github.com/huggingface/transformers/issues/35083) - Temporarily disable amd push ci by [@ivarflakstad](https://redirect.github.com/ivarflakstad) in [#35293](https://redirect.github.com/huggingface/transformers/issues/35293) - Delete redundancy for loop checks. by [@zhanluxianshen](https://redirect.github.com/zhanluxianshen) in [#35288](https://redirect.github.com/huggingface/transformers/issues/35288) - \[Whisper] patch float type on mps by [@eustlb](https://redirect.github.com/eustlb) in [#35295](https://redirect.github.com/huggingface/transformers/issues/35295) - Fix typos in Translated Audio Classification Docs by [@jla524](https://redirect.github.com/jla524) in [#35287](https://redirect.github.com/huggingface/transformers/issues/35287) - Translating "translate perf_infer_gpu_multi.md" to Chinese by [@HMJ0628](https://redirect.github.com/HMJ0628) in [#35271](https://redirect.github.com/huggingface/transformers/issues/35271) - Fix wrongs in quicktour\[zh] by [@zhanluxianshen](https://redirect.github.com/zhanluxianshen) in [#35272](https://redirect.github.com/huggingface/transformers/issues/35272) - Improved documentation of Automatic speech recognition by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35268](https://redirect.github.com/huggingface/transformers/issues/35268) - fix modular order by [@ArthurZucker](https://redirect.github.com/ArthurZucker) in [#35297](https://redirect.github.com/huggingface/transformers/issues/35297) - Add sdpa for Beit by [@OmarManzoor](https://redirect.github.com/OmarManzoor) in [#34941](https://redirect.github.com/huggingface/transformers/issues/34941) - Support for SDPA for SAM models by [@MagnusS0](https://redirect.github.com/MagnusS0) in [#34110](https://redirect.github.com/huggingface/transformers/issues/34110) - remove `benchmark` job in `push-important-models.yml` by [@ydshieh](https://redirect.github.com/ydshieh) in 
[#35292](https://redirect.github.com/huggingface/transformers/issues/35292) - Fix typos in translated quicktour docs by [@jla524](https://redirect.github.com/jla524) in [#35302](https://redirect.github.com/huggingface/transformers/issues/35302) - Fix image preview in multi-GPU inference docs by [@jla524](https://redirect.github.com/jla524) in [#35303](https://redirect.github.com/huggingface/transformers/issues/35303) - Fix remove unused parameter in docs by [@zzzzzsa](https://redirect.github.com/zzzzzsa) in [#35306](https://redirect.github.com/huggingface/transformers/issues/35306) - Add Cohere2 docs details by [@alexrs-cohere](https://redirect.github.com/alexrs-cohere) in [#35294](https://redirect.github.com/huggingface/transformers/issues/35294) - Fixed typo in audio_classification.md by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35305](https://redirect.github.com/huggingface/transformers/issues/35305) - \[docs] Improve register_pipeline by [@stevhliu](https://redirect.github.com/stevhliu) in [#35300](https://redirect.github.com/huggingface/transformers/issues/35300) - Fix loading with only state dict and low_cpu_mem_usage = True by [@SunMarc](https://redirect.github.com/SunMarc) in [#35217](https://redirect.github.com/huggingface/transformers/issues/35217) - \[tests] make cuda-only tests device-agnostic by [@faaany](https://redirect.github.com/faaany) in [#35222](https://redirect.github.com/huggingface/transformers/issues/35222) - Trigger GitHub CI with a comment on PR by [@ydshieh](https://redirect.github.com/ydshieh) in [#35211](https://redirect.github.com/huggingface/transformers/issues/35211) - change bnb tests by [@jiqing-feng](https://redirect.github.com/jiqing-feng) in [#34713](https://redirect.github.com/huggingface/transformers/issues/34713) - \[Whisper] fix docstrings typo by [@eustlb](https://redirect.github.com/eustlb) in [#35319](https://redirect.github.com/huggingface/transformers/issues/35319) - feat: add `benchmarks_entrypoint.py` by 
[@McPatate](https://redirect.github.com/McPatate) in [#34495](https://redirect.github.com/huggingface/transformers/issues/34495) - Fix documentation for ColPali by [@tonywu71](https://redirect.github.com/tonywu71) in [#35321](https://redirect.github.com/huggingface/transformers/issues/35321) - Update comment CI bot by [@ydshieh](https://redirect.github.com/ydshieh) in [#35323](https://redirect.github.com/huggingface/transformers/issues/35323) - PaliGemma: Make sure to add <eos> to suffix if <image> is present in `text` by [@probicheaux](https://redirect.github.com/probicheaux) in [#35201](https://redirect.github.com/huggingface/transformers/issues/35201) - Fix some fa2 tests by [@ArthurZucker](https://redirect.github.com/ArthurZucker) in [#35340](https://redirect.github.com/huggingface/transformers/issues/35340) - Modernbert Release Fixes by [@warner-benjamin](https://redirect.github.com/warner-benjamin) in [#35344](https://redirect.github.com/huggingface/transformers/issues/35344) - \[`docs`] Add link to ModernBERT Text Classification GLUE finetuning script by [@tomaarsen](https://redirect.github.com/tomaarsen) in [#35347](https://redirect.github.com/huggingface/transformers/issues/35347) - fix onnx export of speech foundation models by [@nikosanto13](https://redirect.github.com/nikosanto13) in [#34224](https://redirect.github.com/huggingface/transformers/issues/34224) - \[`Mamba2`] Fix caching, slow path, and multi-gpu by [@vasqu](https://redirect.github.com/vasqu) in [#35154](https://redirect.github.com/huggingface/transformers/issues/35154) - Reduce CircleCI usage by [@ydshieh](https://redirect.github.com/ydshieh) in [#35355](https://redirect.github.com/huggingface/transformers/issues/35355) - Implement AsyncTextIteratorStreamer for asynchronous streaming by [@CISC](https://redirect.github.com/CISC) in [#34931](https://redirect.github.com/huggingface/transformers/issues/34931) - Cleaner attention interfaces by 
[@Cyrilvallez](https://redirect.github.com/Cyrilvallez) in [#35342](https://redirect.github.com/huggingface/transformers/issues/35342) - Add Tensor Parallel support for Qwen2VL by [@jla524](https://redirect.github.com/jla524) in [#35050](https://redirect.github.com/huggingface/transformers/issues/35050) - fix zoedepth initialization error under deepspeed zero3 by [@Tavish9](https://redirect.github.com/Tavish9) in [#35011](https://redirect.github.com/huggingface/transformers/issues/35011) - Aurevoir PyTorch 1 by [@ydshieh](https://redirect.github.com/ydshieh) in [#35358](https://redirect.github.com/huggingface/transformers/issues/35358) - bugfix: torch.export failure caused by `_make_causal_mask` by [@jiwoong-choi](https://redirect.github.com/jiwoong-choi) in [#35291](https://redirect.github.com/huggingface/transformers/issues/35291) - update codecarbon by [@nhamanasu](https://redirect.github.com/nhamanasu) in [#35243](https://redirect.github.com/huggingface/transformers/issues/35243) - Update test fetcher when we want to test all by [@ArthurZucker](https://redirect.github.com/ArthurZucker) in [#35364](https://redirect.github.com/huggingface/transformers/issues/35364) - Use `weights_only=True` with `torch.load` for `transfo_xl` by [@ydshieh](https://redirect.github.com/ydshieh) in [#35241](https://redirect.github.com/huggingface/transformers/issues/35241) - Make `test_generate_with_static_cache` even less flaky by [@ydshieh](https://redirect.github.com/ydshieh) in [#34995](https://redirect.github.com/huggingface/transformers/issues/34995) - Improve modular transformers documentation by [@joelpaulkoch](https://redirect.github.com/joelpaulkoch) in [#35322](https://redirect.github.com/huggingface/transformers/issues/35322) - Improved Documentation Of Audio Classification by [@Uvi-12](https://redirect.github.com/Uvi-12) in [#35368](https://redirect.github.com/huggingface/transformers/issues/35368) - \[docs] Follow up register_pipeline by 
[@stevhliu](https://redirect.github.com/stevhliu) in [#35310](https://redirect.github.com/huggingface/transformers/issues/35310) - owlvit/2 dynamic input resolution by [@bastrob](https://redirect.github.com/bastrob) in [#34764](https://redirect.github.com/huggingface/transformers/issues/34764) - Fix new FA2 if `is_causal` is passed explicitly by [@Cyrilvallez](https://redirect.github.com/Cyrilvallez) in [#35390](https://redirect.github.com/huggingface/transformers/issues/35390) - bitsandbytes: simplify 8bit dequantization by [@matthewdouglas](https://redirect.github.com/matthewdouglas) in [#35068](https://redirect.github.com/huggingface/transformers/issues/35068) - make LlamaModel.\_update_causal_mask torch compilable by [@winglian](https://redirect.github.com/winglian) in [#35187](https://redirect.github.com/huggingface/transformers/issues/35187) - Patch GPTNeoX to use adequate FA2 if position_ids is provided by [@taha-yassine](https://redirect.github.com/taha-yassine) in [#35318](https://redirect.github.com/huggingface/transformers/issues/35318) - uniformize kwargs for SAM by [@tibor-reiss](https://redirect.github.com/tibor-reiss) in [#34578](https://redirect.github.com/huggingface/transformers/issues/34578) - Deprecate \_is_quantized_training_enabled by [@MekkCyber](https://redirect.github.com/MekkCyber) in [#34991](https://redirect.github.com/huggingface/transformers/issues/34991) - Scale loss before backward by [@qgallouedec](https://redirect.github.com/qgallouedec) in [#35207](https://redirect.github.com/huggingface/transformers/issues/35207) - Fix typing in docstring for `PaliGemmaProcessor` by [@alvarobartt](https://redirect.github.com/alvarobartt) in [#35278](https://redirect.github.com/huggingface/transformers/issues/35278) - Fix : VPTQ test by [@MekkCyber](https://redirect.github.com/MekkCyber) in [#35394](https://redirect.github.com/huggingface/transformers/issues/35394) - add bnb support for Ascend NPU by 
[@statelesshz](https://redirect.github.com/statelesshz) in [#31512](https://redirect.github.com/huggingface/transformers/issues/31512) - bugfix Idefics3 processor - handle gracefully cases with text and no images by [@mfarre](https://redirect.github.com/mfarre) in [#35363](https://redirect.github.com/huggingface/transformers/issues/35363) - Adding logger.info about update_torch_dtype in some quantizers by [@MekkCyber](https://redirect.github.com/MekkCyber) in [#35046](https://redirect.github.com/huggingface/transformers/issues/35046) - Add compile test for fast image processor by [@yonigozlan](https://redirect.github.com/yonigozlan) in [#35184](https://redirect.github.com/huggingface/transformers/issues/35184) - Disable `.github/workflows/self-comment-ci.yml` for now by [@ydshieh](https://redirect.github.com/ydshieh) in [#35366](https://redirect.github.com/huggingface/transformers/issues/35366) - enable non-cuda awq model support without modify version by [@jiqing-feng](https://redirect.github.com/jiqing-feng) in [#35334](https://redirect.github.com/huggingface/transformers/issues/35334) - \[`GPTQ`, `CompressedTensors`] Fix unsafe imports and metada check by [@vasqu](https://redirect.github.com/vasqu) in [#34815](https://redirect.github.com/huggingface/transformers/issues/34815) - Drop inplace operation for loss computation with gradient accumulation by [@qgallouedec](https://redirect.github.com/qgallouedec) in [#35416](https://redirect.github.com/huggingface/transformers/issues/35416) - Fix: Rename keyword argument in_channels to num_channels by [@ningyuv](https://redirect.github.com/ningyuv) in [#35289](https://redirect.github.com/huggingface/transformers/issues/35289) - CLIP conversion script - Change fairseq to OpenAI by [@gau-nernst](https://redirect.github.com/gau-nernst) in [#35384](https://redirect.github.com/huggingface/transformers/issues/35384) - Fix f-string to show `ACCELERATE_MIN_VERSION` on error by [@KSafran](https://redirect.github.com/KSafran) in 
[#35189](https://redirect.github.com/huggingface/transformers/issues/35189) - Fix `model_accepts_loss_kwargs` for timm model by [@qubvel](https://redirect.github.com/qubvel) in [#35257](https://redirect.github.com/huggingface/transformers/issues/35257) - Update perf_infer_gpu_one.md: fix a typo by [@martin0258](https://redirect.github.com/martin0258) in [#35441](https://redirect.github.com/huggingface/transformers/issues/35441) - Add compute_loss_func to Seq2SeqTrainer by [@d223302](https://redirect.github.com/d223302) in [#35136](https://redirect.github.com/huggingface/transformers/issues/35136) - Update docs for `sdpa_kernel` by [@jla524](https://redirect.github.com/jla524) in [#35410](https://redirect.github.com/huggingface/transformers/issues/35410) - \[i18n-ar] Translated file: `docs/source/ar/tasks/question_answering.md` into Arabic by [@AhmedAlmaghz](https://redirect.github.com/AhmedAlmaghz) in [#35196](https://redirect.github.com/huggingface/transformers/issues/35196) - \[i18n-ar] Translated file: `docs/source/ar/tasks/summarization.md` into Arabic by [@AhmedAlmaghz](https://redirect.github.com/AhmedAlmaghz) in [#35195](https://redirect.github.com/huggingface/transformers/issues/35195) - Update translated docs for `sdpa_kernel` by [@jla524](https://redirect.github.com/jla524) in [#35461](https://redirect.github.com/huggingface/transformers/issues/35461) - Reintroduce Python 3.9 support for ModernBERT by [@tomaarsen](https://redirect.github.com/tomaarsen) in [#35458](https://redirect.github.com/huggingface/transformers/issues/35458) - Fix new BNB test failures by [@matthewdouglas](https://redirect.github.com/matthewdouglas) in [#35345](https://redirect.github.com/huggingface/transformers/issues/35345) - Fix docs typos. 
by [@zhanluxianshen](https://redirect.github.com/zhanluxianshen) in [#35465](https://redirect.github.com/huggingface/transformers/issues/35465) - Fix paligemma warning message by [@hiyouga](https://redirect.github.com/hiyouga) in [#35486](https://redirect.github.com/huggingface/transformers/issues/35486) #### Significant community contributions The following contributors have made significant changes to the library over the last release: - [@ydshieh](https://redirect.github.com/ydshieh) - Fix private forked repo. CI ([#35114](https://redirect.github.com/huggingface/transformers/issues/35114)) - Change back to `Thread` for SF conversion ([#35236](https://redirect.github.com/huggingface/transformers/issues/35236)) - Use `rsfE` with `pytest` ([#35119](https://redirect.github.com/huggingface/transformers/issues/35119)) - Aggeregate test summary files in CircleCI workflow runs ([#34989](https://redirect.github.com/huggingface/transformers/issues/34989)) - remove `benchmark` job in `push-important-models.yml` ([#35292](https://redirect.github.com/huggingface/transformers/issues/35292)) - Trigger GitHub CI with a comment on PR ([#35211](https://redirect.github.com/huggingface/transformers/issues/35211)) - Update comment CI bot ([#35323](https://redirect.github.com/huggingface/transformers/issues/35323)) - Reduce CircleCI usage ([#35355](https://redirect.github.com/huggingface/transformers/issues/35355)) - Aurevoir PyTorch 1 ([#35358](https://redirect.github.com/huggingface/transformers/issues/35358)) - Use `weights_only=True` with `torch.load` for `transfo_xl` ([#35241](https://redirect.github.com/huggingface/transformers/issues/35241)) - Make `test_generate_with_static_cache` even less flaky ([#34995](https://redirect.github.com/huggingface/transformers/issues/34995)) - Disable `.github/workflows/self-comment-ci.yml` for now ([#35366](https://redirect.github.com/huggingface/transformers/issues/35366)) - [@aymeric-roucher](https://redirect.github.com/aymeric-roucher) - 
Add Aria ([#34157](https://redirect.github.com/huggingface/transformers/issues/34157)) - [@NielsRogge](https://redirect.github.com/NielsRogge) - \[I-JEPA] Update docs ([#35148](https://redirect.github.com/huggingface/transformers/issues/35148)) - Add DINOv2 with registers ([#35348](https://redirect.github.com/huggingface/transformers/issues/35348)) - [@HMJ0628](https://redirect.github.com/HMJ0628) - \[i18n-<languageCode>] Translating agents.md to Chinese ([#35139](https://redirect.github.com/huggingface/transformers/issues/35139)) - Translating agents_advanced.md to Chinese ([#35231](https://redirect.github.com/huggingface/transformers/issues/35231)) - Translating "translate perf_infer_gpu_multi.md" to Chinese ([#35271](https://redirect.github.com/huggingface/transformers/issues/35271)) - [@alexrs-cohere](https://redirect.github.com/alexrs-cohere) - Add Cohere2 model ([#35224](https://redirect.github.com/huggingface/transformers/issues/35224)) - Add Cohere2 docs details ([#35294](https://redirect.github.com/huggingface/transformers/issues/35294)) - [@ArthurZucker](https://redirect.github.com/ArthurZucker) - fix modular order ([#35297](https://redirect.github.com/huggingface/transformers/issues/35297)) - 🚨All attention refactor🚨 ([#35235](https://redirect.github.com/huggingface/transformers/issues/35235)) - Fix some fa2 tests ([#35340](https://redirect.github.com/huggingface/transformers/issues/35340)) - Update test fetcher when we want to test all ([#35364](https://redirect.github.com/huggingface/transformers/issues/35364)) - [@tonywu71](https://redirect.github.com/tonywu71) - Add ColPali to 🤗 transformers ([#33736](https://redirect.github.com/huggingface/transformers/issues/33736)) - Fix documentation for ColPali ([#35321](https://redirect.github.com/huggingface/transformers/issues/35321)) - [@OmarManzoor](https://redirect.github.com/OmarManzoor) - Add sdpa for Beit ([#34941](https://redirect.github.com/huggingface/transformers/issues/34941)) - 
[@fabianlim](https://redirect.github.com/fabianlim) - Add the Bamba Model ([#34982](https://redirect.github.com/huggingface/transformers/issues/34982)) - [@warner-benjamin](https://redirect.github.com/warner-benjamin) - Add ModernBERT to Transformers ([#35158](https://redirect.github.com/huggingface/transformers/issues/35158)) - Modernbert Release Fixes ([#35344](https://redirect.github.com/huggingface/transformers/issues/35344)) - [@wejoncy](https://redirect.github.com/wejoncy) - FEAT : Adding VPTQ quantization method to HFQuantizer ([#34770](https://redirect.github.com/huggingface/transformers/issues/34770)) - [@bastrob](https://redirect.github.com/bastrob) - owlvit/2 dynamic input resolution ([#34764](https://redirect.github.com/huggingface/transformers/issues/34764)) - [@BlackSamorez](https://redirect.github.com/BlackSamorez) - HIGGS Quantization Support ([#34997](https://redirect.github.com/huggingface/transformers/issues/34997)) ### [`v4.47.1`](https://redirect.github.com/huggingface/transformers/releases/tag/v4.47.1) [Compare Source](https://redirect.github.com/huggingface/transformers/compare/v4.47.0...v4.47.1) ### Patch release v4.47.1 We waited a little bit to make sure it was stable, thanks [@winglian](https://redirect.github.com/winglian) for double checking and everyone for the fixes! - Fix GA loss bugs and add unit test ([#35121](https://redirect.github.com/your-repo/pull/35121)) Contributed by [@techkang](https://redirect.github.com/techkang) and [@ArthurZucker](https://redirect.github.com/ArthurZucker). - Fix num_items_in_batch not being an integer ([#35115](https://redirect.github.com/your-repo/pull/35115)) Contributed by [@xspirus](https://redirect.github.com/xspirus). - Fix FSDP no longer working ([#35212](https://redirect.github.com/your-repo/pull/35212)) Contributed by [@muellerzr](https://redirect.github.com/muellerzr). 
- Don't use no_sync when DeepSpeed doesn't support it for certain ZeRO configurations ([#35157](https://redirect.github.com/your-repo/pull/35157)) Contributed by [@winglian](https://redirect.github.com/winglian). - Only import torch.distributed if it is available ([#35133](https://redirect.github.com/your-repo/pull/35133)) Contributed by [@GaetanLepage](https://redirect.github.com/GaetanLepage). - \[Whisper] Patch float type on MPS ([#35295](https://redirect.github.com/your-repo/pull/35295)) Contributed by [@eustlb](https://redirect.github.com/eustlb). 🔜 we should probably have MPS CIs to avoid repeating this! ### [`v4.47.0`](https://redirect.github.com/huggingface/transformers/releases/tag/v4.47.0): v4.47.0: PaliGemma-2, I-JEPA, OLMo-2, LayerSkip, Tensor Parallel [Compare Source](https://redirect.github.com/huggingface/transformers/compare/v4.46.3...v4.47.0) #### New models ##### PaliGemma-2 PaliGemma 2 and PaliGemma are lightweight open vision-language models (VLM) inspired by [PaLI-3](https://arxiv.org/abs/2310.09199), and based on open components like the [SigLIP vision model](https://arxiv.org/abs/2303.15343) and the [Gemma language model](https://arxiv.org/abs/2403.08295). PaliGemma takes both images and text as inputs and can answer questions about images with detail and context, meaning that PaliGemma can perform deeper analysis of images and provide useful insights, such as captioning for images and short videos, object detection, and reading text embedded within images. PaliGemma 2 is available in 3B, 10B, and 28B parameter sizes, which are based on Gemma 2 2B, 9B, and 27B models, respectively. The original PaliGemma models are available in the 3B size. For more information on Gemma model variants, see the [Gemma models list](https://ai.google.dev/gemma/docs/get_started#models-list). PaliGemma model variants support different pixel resolutions for image inputs, including 224 x 224, 448 x 448, and 896 x 896 pixels. 
<img width="743" alt="image" src="https://github.com/user-attachments/assets/55cda8a6-b463-4a58-b7d3-f7d50ee2fa11"> ##### I-JEPA The I-JEPA model was proposed in [Image-based Joint-Embedding Predictive Architecture](https://arxiv.org/pdf/2301.08243.pdf) by Mahmoud Assran, Quentin Duval, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Yann LeCun, Nicolas Ballas. I-JEPA is a self-supervised learning method that predicts the representations of one part of an image based on other parts of the same image. This approach focuses on learning semantic features without relying on pre-defined invariances from hand-crafted data transformations, which can bias specific tasks, or on filling in pixel-level details, which often leads to less meaningful representations. <img width="413" alt="image" src="https://github.com/user-attachments/assets/561ca9d7-0327-477a-96b8-61d2af0caf34"> - Add I-JEPA by [@jmtzt](https://redirect.github.com/jmtzt) in [#33125](https://redirect.github.com/huggingface/transformers/issues/33125) ##### OLMo 2 <img width="833" alt="image" src="https://github.com/user-attachments/assets/1abdde92-0aae-404a-b83e-77ec8bd13b7f"> The OLMo2 model is the successor of the OLMo model, which was proposed in [OLMo: Accelerating the Science of Language Models](https://arxiv.org/abs/2402.00838). The architectural chan </details> --- ### Configuration 📅 Schedule: Branch creation - "" (UTC), Automerge - At any time (no schedule defined). 🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied. ♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 Ignore: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/dora-rs/dora). 
	11 months ago
Haixuan Xavier Tao	27f6ab52d9	Fix typo in reachy node (#789 )	11 months ago
haixuantao	5102a9e38b	Make the pip release CI only run for dedicated work	11 months ago
haixuantao	d76e38df6f	add readme warnings	11 months ago
haixuantao	4de5c687fd	Fix multiple daemon example	11 months ago
haixuantao	71e365de83	Fix typo in reachy node	11 months ago
haixuantao	a2910b907d	Rename timeout test	11 months ago
haixuantao	6f598ebb44	Adding a test for checking the latency when timeout and queue are used at the same time	11 months ago
haixuantao	4b9111daee	Fix python parsing of float and float list	11 months ago
haixuantao	f919f8fe99	Adding float for env variable and metadata parameters	11 months ago
haixuantao	58d4287d37	Fix bounding box for rerun viewer	11 months ago
haixuantao	961a1dcb79	Add `uv` within start cli command	11 months ago
Philipp Oppermann	7fc8466c0c	Use `paths-ignore` instead of negated `paths` The GitHub workflow docs specify: 'If you define a path with the ! character, you must also define at least one path without the ! character. If you only want to exclude paths, use paths-ignore instead.'	11 months ago
Philipp Oppermann	62eebe56b4	Set default log level to INFO for coordinator and daemon but keep zenoh log level at WARN level	11 months ago
Philipp Oppermann	faf6e3ef7c	Error on daemon registration if machine ID is not unique	11 months ago
Haixuan Xavier Tao	9e3e76e9e4	Adding reachy and dora reachy demo (#784 )

# Reachy Demo

This example uses Reachy together with Qwenvl2.5 in order to grasp objects.

## To run

- [ ] Place Reachy near a table, looking to the right. (He is going to turn left.)
- [ ] Make sure that the height between the table and the torso camera is around 0.32m. You can modify this constraint within the state machine code: `z = np.clip(z, -0.32, -0.22)`
- [ ] Clone and go to this folder:

```bash
git clone https://github.com/dora-rs/dora.git --branch add-reachy-clean-demo
cd dora/examples/reachy2
```

- [ ] Make sure to set ROBOT_IP within `demo-dev.yml` to your robot IP address.
- [ ] Run the demo with:

```bash
# Make sure to not be in a virtual environment
uv venv --seed -p 3.11
dora build demo-dev.yml --uv
dora run demo-dev.yml --uv
# First run might take a little bit more time as the model needs to be downloaded
```

- [ ] When you see:

```
2025-02-18T17:18:57.937123Z INFO run_inner: dora_daemon: all nodes are ready, starting dataflow `01951a11-cfb9-7a70-9198-e12799f812a2` self.machine_id=
```

- [ ] Start speaking and say something along the lines of: Grab the orange

> The keywords are ACTIVATION_WORDS: grab pick give output take catch grabs picks gives output takes catches have

- [ ] The robot should start moving

## To modify the sequence of action

Everything that is partially scripted lives in `state_machine.py` in the `example/reachy` folder. Feel free to modify it as you want.

## Requirements

- This runs on both:
  - MacOS with ~ 20G of RAM
  - Linux with ~ 10G of Nvidia VRAM
- I have not tried Windows.

Demo at: https://app.rerun.io/version/0.22.1/index.html?url=https://huggingface.co/datasets/haixuantao/rerun_dataset/resolve/main/final_data.rrd

TODO:

- [ ] Publish a remote dataflow to run this example without having to clone this repo.	11 months ago
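The activation keywords listed in the Reachy demo commit above can be illustrated with a small sketch. The commit does not show how the demo matches keywords, so the function name `contains_activation_word` and the whitespace-and-punctuation matching below are assumptions made for illustration. Only the word list itself comes from the commit message.

```python
# Keyword list copied from the commit message; "output" appears twice there.
ACTIVATION_WORDS = (
    "grab pick give output take catch "
    "grabs picks gives output takes catches have"
).split()


def contains_activation_word(transcript: str) -> bool:
    """Hypothetical helper: True if any activation word occurs in the transcript.

    Assumption: words are matched case-insensitively after splitting on
    whitespace and stripping common punctuation. The real demo may differ.
    """
    spoken = {word.strip(".,!?").lower() for word in transcript.split()}
    return any(keyword in spoken for keyword in ACTIVATION_WORDS)


if __name__ == "__main__":
    print(contains_activation_word("Grab the orange"))
    print(contains_activation_word("Hello there"))
```

Matching on whole words rather than substrings avoids triggering on words that merely contain a keyword, such as "pickle".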
Haixuan Xavier Tao	0ed6e32c45	Forbid `/` in node IDs (#785 ) We use `/` as the separator for input mappings in dataflow YAML files (e.g. input: `node/output_id`). We do support `/` characters in output IDs for runtime nodes. If we allowed `/` characters in node IDs too, splitting an `a/b/c` mapping would become ambiguous (it could be node `a` with output `b/c` or node `a/b` with output `c`). Therefore we explicitly error on node IDs containing `/` characters in this commit.	11 months ago
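The ambiguity described in the commit above can be shown in a minimal sketch. This is not dora's actual implementation (dora is written in Rust). The names `validate_node_id` and `split_input_mapping` are hypothetical, and splitting at the first `/` is an assumption that follows from the rule that only output IDs may contain `/`.

```python
def validate_node_id(node_id: str) -> str:
    """Hypothetical check: reject node IDs that contain '/'."""
    if "/" in node_id:
        raise ValueError(f"node ID {node_id!r} must not contain '/'")
    return node_id


def split_input_mapping(mapping: str) -> tuple[str, str]:
    """Split a 'node/output_id' mapping at the first '/'.

    Because node IDs cannot contain '/', everything after the first
    separator belongs to the output ID, which may itself contain '/'.
    """
    node_id, separator, output_id = mapping.partition("/")
    if not separator or not node_id or not output_id:
        raise ValueError(f"expected 'node/output_id', got {mapping!r}")
    return node_id, output_id


if __name__ == "__main__":
    # With the rule in place, 'a/b/c' has exactly one reading.
    print(split_input_mapping("a/b/c"))
```

With `/` forbidden in node IDs, `a/b/c` can only mean node `a` with output `b/c`, so the split is deterministic.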
haixuanTao	25d0c12240	Removing upper limit on grasping object	11 months ago


3282 Commits (updating-readme)