All HF Hub posts

IlyasMoutawwakil 
posted an update 2 days ago
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch-compile & export-friendly 🔥

Why it had to be done 👇
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single dynamo-traced graph!

Transformers models are now easier to:
⚙️ Compile end-to-end with torch.compile backends
📦 Export reliably via torch.export and torch.onnx.export
🚀 Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's Executorch and more hardware-specific runtimes.

This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.

We are doubling down on Transformers' commitment to be a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.

There are definitely some edge cases we haven't addressed yet, so don't hesitate to try compiling / exporting your favorite transformers and to open issues / PRs.

PR in the comments! More updates coming soon!
danielhanchen 
posted an update 1 day ago
You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, which also helps search, clustering, and recommendations on your data.
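
The alignment objective typically used here is an in-batch-negatives contrastive loss. This pure-Python sketch (illustrative, not the Unsloth API) shows the idea: each query is scored against every positive in the batch, and the loss rewards ranking its own positive first:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def in_batch_negatives_loss(query_embs, positive_embs, scale=20.0):
    """Cross-entropy over the query/positive similarity matrix; the i-th
    positive is the target class for the i-th query. Fine-tuning pulls
    matched pairs together and pushes mismatched pairs apart."""
    loss = 0.0
    for i, q in enumerate(query_embs):
        logits = [scale * cosine(q, p) for p in positive_embs]
        m = max(logits)  # stabilized log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]  # -log softmax prob of the true pair
    return loss / len(query_embs)

# Matched pairs point the same way -> low loss; shuffled pairs -> higher loss.
queries   = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
good = in_batch_negatives_loss(queries, positives)
bad  = in_batch_negatives_loss(queries, list(reversed(positives)))
assert good < bad
```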

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!
codelion 
posted an update about 19 hours ago
Reverse Engineering a $500M Mystery: From HashHop to Memory-Augmented Language Models

I wrote a deep dive into how Magic AI's 100M token context window might work, starting from their HashHop benchmark and building up to MALM - a Memory-Augmented Language Model.

Key insight: treating each key as a single token enables perfect retrieval at unlimited context lengths.

The article covers:

- How HashHop works and why its perfect accuracy is suspicious
- Building a tokenized solver that achieves 100% accuracy
- Scaling to MALM for real code search tasks
- Why this approach could handle 100M+ tokens
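
The key insight can be made concrete with a toy version of the benchmark (my reading of HashHop, not the article's exact code): the prompt is a shuffled set of hash1 → hash2 pairs and the task is to chain k hops from a start hash. If every hash is a single token, retrieval reduces to exact-match lookup, which is why perfect accuracy at any context length is achievable:

```python
import random

def make_hashhop(num_pairs=1000, hops=3, seed=0):
    """Build one long chain h0 -> h1 -> ... and shuffle its pairs,
    so pair order in the 'context' carries no signal."""
    rng = random.Random(seed)
    hashes = [f"{rng.getrandbits(64):016x}" for _ in range(num_pairs + 1)]
    pairs = list(zip(hashes, hashes[1:]))
    rng.shuffle(pairs)
    return pairs, hashes[0], hashes[:hops + 1]

def solve(pairs, start, hops):
    # "Each key is a single token" == the problem collapses to a dict lookup.
    table = dict(pairs)
    chain = [start]
    for _ in range(hops):
        chain.append(table[chain[-1]])
    return chain

pairs, start, expected = make_hashhop()
assert solve(pairs, start, hops=3) == expected  # 100% accuracy, trivially
```

This is exactly why perfect HashHop accuracy is a suspicious signal: the benchmark is solvable by exact lookup once keys are atomic.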

Read the full article: https://huggingface.co/blog/codelion/reverse-engineering-magic-hashhop

Try the model: codelion/malm-165m

Code: https://github.com/codelion/hash-hop
Juanxi 
posted an update 1 day ago
Recent Updates on ScalingOpt | Your Stars are Appreciated

We are pleased to announce several key updates to the ScalingOpt project:

Pyramid Visualization Structure
Following a suggestion from Yufei, we have introduced a pyramid-based visualization framework to systematically outline the layered architecture of Foundation Models—from foundational principles to infrastructure-level details. This addition is designed to assist teams in organizing and presenting related materials more clearly.

Integration of Optimizer Summaries by Yifeng
We extend a warm welcome to Yifeng (author of MARS), who has joined the project. He has contributed a comprehensive summary of over 100 optimizers, now available in ScalingOpt. This resource can be accessed via the “Optimization Summary Sheet” on the homepage or under the Optimizers page, featuring a reader-friendly interface that supports easy viewing, downloading, and citation.

Growing Community of Members
We continue to update and expand the list of active members. Researchers interested in Optimization & Efficient AI are encouraged to join and participate in discussions. Feedback and suggestions are also highly welcomed and will be reviewed and incorporated on an ongoing basis.

Tutorials in Progress
The tutorial development is actively underway. Currently, we have prepared over 300 slides and are refining and expanding the content in collaboration with contributors.

This community is driven purely by passion and a commitment to open knowledge sharing. Your support through starring the repository is greatly appreciated!
Ujjwal-Tyagi 
posted an update 1 day ago
There is a new open-source music generation model called HeartMuLa. It offers strong, competitive performance compared to Suno and supports English, Chinese, Japanese, Korean, and Spanish. It is optimized to run easily on RTX GPUs and other consumer-grade hardware. HeartMuLa/HeartMuLa-oss-3B
https://github.com/HeartMuLa/heartlib
prithivMLmods 
posted an update about 21 hours ago
Introducing QIE-2511-Zoom-Master for highlight-guided area zoom-in, enabling lossless zooming within a drawn square area, and QIE-2511-Object-Remover-v2 for precise object or highlight-guided area cleanup. These experimental adapters are trained on top of QIE-2511. Find the adapters below.

🕹️QIE-2511-Zoom-Master : prithivMLmods/QIE-2511-Zoom-Master
🕹️QIE-2511-Object-Remover-v2: prithivMLmods/QIE-2511-Object-Remover-v2

🤗Demo: prithivMLmods/Qwen-Image-Edit-Object-Manipulator

📂Collection: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-exps

To learn more, visit the app page or the respective model pages.
consome2 
posted an update about 2 hours ago
We’ve released two conversational speech datasets from oto on Hugging Face 🤗
Both are based on real, casual, full-duplex conversations, but with slightly different focuses.

Dataset 1: Processed / curated subset
otoearth/otoSpeech-full-duplex-processed-141h
* Full-duplex, spontaneous multi-speaker conversations
* Participants filtered for high audio quality
* PII removal and audio enhancement applied
* Designed for training and benchmarking S2S or dialogue models

Dataset 2: Larger raw(er) release
otoearth/otoSpeech-full-duplex-280h
* Same collection pipeline, with broader coverage
* More diversity in speakers, accents, and conversation styles
* Useful for analysis, filtering, or custom preprocessing experiments

We intentionally split the release to support different research workflows:
clean and ready-to-use vs. more exploratory and research-oriented use.
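
For the exploratory workflow, one property that makes full-duplex data distinctive is overlapping speech. A hypothetical filtering sketch (not the oto pipeline) computes how much of one speaker's talk time overlaps the other's:

```python
def overlap_ratio(segments_a, segments_b):
    """segments_*: lists of (start, end) speech intervals in seconds.
    Returns overlapped time / total speech time of speaker A."""
    overlap = 0.0
    for a0, a1 in segments_a:
        for b0, b1 in segments_b:
            # Length of the intersection of the two intervals (0 if disjoint).
            overlap += max(0.0, min(a1, b1) - max(a0, b0))
    total_a = sum(a1 - a0 for a0, a1 in segments_a)
    return overlap / total_a if total_a else 0.0

# Speaker B backchannels while A talks from t=0..4 and t=5..6:
a = [(0.0, 4.0), (5.0, 6.0)]
b = [(3.5, 4.5), (5.5, 5.8)]
ratio = overlap_ratio(a, b)  # (0.5s + 0.3s of overlap) / 5s of A's speech ≈ 0.16
```

A filter like this could, for example, select the high-overlap conversations that full-duplex S2S models most need to learn from.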

The datasets are currently private, but we’re happy to approve access requests — feel free to request access if you’re interested.

If you’re working on speech-to-speech (S2S) models or are curious about full-duplex conversational data, we’d love to discuss and exchange ideas together.

Feedback and ideas are very welcome!
kanaria007 
posted an update 1 day ago
✅ New Article: *Jumps as Atomic Moves* (v0.1)

Title:
🧠 Jumps: Atomic Moves in Structured Intelligence (and How to Make Them Safe)
🔗 https://huggingface.co/blog/kanaria007/jumps-atomic-moves-in-si

---

Summary:
In SI-Core, a *Jump* is the smallest *effectful* unit of reasoning+action: a move that consumes observations, proposes/chooses an action, and (optionally) commits results + memory updates.

This article makes Jumps operational: *what a Jump must declare*, how it is gated (OBS/ETH/RML), how it produces auditable traces, and how to keep it safe under uncertainty—without collapsing into “just prompt chaining.”

> If you can’t name the Jump, you can’t audit it.
> If you can’t gate it, you can’t ship it.

---

Why It Matters:
• Stops hidden behavior: every effectful move becomes *declared + inspectable*
• Prevents “jumping in the dark” via *OBS gating + sandbox-only paths*
• Makes policy enforceable: ETH overlay can *allow/modify/block/escalate* per Jump type
• Improves rollback reality: map Jump effects to *RML level*, not vibes
• Enables evaluation that matters: jump traces → *SCover / CAS / RIR / SCI* and failure taxonomy

---

What’s Inside:
• A practical Jump contract: inputs/required obs, scope, candidate generation, chooser policy, outputs, memory writes
• Gate sequence: *OBS → eval_pre → (sandbox) → ETH → commit → RML trace → ledger*
• Jump taxonomy: read-only / advisory / effectful / irreversible, and how to treat each
• Safety patterns: conservative defaults, human-in-loop, break-glass, and “publish_result=false” sandboxes
• Testing: golden traces, property tests, chaos drills, and “why this jump?” explainability hooks
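
The gate sequence above can be sketched as an explicit pipeline. All names here are illustrative (my reading of the article, not the SI-Core API), with the modify/escalate ETH verdicts and RML tracing omitted for brevity:

```python
from dataclasses import dataclass
from enum import Enum

class JumpKind(Enum):
    READ_ONLY = "read-only"
    ADVISORY = "advisory"
    EFFECTFUL = "effectful"
    IRREVERSIBLE = "irreversible"

@dataclass
class Jump:
    name: str
    kind: JumpKind
    required_obs: set
    action: callable              # produces the proposed result
    publish_result: bool = True   # False => sandbox-only path

def run_jump(jump, observations, eth_policy, ledger):
    # OBS gate: refuse to "jump in the dark" on missing observations.
    missing = jump.required_obs - observations.keys()
    if missing:
        return {"status": "blocked", "reason": f"missing obs: {sorted(missing)}"}
    result = jump.action(observations)        # eval_pre / sandbox execution
    # ETH overlay: per-jump verdict before anything commits.
    verdict = eth_policy(jump, result)
    if verdict != "allow":
        return {"status": verdict, "reason": "eth gate"}
    if not jump.publish_result:               # sandbox-only: never commit
        return {"status": "sandboxed", "result": result}
    # Commit + ledger: the auditable trace of the effectful move.
    ledger.append({"jump": jump.name, "kind": jump.kind.value, "result": result})
    return {"status": "committed", "result": result}

# Conservative default: irreversible jumps escalate instead of committing.
policy = lambda j, r: "escalate" if j.kind is JumpKind.IRREVERSIBLE else "allow"
ledger = []
j = Jump("summarize", JumpKind.READ_ONLY, {"doc"}, lambda obs: len(obs["doc"]))
outcome = run_jump(j, {"doc": "hello"}, policy, ledger)
```

Because every jump either commits with a ledger entry or returns a named refusal, "if you can't name the Jump, you can't audit it" becomes a mechanical property rather than a slogan.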

---

📖 Structured Intelligence Engineering Series
This is the *how-to-implement / how-to-operate* layer for Jumps as atomic, auditable moves.