Plop

Activity Feed

AI & ML interests

None defined yet.

lysandre posted an update 3 months ago
We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez!

v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025.

Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
hlarcher posted an update 4 months ago
GH200 cooking time 🧑‍🍳🔥!

We just updated GPU-fryer 🍳 to run on Grace Hopper Superchip (GH200) - fully optimized for ARM-based systems!
With this release, we switched to cuBLASLt to support running FP8 benchmarks. You can monitor GPU throttling, TFLOPS outliers, and HBM memory health, and make sure you get the most out of your hardware setup.
Perfect for stress testing and tuning datacenter GPUs.

Check it out on GitHub 👉 https://github.com/huggingface/gpu-fryer
Wauplin posted an update 5 months ago
Say hello to hf: a faster, friendlier Hugging Face CLI ✨

We are glad to announce a long-awaited quality-of-life improvement: the Hugging Face CLI has been officially renamed from huggingface-cli to hf!

So... why this change?

Typing huggingface-cli constantly gets old fast. More importantly, the CLI's command structure became messy as new features were added over time (upload, download, cache management, repo management, etc.). Renaming the CLI is a chance to reorganize commands into a clearer, more consistent format.

We decided not to reinvent the wheel and instead follow a well-known CLI pattern: hf <resource> <action>. Isn't hf auth login easier to type and remember?
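A few examples of the new layout (the full command list is in the blog post below; the repo ids here are placeholders):

hf auth login          # previously: huggingface-cli login
hf auth whoami
hf download <repo_id>
hf upload <repo_id> <local_path>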

The full rationale, implementation details, and migration notes are in the blog post: https://huggingface.co/blog/hf-cli

cfahlgren1 posted an update 6 months ago
I ran the Anthropic Misalignment Framework for a few top models and added it to a dataset: cfahlgren1/anthropic-agentic-misalignment-results

You can read the reasoning traces of the models trying to blackmail the user and perform other actions. It's very interesting!!

cfahlgren1 posted an update 6 months ago
celinah posted an update 7 months ago
✨ Today we're releasing Tiny Agents in Python: an MCP-powered Agent in ~70 lines of code 🐍

Inspired by Tiny Agents in JS from @julien-c, we ported the idea to Python and integrated it directly into huggingface_hub, with a built-in MCP Client and a Tiny Agents CLI.

TL;DR: With MCP (Model Context Protocol), you can expose tools like web search or image generation and connect them directly to LLMs. It's simple, and surprisingly powerful.

pip install "huggingface_hub[mcp]>=0.32.0"

We wrote a blog post where we show how to run Tiny Agents, and dive deeper into how they work and how to build your own.
👉 https://huggingface.co/blog/python-tiny-agents
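The core of such an agent boils down to a short tool-calling loop. Below is a minimal, self-contained sketch of that loop with a stubbed LLM and tool; the names are illustrative and are not the actual huggingface_hub API (the real one is documented in the blog post above):

def fake_llm(messages, tools):
    # Stand-in for a chat model: requests a tool call once, then answers in plain text.
    if not any(m["role"] == "tool" for m in messages):
        return {"content": None, "tool_calls": [{"name": "web_search", "arguments": {"q": "weather"}}]}
    return {"content": "It is sunny.", "tool_calls": []}

def fake_tool(name, arguments):
    # Stand-in for a tool exposed by an MCP server (e.g. web search).
    return f"search results for {arguments['q']}"

def run_agent(prompt, llm=fake_llm, call_tool=fake_tool, tools=({"name": "web_search"},)):
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = llm(messages, tools)        # the model decides: answer, or call a tool
        if not reply["tool_calls"]:         # plain answer -> we're done
            return reply["content"]
        for call in reply["tool_calls"]:    # run each requested tool and feed the result back
            messages.append({"role": "tool", "name": call["name"],
                             "content": call_tool(call["name"], call["arguments"])})

print(run_agent("What's the weather like today?"))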

cfahlgren1 posted an update 7 months ago
Yesterday, we dropped a new conversational viewer for datasets on the Hub! 💬

Actually being able to view and inspect your data is extremely important. This is a big step in making data more accessible and actionable for everyone.

Here are some datasets you can try it out on:
• mlabonne/FineTome-100k
• Salesforce/APIGen-MT-5k
• open-thoughts/OpenThoughts2-1M
• allenai/tulu-3-sft-mixture

Any other good ones?
Wauplin posted an update 8 months ago
‼️ huggingface_hub's v0.30.0 is out with our biggest update of the past two years!

Full release notes: https://github.com/huggingface/huggingface_hub/releases/tag/v0.30.0.

🚀 Ready. Xet. Go!

Xet is a groundbreaking new protocol for storing large objects in Git repositories, designed to replace Git LFS. Unlike LFS, which deduplicates at the file level, Xet operates at the chunk level, making it a game-changer for AI builders collaborating on massive models and datasets. Our Python integration is powered by xet-core (https://github.com/huggingface/xet-core), a Rust-based package that handles all the low-level details.

You can start using Xet today by installing the optional dependency:

pip install -U huggingface_hub[hf_xet]


With that, you can seamlessly download files from Xet-enabled repositories! And don't worry: everything remains fully backward-compatible if you're not ready to upgrade yet.
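For instance, a regular download call is all it takes; when the repository is Xet-enabled and the optional dependency is installed, the chunk-based transfer happens under the hood (the repo and filename below are just examples):

from huggingface_hub import hf_hub_download

# Downloads through the Xet backend when the repo is Xet-enabled and hf_xet is installed,
# and falls back to the regular HTTP path otherwise. Repo id and filename are illustrative.
path = hf_hub_download(repo_id="openai-community/gpt2", filename="model.safetensors")
print(path)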

Blog post: https://huggingface.co/blog/xet-on-the-hub
Docs: https://huggingface.co/docs/hub/en/storage-backends#xet


⚡ Inference Providers

- We're thrilled to introduce Cerebras and Cohere as official inference providers! This expansion strengthens the Hub as the go-to entry point for running inference on open-weight models.

- Novita is now our 3rd provider to support the text-to-video task, after Fal.ai and Replicate.

- Centralized billing: manage your budget and set team-wide spending limits for Inference Providers! Available to all Enterprise Hub organizations.

from huggingface_hub import InferenceClient
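# bill_to routes usage to your organization's centralized billing (Enterprise Hub) instead of your personal account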
client = InferenceClient(provider="fal-ai", bill_to="my-cool-company")
image = client.text_to_image(
    "A majestic lion in a fantasy forest",
    model="black-forest-labs/FLUX.1-schnell",
)
image.save("lion.png")


- No more timeouts when generating videos, thanks to async calls. Available right now for Fal.ai; we expect more providers to adopt the same structure very soon (see the short text-to-video sketch below)!
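As a rough sketch of the text-to-video task mentioned above (the model id is a placeholder; pick any text-to-video model supported by your provider):

from huggingface_hub import InferenceClient

client = InferenceClient(provider="fal-ai")
# text_to_video returns the generated video as raw bytes.
video = client.text_to_video(
    "A majestic lion walking through a fantasy forest",
    model="<a-text-to-video-model-id>",
)
with open("lion.mp4", "wb") as f:
    f.write(video)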
ngxson posted an update 9 months ago
A comprehensive matrix of which format you should use.

Read more on my blog post: https://huggingface.co/blog/ngxson/common-ai-model-formats

| Hardware        | GGUF      | PyTorch                | Safetensors              | ONNX  |
|-----------------|-----------|------------------------|--------------------------|-------|
| CPU             | ✅ (best) | 🟡                      | 🟡                       | ✅    |
| GPU             | ✅        | ✅                      | ✅                       | ✅    |
| Mobile          | ✅        | 🟡 (via executorch)     | ❌                       | ✅    |
| Apple silicon   | ✅        | 🟡                      | ✅ (via MLX framework)   | ✅    |
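As a small aside, safetensors checkpoints are easy to inspect directly from Python (the file name here is illustrative):

from safetensors.torch import load_file

# Loads every tensor from a .safetensors file into a dict of name -> torch.Tensor.
state_dict = load_file("model.safetensors")
print({name: tuple(t.shape) for name, t in state_dict.items()})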
lysandre posted an update 10 months ago
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
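For instance, something along these lines should work with pip's standard Git support:

pip install git+https://github.com/huggingface/transformers.git@v4.49.0-SmolVLM-2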

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
cfahlgren1 posted an update 10 months ago
If you haven't seen it yet, we just released Inference Providers 🔀

> 4 new serverless inference providers on the Hub 🤯
> Use your HF API key or personal key with all providers 🔑
> Chat with DeepSeek R1, V3, and more on HF Hub 🐋
> We support SambaNova, TogetherAI, Replicate, and Fal.ai 💪

Best of all, we don't charge any markup on top of the provider 🫰 Have you tried it out yet? HF Pro accounts get $2 of free usage for provider inference.
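For example, with huggingface_hub you can route a chat request through one of these providers (the provider and model names below are illustrative):

from huggingface_hub import InferenceClient

# Uses your HF token and routes the request through the selected Inference Provider.
client = InferenceClient(provider="together")
response = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    model="deepseek-ai/DeepSeek-R1",
)
print(response.choices[0].message.content)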
ngxson posted an update 11 months ago
hlarcher posted an update 11 months ago
We are introducing multi-backend support in Hugging Face Text Generation Inference!
With the new TGI architecture we are now able to plug in new modeling backends to get the best performance for the selected model and available hardware. This first step will very soon be followed by the integration of new backends (TRT-LLM, llama.cpp, vLLM, Neuron and TPU).

We are polishing the TensorRT-LLM backend, which achieves impressive performance on NVIDIA GPUs. Stay tuned 🤗!

Check out the details: https://huggingface.co/blog/tgi-multi-backend
ngxson posted an update 11 months ago
Check out my collection of pre-made GGUF LoRA adapters!

This allows you to use both the normal and abliterated versions of popular models like Llama, Qwen, etc., without having to double the amount of VRAM used.

ngxson/gguf_lora_collection
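A minimal sketch of how such an adapter can be used, assuming llama-cpp-python (file paths are illustrative; llama.cpp's CLI takes an equivalent --lora flag):

from llama_cpp import Llama

# Load the base GGUF once and apply the LoRA adapter on top of it,
# instead of keeping a second full copy of the weights in VRAM.
llm = Llama(model_path="base-model.gguf", lora_path="abliterated-lora.gguf")
print(llm("Hello", max_tokens=16)["choices"][0]["text"])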