Hugging Face

About us

The AI community building the future.

Website
https://huggingface.co
Industry
Software Development
Company size
51-200 employees
Type
Privately Held
Founded
2016
Specialties
machine learning, natural language processing, and deep learning


Updates

  • Hugging Face reposted this

    View profile for Avijit Ghosh, PhD

    Applied Policy Researcher at Hugging Face 🤗

    🚨 New Article: Empowering Public Organizations: Preparing Your Data for the AI Era, with Yacine Jernite 🏛
    Public organizations are authoritative sources for critical information: monitoring environmental conditions, tracking educational outcomes, documenting workforce trends, preserving cultural heritage, and managing public infrastructure. However, much of this data exists in formats that AI systems can't easily use: stored in PDFs, scattered across Excel files with inconsistent structures, and often organized in specialized formats designed for human consumption rather than machine learning.
    📊 The public data commons that powers today's models, and especially the smaller developers who cannot afford millions of dollars in licensing fees, would benefit enormously if this data were made available in AI-consumable formats. Data quality is incredibly important for model performance and efficiency (we have written about this, too!), and this data is already public and free!
    🤝 There are clear benefits for organizations, too: they enable technology that better serves their communities, amplify the value of public data through collaboration, and maintain principled control over how their data is used. NASA - National Aeronautics and Space Administration, the National Library of Norway, the French Ministère de la Culture, The National Archives of Finland / Kansallisarkisto, and other public organizations are already on Hugging Face, releasing rich datasets and models that add to the public commons and enrich us all.
    📝 So we wrote a guide to help public organizations do the same! Using the Massachusetts Data Hub as a case study, we convert four datasets:
    - MCAS education data 📚 (Excel files with different formats)
    - Labor market reports 💼 (PowerPoint presentations)
    - Occupational safety stats 🦺 (PDF reports)
    - 2023 aerial imagery 🛰️ (JP2 image files)
    For each dataset, we show why it was not AI-ready, the steps we took to clean, standardize, and convert it into a Hugging Face dataset, and the release of both the code and the data on the Hub! This was a really fun exercise, and we have some important takeaways for other public organizations looking to release data on Hugging Face to add to public knowledge and power better AI:
    1️⃣ Identify Your Most Valuable Datasets: For which datasets is your organization the most authoritative source, and which would bring the most value to your mission if released in AI-ready formats?
    2️⃣ Determine Format Needs: Consider who will use the data and for what purpose, and how best to release it so that it is optimal for downstream use.
    3️⃣ Document Clearly: Rich documentation matters to downstream users and actually drives adoption: a study has shown that almost 90% of dataset downloads on the Hub go to fully documented datasets.
    We can't wait to see your datasets on 🤗! [Article linked in comments]
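    To make the conversion step concrete, here is a minimal, hedged sketch of turning a spreadsheet into a Hugging Face dataset; the file name, column handling, and repo id are hypothetical placeholders, not the code released with the article.

```python
import pandas as pd
from datasets import Dataset

# Hypothetical input file and columns, standing in for one of the Excel releases.
df = pd.read_excel("mcas_results_2023.xlsx", sheet_name=0)

# Normalize headers so downstream users get consistent, machine-friendly column names.
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

# Convert to a Hugging Face dataset and push it to the Hub as Parquet
# (requires `huggingface-cli login` and a repo you control).
ds = Dataset.from_pandas(df, preserve_index=False)
ds.push_to_hub("your-org/mcas-education-data")
```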

  • Hugging Face reposted this

    View profile for Sayak Paul

    ML @ Hugging Face 🤗

    We've got a new Diffusers release for you, and it ships a truckload of things 🚙 It brings a bunch of new image & video generation models, a wide suite of memory optimization techniques with caching, and torch.compile() support when hotswapping LoRAs. There's far more than I can cover here, so please check out the release notes 🔥 Release notes: https://lnkd.in/gDX3vT57
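    As a rough illustration (not code from the release notes), loading a Diffusers pipeline, enabling one of its memory optimizations, adding a LoRA, and compiling the denoiser might look like the sketch below; the base model and LoRA repo ids are placeholders.

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder base model and LoRA repo ids, not taken from the release notes.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)

# One of several memory optimizations: offload idle submodules to the CPU.
pipe.enable_model_cpu_offload()

# Add a LoRA adapter on top of the base weights.
pipe.load_lora_weights("some-user/some-sdxl-lora")

# Optionally compile the UNet for faster repeated inference.
pipe.unet = torch.compile(pipe.unet)

image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("lighthouse.png")
```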

  • Hugging Face reposted this

    View profile for Daniel Vila Suero

    Building data tools @ Hugging Face 🤗

    In times of hype and misinformation, run your own experiments. How? Use Hugging Face Inference Providers.
    With every new open model release, social media timelines are full of contradictory information, exaggerated claims, etc. That's why running quick (and cheap) experiments is becoming critical.
    - Have you heard that the latest Llama 4 models are bad?
    - Have you heard that Llama 4 models behave differently across providers?
    - Is QwQ-32B better than DeepSeek R1?
    Run these models over data you care about. With the Hub you can:
    - Get access to the latest models (from day 0).
    - Test them even if you don't have GPUs.
    - Mix and match the fastest, most reliable inference providers.
    - Discuss and learn about these models with the largest AI community.
    The prompt and results in the attached image are part of "vibench", a tiny benchmark I'm building with Inference Providers. It contains interesting and challenging prompts from Reddit, Microsoft's Sparks of AGI paper, and other places. You can find the open dataset in the first comment, and feel free to suggest challenging prompts to add to vibench.
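    For example, a quick experiment with Inference Providers from the huggingface_hub client could look like the sketch below; the provider, model id, and prompt are illustrative choices, not taken from vibench.

```python
from huggingface_hub import InferenceClient

# Pick an inference provider explicitly; provider, model, and prompt are illustrative.
client = InferenceClient(provider="together")

response = client.chat.completions.create(
    model="Qwen/QwQ-32B",
    messages=[{"role": "user", "content": "A farmer has 17 sheep. All but 9 run away. How many are left?"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```

    Swapping the provider argument (or the model id) lets you rerun the same prompt elsewhere and compare the answers side by side.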

  • Hugging Face reposted this

    Announcing 📢 Reasoning Datasets Competition 📢 in collaboration with Hugging Face & Together AI
    Since the launch of DeepSeek-R1 this January, we've seen an explosion of reasoning-focused datasets: OpenThoughts-114k, OpenCodeReasoning, codeforces-cot, and more. OpenThoughts-114k alone has helped train 230+ models.
    Most datasets focus on math, coding, or science, domains where answers are clear-cut and verifiable. But reasoning is starting to push into messier, more human areas: finance, medicine, and even multi-domain reasoning. The next big leap in LLMs could come from better datasets that mirror real-world ambiguity, complexity, and nuance.
    To help push the frontier, we're launching a Reasoning Dataset Competition. More details here: https://lnkd.in/g33aSYtC

  • Hugging Face reposted this

    View organization page for Gradio

    61,636 followers

    🚀 Breakthrough Alert: Gradio 5.24 is out! We've completely rebuilt our ImageEditor component based on developer feedback 🤯 Enjoy Photoshop-like features in your AI apps:
    > Professional-grade zooming and panning
    > Full transparency control
    > Advanced layer configuration, and many more...
    This update enables everyone to build sophisticated inpainting and sketching interfaces with just a few lines of Python 🔥 Upgrade today: pip install --upgrade gradio
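    As a hedged sketch (not an official Gradio example), wiring the rebuilt ImageEditor into a small Blocks app might look like this; the processing function is a placeholder.

```python
import gradio as gr

# Placeholder processing function: just return the flattened edit.
def show_composite(editor_value):
    # With type="numpy", the ImageEditor value is a dict holding the
    # background, the drawn layers, and the flattened composite.
    return editor_value["composite"]

with gr.Blocks() as demo:
    editor = gr.ImageEditor(label="Sketch or mask", type="numpy")
    output = gr.Image(label="Result")
    editor.change(show_composite, inputs=editor, outputs=output)

demo.launch()
```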

  • Hugging Face reposted this

    View profile for Andrés Marafioti

    AI Researcher @ Hugging Face | 9+ YOE in GenAI, MLOps, & Research | Pushing the Boundaries of Open-Source AI

    Today, we share the tech report for 𝗦𝗺𝗼𝗹𝗩𝗟𝗠: 𝗥𝗲𝗱𝗲𝗳𝗶𝗻𝗶𝗻𝗴 𝘀𝗺𝗮𝗹𝗹 𝗮𝗻𝗱 𝗲𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝘁 𝗺𝘂𝗹𝘁𝗶𝗺𝗼𝗱𝗮𝗹 𝗺𝗼𝗱𝗲𝗹𝘀. 🔥 It explains how to create a 𝘁𝗶𝗻𝘆 𝟮𝟱𝟲𝗠 𝗩𝗟𝗠 that uses less than 1GB of RAM and outperforms our 80B models from 18 months ago!
    Here are the coolest insights from our experiments:
    ✨ Longer context = big wins: Increasing the context length from 2K to 16K gave our tiny VLMs a 60% performance boost!
    ✨ Smaller is smarter with SigLIP: Surprise! Smaller LLMs didn't benefit from the usual large SigLIP (400M). Instead, we use the 80M base SigLIP, which performs equally well at just 20% of the original size!
    ✨ Pixel shuffling magic: Aggressive pixel shuffling helped our compact VLMs "see" better, achieving the same performance with sequences 16x shorter!
    ✨ Learned positional tokens FTW: For compact models, learned positional tokens significantly outperform raw text tokens, enhancing efficiency and accuracy.
    ✨ System prompts and special tokens are key: Introducing system prompts and dedicated media intro/outro tokens significantly boosted our compact VLM's performance, especially for video tasks.
    ✨ Less CoT, more efficiency: It turns out that too much Chain-of-Thought (CoT) data actually hurts performance in small models.
    ✨ Longer videos, better results: Increasing video length during training enhanced performance on both video and image tasks.
    🌟 State-of-the-art performance: SmolVLM comes in three powerful yet compact sizes (256M, 500M, and 2.2B parameters), each setting new SOTA benchmarks for its hardware constraints in image and video understanding.
    📱 Real-world efficiency: We've built an app using SmolVLM on an iPhone 15 and got real-time inference directly from its camera!
    🌐 Browser-based inference? Yep! We get lightning-fast inference speeds of 40-80 tokens per second directly in a web browser. No tricks, just compact, efficient models!
    If you're into efficient multimodal models, you'll love this one.
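    As a rough sketch of running the smallest SmolVLM with transformers (the repo id, image, and generation settings are assumptions, not taken from the tech report):

```python
from transformers import AutoProcessor, AutoModelForVision2Seq
from PIL import Image

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed repo id for the 256M checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image briefly."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")

generated = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated, skip_special_tokens=True)[0])
```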

  • Hugging Face reposted this

    View profile for Aymeric Roucher

    Building agents @ Hugging Face 🤗 | Polytechnique - Cambridge

    One overlooked aspect of the Llama-4 release on the Hugging Face Hub: it was another successful test of our new Xet backend, which allows near-instantaneous model updates.
    In short, Xet is a new storage backend that replaces the Git-based one. Its first advantage is better compression, meaning the initial model download is simply faster: roughly a 2x improvement for Llama-4. That's already cool ⚡️
    But the core feature is just wild: Xet uses content-defined chunking (CDC) to deduplicate at the level of bytes (~64KB chunks of data) instead of whole files. If you change one line in a huge Parquet file, Xet sees the diff at the chunk level rather than the file level, so the change becomes an almost-instant upload of just the modified chunks instead of hours spent re-uploading or re-downloading the whole file. ⚡️⚡️
    Congrats to the XetHub team, awesome work 👏
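    To make the content-defined chunking idea concrete, here is a toy sketch, not the actual Xet implementation: it cuts chunk boundaries based on the content of a small sliding window, so an edit in the middle of a large file leaves most chunks, and therefore most uploads, untouched.

```python
import hashlib
import random

def chunk(data: bytes, window: int = 16, mask: int = 0x3FF) -> list[bytes]:
    """Toy CDC: cut a boundary whenever a 16-byte window hashes to 0 modulo ~1K."""
    chunks, start = [], 0
    for i in range(window, len(data)):
        h = hashlib.blake2b(data[i - window:i], digest_size=4).digest()
        if int.from_bytes(h, "big") & mask == 0:
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])
    return chunks

def reused_fraction(old: bytes, new: bytes) -> str:
    old_ids = {hashlib.sha256(c).hexdigest() for c in chunk(old)}
    new_chunks = chunk(new)
    reused = sum(hashlib.sha256(c).hexdigest() in old_ids for c in new_chunks)
    return f"{reused}/{len(new_chunks)} chunks already stored"

rng = random.Random(0)
head, tail = rng.randbytes(50_000), rng.randbytes(50_000)
old = head + b"the original line\n" + tail
new = head + b"a slightly longer edited line\n" + tail  # edit shifts every later byte offset

# Because boundaries depend on content, not position, almost every chunk re-aligns,
# so only the few chunks around the edit would need to be uploaded.
print(reused_fraction(old, new))
```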



Funding

Total rounds
8
Last round
Series unknown
Source: Crunchbase