Anyscale

Software Development

San Francisco, California · 48,710 followers

Scalable compute for AI and Python

About us

Anyscale enables developers of all skill levels to easily build applications that run at any scale, from a laptop to a data center.

Industry: Software Development
Company size: 51-200 employees
Headquarters: San Francisco, California
Type: Privately Held
Founded: 2019


Updates

  • THIS WEEK: Anyscale will be at Google Next in Vegas, April 9-11🔥 Come see us at booth #1702 to learn how Ray Turbo + GKE power the distributed OS for AI 🙌 High performance 🚀 | Low cost 🤑 | Massive scale 💪 If you're pushing the limits of AI infra, stop by and talk to us about how Ray Turbo can accelerate your AI infrastructure → https://lnkd.in/gMF2PAn5

  • Anyscale reposted this

    How do Ray and Kubernetes relate? This is one of the most common questions we get, and there's a crisp explanation. Kubernetes is built for platform engineers, people who live and breathe YAML and containers, who operate fleets, manage permissions, monitor costs, and standardize logging and observability tooling. Ray is built for machine learning people, people who write Python code, who live and breathe PyTorch and NumPy and Hugging Face, who think about batch sizes, learning rates, and embedding computations. Each on its own misses part of the picture. Together, they form a software stack for AI that addresses both sets of needs.

  • Scaling AI is hard – but you’re not alone. 🙌 Join us at the Ray + AI Infra Summit (May 20, NYC🗽), where engineers from Character.AI, Spotify, Figma, Bridgewater Associates, and more will share how they’re building fast, reliable, and cost-efficient systems at scale. 🚀 Learn how teams are:

    • Managing distributed workloads with Ray
    • Reducing training time and infra spend
    • Designing systems that scale with product growth

    Whether you're deep in production or just getting started, this event delivers practical insights you won’t find anywhere else. Reserve your seat → https://lnkd.in/gETcMReC

  • Anyscale reposted this

    🚀 In Ray 2.44, we're giving Ray a major upgrade for scaling LLM inference. We're seeing a ton of companies and users organically using Ray with vLLM to scale LLM serving and batch LLM inference. In theory, the two technologies are very complementary -- vLLM provides best-in-class performance for LLM inference, and Ray is the de facto way for AI infrastructure teams to scale inference workloads. But previously, in order to do this you'd need to write a lot of boilerplate to make your LLM inference performant at scale. In the most recent Ray release, we've launched Ray Data LLM and Ray Serve LLM. These APIs allow for simple, scalable, and performant ways of deploying open source LLMs as part of existing data pipelines and Ray Serve applications. (Minimal sketches of both APIs appear after these updates.) In the near future, we'll be working on building out more examples and reference architectures for deploying key models like DeepSeek on Ray + vLLM (and SGLang support as well!). Check out our blog for more details: https://lnkd.in/gNhg2BRU

  • Anyscale will be at Google Next in Vegas, April 9-11! AI at scale needs the right foundation 🚀 Visit us at booth #1702 to see how Ray Turbo, the AI compute engine from Anyscale, and GKE power the Distributed OS for AI – delivering top performance, cost efficiency, and scalability. Let’s talk about how Ray Turbo can accelerate your AI workloads. See you there! 👋 https://lnkd.in/gbbwgFXy

  • NEW IN RAY: Native LLM APIs for Ray Data and Ray Serve Libraries 🙌

    Kourosh Hakhamaneshi

    AI lead @Anyscale, PhD UC Berkeley

    Announcing native LLM APIs in the Ray Data and Ray Serve libraries. These are experimental APIs we are announcing today that abstract two things:

    1. Serve LLM: simplifies the deployment of LLM engines (e.g. vLLM) through Ray Serve APIs. Enables things like auto-scaling, monitoring, LoRA management, resource allocation, etc.

    2. Data LLM: helps you scale out offline inference horizontally for throughput-sensitive applications (e.g. data curation, evaluation). Ray Data's lazy execution engine helps you pipeline complex heterogeneous stages that involve LLMs. Say you want to create a pipeline that reformats input with Llama-8B and then queries Llama-70B in another stage. How do you maximize throughput for this pipeline? Or a vision-language model that needs to pull images from S3 (a network-bound operation), tokenize them (a CPU-bound op), and then run inference with Pixtral (a GPU-bound op). This is the type of problem the Data LLM API will simplify. (See the sketches after these updates for what this looks like in code.) https://lnkd.in/gSnAs4kf

  • The Ray + AI Infra Summit (NYC🗽, May 20th) offers something truly special: direct access to the brilliant minds who created the technology. This isn't just another tech talk – it's a master class from the pioneers who are defining the future of AI infrastructure. Whether you're already using Ray or exploring solutions for scaling your AI workloads, this summit provides insights you won't find anywhere else. Limited seats available – save your spot today → https://lnkd.in/gETcMReC

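To make the Ray Serve LLM announcement above concrete, here is a minimal sketch of serving an open source model behind an OpenAI-compatible endpoint. It assumes the experimental ray.serve.llm module from Ray 2.44 with vLLM installed; the model ID, model source, and replica counts are illustrative placeholders, and parameter names may shift while the API is experimental.

    from ray import serve
    from ray.serve.llm import LLMConfig, build_openai_app

    # Describe one LLM deployment. model_id is the name clients request;
    # model_source is the Hugging Face model to load (both placeholders).
    llm_config = LLMConfig(
        model_loading_config=dict(
            model_id="llama-3.1-8b",
            model_source="meta-llama/Llama-3.1-8B-Instruct",
        ),
        deployment_config=dict(
            # Ray Serve autoscales vLLM replicas between these bounds.
            autoscaling_config=dict(min_replicas=1, max_replicas=2),
        ),
    )

    # Expose the deployment behind an OpenAI-compatible HTTP app, so
    # existing OpenAI client code can point at it unchanged.
    app = build_openai_app({"llm_configs": [llm_config]})
    serve.run(app)

If this matches the released API, an OpenAI client pointed at the local Serve endpoint with model="llama-3.1-8b" should be able to issue chat completions against it, with Serve handling autoscaling and replica placement.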
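For the batch side described in the same posts, here is a sketch of an offline inference pipeline with Ray Data LLM, again assuming the experimental ray.data.llm API from Ray 2.44; the model, prompt, batch size, and field names are hypothetical examples, not recommendations.

    import ray
    from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

    # Engine settings are illustrative: concurrency is how many vLLM
    # replicas Ray Data runs in parallel; batch_size is rows per batch.
    config = vLLMEngineProcessorConfig(
        model="meta-llama/Llama-3.1-8B-Instruct",
        concurrency=1,
        batch_size=64,
    )

    processor = build_llm_processor(
        config,
        # Turn each input row into a chat-style request for the engine...
        preprocess=lambda row: dict(
            messages=[{"role": "user", "content": f"Summarize: {row['text']}"}],
            sampling_params=dict(temperature=0.3, max_tokens=128),
        ),
        # ...and keep only the generated text from each response row.
        postprocess=lambda row: dict(summary=row["generated_text"]),
    )

    # Lazy execution lets Ray Data overlap preprocessing, inference, and
    # postprocessing so the GPU stages stay busy.
    ds = ray.data.from_items([{"text": "Ray pipelines heterogeneous stages."}])
    print(processor(ds).take_all())

The multi-stage cases from the post (Llama-8B feeding Llama-70B, or S3 reads and tokenization feeding Pixtral) would chain further map stages or a second processor over the same dataset, letting Ray Data overlap the network-, CPU-, and GPU-bound steps.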
