Scaling Large ML Models to Small Devices with Atila Orhon

The size of ML models is growing into the many billions of parameters. This poses a challenge for running inference on non-dedicated hardware like phones and laptops.

Argmax is a startup focused on developing methods to run large models on commodity hardware. A key observation behind their strategy is that the largest models are getting larger, but the smallest models that are commercially relevant are getting smaller. The company was started in 2023 and has raised money from General Catalyst and other industry leaders.

Atila Orhon is the founder of Argmax and he previously worked at Apple and NVIDIA. He joins the show to talk about working in computer vision, building ML tooling at Apple, optimizing ML models, and more.

Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of the podcast Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.


Sponsorship inquiries:


Notion isn’t just another workspace, it’s a complete productivity ecosystem. With its sleek design and powerful AI integration, Notion anticipates my needs before I even realize them. Notion is a place where any team can write, plan, organize, and rediscover the joy of play. It’s a workspace designed not just for making progress, but for getting inspired.

Try Notion for free when you go to

This episode of Software Engineering Daily is brought to you by Vantage.

Do you know what your cloud bill will be for this month?

For many companies, cloud costs are the number two line item in their budget and the number one fastest-growing category of spend.

Vantage helps you get a handle on your cloud bills, with self-serve reports and dashboards built for engineers, finance, and operations teams.

With Vantage, you can put costs in the hands of the service owners and managers who generate them—giving them budgets, alerts, anomaly detection, and granular visibility into every dollar.

With native billing integrations with dozens of cloud services, including AWS, Azure, GCP, Datadog, Snowflake, and Kubernetes, Vantage is the one FinOps platform to monitor and reduce all your cloud bills.

To get started, head to, connect your accounts, and get a free savings estimate as part of a 14-day free trial.

This episode of Software Engineering Daily is brought to you by Starburst.

Struggling to deliver analytics at the speed your users want without your costs snowballing?

For data engineers who battle to build and scale high-quality data pipelines, Starburst’s data lakehouse platform helps you deliver exceptional user experiences at petabyte scale, without compromising on performance or cost.

Trusted by the teams at Comcast, DoorDash, and MIT, Starburst delivers the adaptability and flexibility a lakehouse ecosystem promises on an open architecture that supports Apache Iceberg, Delta Lake, and Hudi, so you always maintain ownership of your data.

Want to see Starburst in action? Get started today with a free trial at

Software Daily

Subscribe to Software Daily, a curated newsletter featuring the best and newest from the software engineering community.