sabesh

Research

(3)

Squeezing a 26B diffusion LLM onto a Mac

TL;DR I set out to make DiffusionGemma - a 26B-parameter, ~4B-active mixture-of-experts block-diffusion text model - fast enough to be a real interactive tool on a single Apple M5 Pro (48 GB, macOS 27 beta). The stock MLX inference path ran at ~18 tok/s on my original short repro, ~23–27 tok/s on a...

2026-06-15

Optimizing DiffusionGemma (26B-A4B-IT-4Bit) on an Apple M5 Pro: what worked and what didn't.

Fine-tuning a model in MLX

I kept staring at my own tweets wondering why some of them landed and some of them didn't. Same brain, same topics, wildly different engagement. After enough scrolling through my own analytics, a dumb question got stuck in my head: my high-engagement tweets clearly share something, some texture or r...

2026-06-02

Building a personal tweet-writing assistant on Apple Silicon

Speculative Decoding in MLX with DFlash

Introduction Speculative decoding is frequently cited as one of the more effective techniques for accelerating inference in large language models, with reported speedups typically falling in the 2x to 4x range over standard autoregressive decoding. The mechanism is straightforward: a smaller draft m...

2026-04-22

An empirical evaluation of speculative decoding on Apple Silicon, across a 300-run parameter sweep

Engineering

(3)

WWDC26: CoreAI, Foundation Models and the future of AI

WWDC26 has been crazy this year. Apple focused on three main categories for the main keynote: platform improvements (across iOS, macOS, and all other operating systems), trust and safety, and Apple Intelligence and Siri. Apple brushed past the OS upgrades (which are usually split into their own sections: iO...

2026-06-09

Thoughts about everything that happened this WWDC, a lot of Siri and what the future of AI on Apple Silicon looks like

Running Local LLMs on Xcode

Apple Silicon is fast enough to run real language models locally. No API keys, no servers, no network round trips. And if you're already an iOS or macOS developer, you don't need to leave Xcode or learn Python to do it. This article walks through setting up a Swift package that runs a quantized Qwen...

2026-05-05

How to get local LLMs running on Xcode using MLXSwift

Clear Segmented Picker

A segmented picker is one of those UI components that sounds trivially simple until you actually try to ship a polished version of it. At its core, it's just a row of options where exactly one is selected — but the way it looks and feels communicates a lot about the quality of your app. A sluggish a...

2026-02-25

Building a transparent, glassy segmented control for iOS — because the native one just isn't good enough.

Events

(2)

Local is why we do this. MLX is how we do it.

There's a quiet shift happening in how builders think about AI. For most of the last few years, using AI meant renting it - an API key, a per-token bill, and your data making a round trip to someone else's servers and back before you ever saw an answer. We think the more interesting version runs som...

2026-06-01

Five cities, one afternoon, and a bet on local AI - MLX India's second community meetup ran simultaneously across Chennai, Bangalore, Mumbai, Delhi, and Hyderabad.

MLX India Community Meetup #1

"How many of you used MLX for the first time?" - half the hands in the room went up. We are MLX India, a community of builders working with MLX, local LLMs, and AI on Apple Silicon. Kautuk and I (Sabesh) are co-organizers of the community. We have over 300 builders on our WhatsApp channel, where we...

2026-05-08

A report on MLX India's first community meetup

Psychology

(1)

Are we just bots?

I've been having a shitty few weeks. But I've gotten to hang out and interact with so many different people from diverse backgrounds these last ten days. It feels like I've finally burst free from the tech bubble that Bangalore is, and come face to face with discomfort, harsh truths of life, and oth...

2025-12-07

Do LLMs resemble us, or do we resemble them?

Machine Learning

(2)

Understanding gradient descent

I totally expected to get started with coding at this point. After all, I have a great (basic) understanding of what a perceptron is, and how several of these are arranged in layers within a neural network. And it's intuitive to think of the network as a computer program (because it IS one, or at le...

2025-11-19

WTF is a cost function anyway?

ML endeavors begin

As someone who has been a software developer and a builder for quite some time now, I've found it humbling trying to get back to research and studies. Lately, I've been very curious about how the AI tools I use every day really work behind the scenes. I learnt all about it back in my days at...

2025-11-09

Tackling machine learning as a subject, from a builder's perspective