Seroter's Daily Reading #802 (June 10, 2026)

Listen: https://blossom.nostr.xyz/5062166c84af59b8df3e80afd5eaa6540b809a3475dc908e0a869756497841ea.mpga

Seroter's Daily Reading, Episode 802. June 10, 2026.

Big episode today. Eleven pieces, and a good mix of strategic career thinking, AI tooling deep dives, and some genuinely surprising research. Let's get into it.

Starting with something that might make your manager nervous. Sean Goedecke has a post called "Doing Nothing at Work", and his argument is that the best engineers deliberately run at about 80% utilization. Not because they're slacking, but because high-impact work is time-dependent. You can't just decide one morning to unblock a big enterprise deal or swoop in on an incident. Those moments arrive on their own schedule. If you're already at 100% grinding JIRA tickets, you're too busy to notice, and too busy for your manager to tag you in. Goedecke's point is that "doing nothing" is actually the space where things can happen. It gives your brain room to rest, to observe, to connect dots. He even argues that engineers should generally avoid glue work — not because it isn't valuable, but because uncompensated organizational patching insulates the company from feeling the consequences of its own priorities. It's a counterintuitive take, and worth sitting with.

Shifting to AI infrastructure, Davin Porter wrote about routing AI requests across a fleet of local models. He wanted to go beyond calling a single local model and instead create a worker pool pattern, where multiple machines can pick up tasks from a queue, run them against local LLMs, and return results. He built this into Smartrouter, using llama.cpp, with runners that register themselves and handle queuing. The interesting part is that the routing layer can dynamically change which model a request gets sent to, based on rules — so you can route to a local Gemma model or fall back to something else depending on the workload. Porter also reflects on what he calls the pain points of using LLMs to help build this kind of infrastructure: definitions really matter, and the LLMs kept conflating his "machine agent" concept with the general idea of an agent. After clarifying the terminology, things improved. He also notes that LLMs sometimes report a 200 response even when it isn't actually a success — a good reminder to wire in real logs rather than trusting the happy path.
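To make the worker pool idea concrete, here is a minimal sketch of rule-based routing with self-registering runners. It is not Smartrouter's actual code or API, and it makes no llama.cpp calls. Every name in it (Task, Rule, Runner, Router, the "kind" label, and model names such as "gemma-local") is a hypothetical stand-in, and it assumes one in-memory queue per model.

```python
from __future__ import annotations

import queue
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    prompt: str
    kind: str = "chat"  # hypothetical label, used only by the routing rules


@dataclass
class Rule:
    matches: Callable[[Task], bool]  # predicate deciding whether the rule applies
    model: str                       # model to route to when it does


@dataclass
class Runner:
    name: str
    models: frozenset[str]  # models this machine can serve locally


class Router:
    """Hypothetical routing layer: rules pick a model, runners pull from queues."""

    def __init__(self, rules: list[Rule], fallback_model: str) -> None:
        self.rules = rules
        self.fallback_model = fallback_model
        self.queues: dict[str, queue.Queue] = {}

    def register(self, runner: Runner) -> None:
        # A runner announces itself; each model it serves gets a queue.
        for model in runner.models:
            self.queues.setdefault(model, queue.Queue())

    def pick_model(self, task: Task) -> str:
        # First matching rule wins, but only if some runner serves that model.
        for rule in self.rules:
            if rule.matches(task) and rule.model in self.queues:
                return rule.model
        return self.fallback_model

    def submit(self, task: Task) -> str:
        model = self.pick_model(task)
        self.queues.setdefault(model, queue.Queue()).put(task)
        return model

    def next_task(self, runner: Runner) -> Optional[tuple[str, Task]]:
        # A runner polls the queues for the models it can serve.
        for model in sorted(runner.models):
            pending = self.queues.get(model)
            if pending is None:
                continue
            try:
                return model, pending.get_nowait()
            except queue.Empty:
                continue
        return None
```

In this sketch, a rule that points at a model no runner has registered falls through to the fallback, which mirrors the "route to a local Gemma model or fall back to something else" behavior described above. A real setup would also need health checks, timeouts, and the real logging Porter recommends.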

Builder.io published a piece on what they're calling Agent Experience, or AX — the layer between the AI model and the codebase. The idea is that developer onboarding used to be seasonal: how fast can a new human understand a repo? With agents, onboarding happens at the start of every task. So the instructions, rules, skills, and context you give an agent shape everything it does. Their concern is that teams accumulate a graveyard of stale skills, outdated AGENTS.md rules, and hidden tool instructions that pile up over time. When an agent makes a bad choice, debugging which stale instruction caused the problem becomes a nightmare. Their prescription is to treat agent context like good code: keep global context thin, make rules transparent so a reviewer can trace which one shaped an action, and test skills so the agent knows when to invoke them. Minimal, transparent, tested.

There's also an interesting data point from Lovable, the vibe coding platform. They announced hitting 500 million dollars in annualized revenue, with a million new projects created every week. There was a sense late last year that vibe coding platforms had peaked and were losing steam. That does not appear to be the case. These platforms are clearly not a fad. Two related pieces this week explore generative UIs, using models like Gemini with structured outputs to scaffold Flutter applications, and building OWASP security compliance checks into the development workflow via the Gemini CLI.

On the more measured side, DX, the developer productivity research company, published research on the actual impact of AI on engineering velocity. The findings are sober. Most organizations in their sample are seeing pull request throughput increase by about 10 to 15 percent. The median is closer to 8 percent. That's a long way from the 10x productivity gains that industry headlines promise. The reason is straightforward: developers spend only about 14 percent of their time writing code. If AI only accelerates that slice, the overall impact is constrained. The rest of the time goes to planning, reviews, testing, documentation, and coordination, where AI help is much thinner. They also flag a concept called cognitive debt: AI can increase output while reducing understanding. Developers ship code more quickly while building a weaker mental model of the systems they maintain. That's a long-term cost that may not show up in velocity metrics for months. DX is also now measuring agent productivity separately from human productivity, which makes sense given the shift underway.
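The 14 percent figure can be turned into a rough ceiling with Amdahl's law. This is a back-of-the-envelope sketch, not DX's methodology. It assumes coding is exactly 14 percent of total time, that AI speeds up only that slice, and that time saved translates directly into output, which pull request throughput only approximates. The function name and the speedup factors are illustrative choices.

```python
def overall_speedup(coding_fraction: float, coding_speedup: float) -> float:
    """Amdahl's law: only the coding slice of total time gets faster."""
    if not 0.0 <= coding_fraction <= 1.0:
        raise ValueError("coding_fraction must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    return 1.0 / ((1.0 - coding_fraction) + coding_fraction / coding_speedup)


# Illustrative speedup factors for the coding slice (assumptions, not DX data).
for label, factor in [("2x", 2.0), ("10x", 10.0), ("unbounded", float("inf"))]:
    gain = (overall_speedup(0.14, factor) - 1.0) * 100
    print(f"coding {label} faster -> about {gain:.1f}% overall")
# prints:
# coding 2x faster -> about 7.5% overall
# coding 10x faster -> about 14.4% overall
# coding unbounded faster -> about 16.3% overall
```

Under these assumptions, even infinitely fast coding caps the overall gain at roughly 16 percent, which is in the same neighborhood as the 8 to 15 percent range reported above.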

On the performance front, Google announced general availability of Lightning Engine for Managed Service for Apache Spark. It delivers up to 4.9 times faster performance than standard open-source Spark, and twice the price-performance compared to the leading high-speed alternative. The interesting context here is the agentic era. When autonomous agents are triggering thousands of concurrent multi-hop queries, the performance of your data processing layer directly affects your unit economics. Lightning Engine works with zero changes to existing pipelines, available in both serverless and managed cluster modes. Not a trivial upgrade.

Google also published work on DiffusionGemma, a variant that generates text using diffusion rather than the traditional left-to-right token-by-token approach. Most language models work like a typewriter — predicting one token at a time. That approach is efficient in the cloud where you can batch thousands of requests together. But on a local machine for a single user, it leaves your GPU underutilized, just waiting for the next keystroke. DiffusionGemma instead drafts an entire 256-token block all at once — like swapping a typewriter for a printing press. This gives your hardware a much larger chunk of work simultaneously and produces about four times faster text generation for local inference.

Then there's the piece that should concern everyone. Anthropic published research showing that their Claude Mythos Preview model can turn public software patches into working exploits within hours. They tested it against recently patched vulnerabilities in Mozilla Firefox and the Microsoft Windows kernel. These were flaws that had already been fixed, selected because they postdated the model's knowledge cutoff. In the Windows kernel test, Mythos Preview generated proof-of-concept crashes for 18 of 21 vulnerabilities, and built 8 exploit chains escalating from low-privilege user to SYSTEM-level control. All within six hours. The average cost was around two thousand dollars per exploit in API credits. Notably, 14 of those 21 Windows vulnerabilities had been rated by Microsoft as "Exploitation Less Likely" or "Exploitation Unlikely", ratings calibrated around human researchers. AI systems at this level challenge those assumptions. The Firefox test showed a median patch gap of 19 days before Mozilla's stable release containing the fix. Mythos Preview generated working exploits for 14 of 18 Firefox patches, with the first complete exploit done in under an hour, while the fix was still 18 days from shipping. Windows Autopatch typically takes seven days to reach 90 percent of devices. Anthropic's system completed all eight full exploit chains well before that seven-day mark. This isn't about zero-days. It's about the N-day window, the gap between when a patch is released and when it actually lands on systems. The time an attacker needs to exploit that gap is shrinking from months to hours. The implication for security teams is clear: automated, fast patch deployment is no longer optional.

Wrapping up, a quick note on Gemini 3.5 Live Translate, which handles fluid, natural voice translation in real time. Seroter notes his kids are growing up in a world where anyone can understand anyone regardless of language. That's a profound shift, and it points to where these models are heading: conversational communication across languages as it happens.

Overall, a few threads connect today's set. Agent experience is becoming its own discipline, with practitioners recognizing that context management for AI agents requires the same rigor as code quality. The productivity numbers from DX are a reality check on how far AI adoption still has to go beyond just writing code. And the Anthropic research is a sharp reminder that the dual-use problem with AI in security isn't theoretical — it's here, it's fast, and it changes the economics of the patch gap dramatically.

That's episode 802. See you tomorrow.

[blog] Doing nothing at work — https://www.seangoedecke.com/doing-nothing-at-work/
[blog] Routing AI responses between local runners — https://davenporter.substack.com/p/wibtdml-routing-ai-responses-between
[blog] Developer experience is dead. Long live agent experience. — https://www.builder.io/blog/agent-experience
[blog] Lovable says it has hit $500M in annualized revenue, with 1 million new projects a week — https://techcrunch.com/2026/06/09/lovable-says-it-has-hit-500m-in-annualized-revenue-with-1-million-new-projects-a-week/
[blog] Build your own Flutter GenUI solution with Gemini structured outputs — https://medium.com/flutter-community/build-your-own-flutter-genui-framework-with-gemini-structured-outputs-a6db3653b9b6
[blog] The current impact of AI on engineering velocity — https://newsletter.getdx.com/p/the-current-impact-of-ai-on-engineering
[blog] Deep dive: How Lightning Engine delivers 4.9x faster Apache Spark performance — https://cloud.google.com/blog/products/data-analytics/lighting-engine-for-apache-spark-performance-deep-dive/
[blog] DiffusionGemma: 4x faster text generation — https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/
[article] Anthropic says AI can turn software patches into exploits within hours — https://www.developer-tech.com/news/anthropic-ai-software-patch-exploits/
[blog] From Gemini CLI to Antigravity CLI: Automated OWASP Security Compliance and Agentic Remediation in Your Terminal — https://medium.com/@terraccianosaverio/from-gemini-cli-to-antigravity-cli-automated-owasp-security-compliance-and-agentic-remediation-in-1874b5982dd4
[blog] Fluid, natural voice translation with Gemini 3.5 Live Translate — https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-live-3-5-translate/