Seroter's Daily Reading #803 (June 11, 2026)

Listen: https://blossom.nostr.xyz/32dcde6c5f98df9af153732f36a6c1ddcea7bd35cbd7984323a5a430bb786365.mpga

Seroter's Daily Reading episode 803, June 11, 2026.

Let's dig in.

Starting with a piece from Elena Verna on Your AI strategy has a trust problem, not a tooling problem. She makes a point that lands pretty hard: most companies already have the technology they need to move faster. The blocker isn't the tools, it's the company systems designed to prevent things from happening. Approval cycles that exhaust everyone, rigid role boundaries, title-based hierarchies that gate access to information. It all sends the same message: we don't trust you, stay in your lane.

Her argument is that if you don't give employees actual agency, meaning real access and autonomy to make decisions on behalf of the organization, your AI investment goes nowhere. The interesting move is comparing this to how AI-native companies operate. Anthropic, she points out, basically runs on "member of technical staff" and "member of non-technical staff." Not because titles are evil, but because they become permission structures that decide who gets context, who gets heard, and who gets to make decisions. That's exactly the kind of drag AI-native companies are trying to avoid.

The other thread here is that this isn't just an external customer trust problem. You can't build trust with customers if you don't have trust inside your own company first. Build in public, react quickly to user needs, share failures honestly. All of that requires trusting your employees enough to let them do it, and giving them the access to make it happen.

The Google Cloud blog digs into the DORA research on How to unlock true ROI in software development. The key insight is the J-curve: most organizations hit a temporary productivity dip when adopting AI tools before they see real returns. Why? Three reasons. The learning curve as teams adapt workflows. The verification tax, where you have to rigorously review all the code AI generates. And pipeline adaptation, since downstream processes like testing and change approvals become bottlenecks when individual developers suddenly generate way more code.

The take is that budgeting for this early phase is key, and knowing it's normal rather than a sign your strategy is failing. That initial dip is an investment in long-term speed.

Loop engineering is the hot topic this week, and we've got three pieces on it. Addy Osmani's piece on Loop Engineering is the deep dive. The basic idea is replacing yourself as the person who prompts the agent. Instead of sitting there, typing a prompt, reading the output, typing the next prompt, you design a system that does that loop on its own. He describes it as moving from operating a lathe to designing the production line the lathe sits on.

The loop needs five components plus memory. Automations that fire on a schedule and do discovery and triage. Worktrees so multiple agents can run in parallel without clobbering each other's files. Skills to codify project knowledge so you stop re-explaining your project every session. Connectors to plug the agent into your real tools like issue trackers and Slack. And sub-agents to split the maker from the checker, since a model grading its own output is way too generous.

The memory piece is the spine of the whole thing. A file on disk, or a Linear board, that holds what's done and what's next. The model forgets everything between runs, but the repo doesn't. Both Codex and Claude Code now ship all six primitives, which is why this pattern has a name now instead of being some custom bash hack.
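To make the memory idea concrete, here's a minimal sketch of a loop that keeps its state in a JSON file and splits the maker from the checker. Everything in it is an assumption for illustration: the file name, the `todo` / `done` / `needs_review` keys, and the stand-in `fake_maker` and `fake_checker` functions are hypothetical, and none of this is how Codex or Claude Code implement it.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_loop(
    state_path: Path,
    maker: Callable[[str], str],
    checker: Callable[[str, str], bool],
    max_iterations: int = 10,
) -> dict:
    """Work through the todo list, persisting progress after every task."""
    # The state file is the loop's memory: it survives between runs.
    state = json.loads(state_path.read_text())
    for _ in range(max_iterations):
        if not state["todo"]:
            break
        task = state["todo"].pop(0)
        output = maker(task)  # one agent produces the work
        # A separate checker grades it, so the maker never grades itself.
        bucket = "done" if checker(task, output) else "needs_review"
        state[bucket].append(task)
        state_path.write_text(json.dumps(state, indent=2))
    return state


def demo() -> dict:
    """Run the loop with stand-in maker and checker functions."""

    def fake_maker(task: str) -> str:
        return f"patch for: {task}"

    def fake_checker(task: str, output: str) -> bool:
        # Hypothetical rule: anything touching auth goes to a human.
        return "auth" not in task

    with tempfile.TemporaryDirectory() as tmp:
        state_path = Path(tmp) / "loop_state.json"
        initial = {
            "todo": ["fix typo in README", "rewrite auth flow"],
            "done": [],
            "needs_review": [],
        }
        state_path.write_text(json.dumps(initial))
        run_loop(state_path, fake_maker, fake_checker)
        # Re-read from disk: the memory lives in the file, not the process.
        return json.loads(state_path.read_text())
```

In a real setup the maker and checker would be separate agent calls, and the state file would sit in the repo so the next scheduled run picks up where this one stopped.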

Osmani's honest about what loops still don't do for you. Verification is still on you. A loop running unattended makes mistakes unattended. And comprehension debt is the real danger: the faster the loop ships code you didn't write, the bigger the gap between what exists and what you actually understand. He calls the trap cognitive surrender, where it's tempting to just take whatever the loop gives back without having an opinion. Two people can build the exact same loop and get opposite outcomes. One moves faster on work they understand deeply. The other uses it to avoid understanding the work at all. The loop doesn't know the difference. You do.

The Daily Dose of DS piece on Loop Engineering: Design the System That Prompts Agents covers the same territory with the maker-checker split as the most consequential design choice. The key reminder is setting the exit condition before you start the loop. A loop with no stop condition churns and burns tokens fast. Decide on something like "fix major issues only, run one final pass, then stop after two loops with all tests passing and lint clean" before you kick it off, not while it's running.
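One way to pin the exit condition down before the loop starts is to write it as data and check it after every pass. This is a sketch under my own reading of that example: stop as soon as tests and lint are green, and stop anyway once two loops are used up. The `ExitCondition`, `LoopStatus`, and `stop_reason` names are hypothetical and not from the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExitCondition:
    """Decided before the loop starts; frozen so it can't drift mid-run."""
    max_loops: int = 2
    require_tests_passing: bool = True
    require_lint_clean: bool = True


@dataclass
class LoopStatus:
    """What the loop reports after each pass."""
    loops_completed: int
    tests_passing: bool
    lint_clean: bool


def stop_reason(cond: ExitCondition, status: LoopStatus) -> Optional[str]:
    """Return why the loop should stop, or None if it should keep going."""
    tests_ok = status.tests_passing or not cond.require_tests_passing
    lint_ok = status.lint_clean or not cond.require_lint_clean
    if tests_ok and lint_ok:
        return "gates green"
    if status.loops_completed >= cond.max_loops:
        # Hard cap: stop burning tokens even though the gates are still red.
        return "loop budget exhausted"
    return None
```

The hard cap is the part that matters for cost. Without it, a loop whose tests never go green has no reason to stop.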

And a piece from The New Stack on the same topic, pulling in Boris Cherny from Anthropic on The Anthropic leader who built Claude Code says he ditched prompting. He stopped prompting Claude directly. His job now is writing the loops. The building blocks map almost identically between Codex and Claude Code, which means the assembly is done and the pattern is real. The piece notes that the vendor who makes loop definitions portable will be in a strong position.

Then there's a piece on How Gemini Managed Agents Works under the Hood from Phil Schmid showing this in action. Five lines of code, one API call, and you get back a finished PDF with charts and a summary. Behind that call, a sandbox boots, skills load, and the model enters a loop where it reasons, picks tools, executes code, reads the output, and repeats until the task is done. It's loop engineering as a managed service, basically. An execution loop running in an isolated Linux container with 4 vCPUs and 16GB of RAM.
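The reason, pick a tool, execute, read the output, repeat cycle can be sketched in a few lines. To be clear, this is not the Gemini API and it makes no network calls: `agent_loop`, `scripted_decide`, and the `word_count` tool are hypothetical stand-ins, with a scripted function playing the part of the model.

```python
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, str]]


def agent_loop(
    decide: Callable[[History], dict],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 10,
) -> str:
    """Ask for an action, run the chosen tool, feed the result back, repeat."""
    history: History = [("task", task)]
    for _ in range(max_steps):
        action = decide(history)  # the "reasoning" step
        if action["type"] == "finish":
            return action["answer"]
        tool = tools[action["tool"]]  # pick a tool
        observation = tool(action["input"])  # execute it
        history.append((action["tool"], observation))  # read the output
    raise RuntimeError("step budget exhausted before the task finished")


def scripted_decide(history: History) -> dict:
    """Stand-in for the model: call one tool, then finish with its result."""
    if len(history) == 1:
        return {"type": "tool", "tool": "word_count", "input": history[0][1]}
    return {"type": "finish", "answer": f"{history[-1][1]} words"}


TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}
```

Calling `agent_loop(scripted_decide, TOOLS, "summarize this short report")` runs one tool step and then finishes. In the managed version, the sandbox is where the tool execution step happens.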

An interesting data point from CIO Dive on Employees spend more time managing AI than producing work. The work is changing, so the current state isn't necessarily permanent, but it's worth noting as context for how AI adoption is actually going for a lot of teams right now.

From the LDX3 engineering leadership conference on Engineering leadership lessons from LDX3 2026, a few themes worth pulling out. Engineering managers are getting squeezed. Scope is increasing, direct report counts are going up, working hours are longer, and one in three managers are actively considering going back to individual contributor work. The driver isn't only AI, it's that budget that previously funded headcount is now going to GPUs. Same manager, covering more with fewer people underneath.

Nicole Forsgren from Google gave a talk on how AI is amplifying existing friction. Before AI coding tools, engineering systems had friction everywhere, but natural speed limits meant that friction wasn't the bottleneck. Now that engineers can produce code dramatically faster, the friction that was always there has become the thing slowing everything down. She mapped it into three types: velocity friction like slow builds and manual approval gates, cognitive friction like unclear criteria and constant context switching, and knowledge friction where engineers can't find architectural intent or understand adjacent systems.

The last one is the one that stuck with me. AI can generate the code. It cannot generate the context. And in most teams, context lives in someone's head, which means it's not really accessible at all. That connects back to Verna's piece on trust, actually. If information is gated behind org structure and hierarchy, AI can't help you access it either.

On hiring, a talk from Meta on how they're handling candidates using AI during interviews. They went back to first principles and rebuilt the whole thing. They give candidates a full codebase, an AI assistant they're allowed to use freely, and the interviewers assess things that don't change regardless of the tool: how candidates explore a problem before diving in, the quality and maintainability of the code they produce, how they validate and question the AI's output rather than just accepting it, and two layers of communication. How they talk to the interviewer verbally, and how they write prompts to the AI. They ran nine thousand interviews using this format before going fully live in April.

On infrastructure, Google Cloud published benchmark results for its GKE Inference Gateway in Report: GKE Inference Gateway delivers up to 92% faster AI responses. The secret is prefix caching: storing the KV cache of long, repetitive prompt prefixes so the model skips reprocessing tokens it already handled. Snap is seeing prefix cache hit rates of 75 to 80% using this approach.
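Here's a toy illustration of why prefix caching saves work. It is not how GKE Inference Gateway or any real inference server is built: the `PrefixCache` class is hypothetical, a string like `kv(token)` stands in for real attention state, and the cache stores every prefix, which a real system would never do.

```python
from typing import Dict, List, Tuple


class PrefixCache:
    """Reuse the cached state for the longest prompt prefix seen before."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, ...], List[str]] = {}
        self.tokens_processed = 0
        self.tokens_reused = 0

    def _process(self, token: str) -> str:
        # Stand-in for the expensive per-token computation.
        self.tokens_processed += 1
        return f"kv({token})"

    def run(self, tokens: List[str]) -> List[str]:
        # Find the longest prefix of this prompt that is already cached.
        best: Tuple[str, ...] = ()
        for n in range(len(tokens), 0, -1):
            key = tuple(tokens[:n])
            if key in self._store:
                best = key
                break
        kv = list(self._store.get(best, []))
        self.tokens_reused += len(best)
        # Only the tokens after the cached prefix are processed.
        for i in range(len(best), len(tokens)):
            kv.append(self._process(tokens[i]))
            self._store[tuple(tokens[: i + 1])] = list(kv)
        return kv
```

Send two prompts that share a five-token system prompt and the second request only processes its final token. That's the effect a high prefix cache hit rate buys you at scale.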

And from The New Stack, data on "AI teams now deploy 1,000 times a month. Your pipeline wasn't built for that." If you're still deploying once a week, you're competing against teams doing it 175 times more frequently. But the piece is careful about the caveat: speed without direction is wasted motion. They use a bullseye model to describe this. The goal isn't just more code, it's moving toward the ideal product. More shots at the target is great, but you have to know where the arrows are landing. If your pipelines are groaning under AI-generated code volume and you're not seeing the feedback loops that tell you whether you're moving in the right direction, you have a problem.

If you want to actually deploy an AI agent, there's a step-by-step tutorial on How to deploy a Google Agent Development Kit (ADK) agent to Google Cloud Run. Five lines of code gets you a running agent, you configure environment variables, run the ADK deploy command, and you've got a serverless agent handling real-world traffic.

Finally, RedMonk on The Unbundling and Bundling of the PaaS Market. Twenty years ago, IaaS won over PaaS because it looked like what enterprises were used to, while PaaS was opaque and constrained. But now the number of primitives has become a burden, and coding assistants prefer abstract platforms anyway because they're easier to programmatically manipulate. So we have this landscape of abstractions: AI app builders like Bolt and Lovable, general-purpose PaaS like Cloud Run and Fly.io, front-end platforms like Cloudflare and Vercel, backend-as-a-service like Firebase and Supabase. The interesting move is that these categories are colliding. Lovable is going from app builder to PaaS. Replit is going from PaaS to app builder. Convex is going from BaaS to app builder. And this rebundling is happening within a 25-month window. The unbundling inevitably came for PaaS, just like it once did for databases. What followed databases was rebundling into multi-workload databases. Expect the same here.

And wrapping up with a positive note from Google on Growing the next generation of American workers. They're investing 50 million dollars through Google.org to help prepare over 300,000 American workers for skilled trades careers. Welders, pipefitters, electricians, fiber technicians. It's the builder era, and that includes all types of builders.