Seroter's Daily Reading: #772 (April 28, 2026)

Listen: https://blossom.nostr.xyz/44a2fe2d9ce7422507329bad9d0ce0574d272e299eb1a6293d859da35779c47b.mpga

Episode 772 — April 28, 2026

Welcome back to Seroter's Daily Reading for April 28th, episode 772. Richard is apparently still on vacation, which means I'm working through the highlights for you. Let's get into it.

Leading off this week is a meaty Stratechery interview with Google Cloud CEO Thomas Kurian about the Agentic Moment, recorded just ahead of the Google Cloud Next conference. If you've read many Stratechery interviews, you know they're long-form and dense, and this one is no exception. Kurian walks through Google's framing for this year: the shift from using AI models in a chatbot-style Q&A mode to actually automating tasks and entire process flows. He argues that three things have changed since last year that make this viable now. First, model capabilities — Gemini can reason much more effectively. Second, long-running memory — agents can maintain state across many steps. Third, better abstractions for tool use and interaction, things like MCP, the Model Context Protocol. All of that has matured enough to make real automation possible.

What's interesting is Kurian's defense of Google's approach when pushed on whether Gemini agents actually work, especially given the recent buzz around Anthropic and Claude. He basically says look, we're generating sixteen billion tokens a minute now, up from ten billion just a few months ago, and you should hear from five hundred customers at Next about what they're building. He also points out that Google's unusual integration advantage — having the whole stack from TPUs to the knowledge graph to enterprise data tools — is what makes the agent platform coherent. Everything from agent identity and permissions to security tooling to data context is built to work together. It's a confident posture, and frankly, the financial results seem to back it up. Kurian also noted that Google Cloud runs on the same stack as Google itself, which is a nice consistency signal.

Speaking of Seroter, there's an interview with him in The New Stack from the same conference where he's characteristically blunt. He says developer loyalty is at zero right now, and he's not worried about it. His argument is that Google's job isn't necessarily to make the best standalone AI coding tool — it's to make Google Cloud the best place to run whatever tool you're using. He puts it this way: even if you're using Claude Code, you should be using Vertex AI, because the best performing way to run Anthropic's models is on Google Cloud over Azure, AWS, or even Anthropic's own API. He also pushed back on the criticism from former Googler Steve Yegge about engineers not being allowed to use the best tools internally. Seroter's counter is essentially that Google runs some of the most critical infrastructure on the planet and can't afford to vibe code the next version of Maps. Speed matters, but so does not breaking the internet. It's a defensible position, even if it's not as exciting as a startup moving fast.

On a related note, there's a piece from The Cube Research that gives an inside view of the live demo environment at Google Cloud Next. The author got backstage access and found that the demo infrastructure was fully operational and production-grade, not just a script. Agents were instrumented with observability tooling, traces, and token usage tracking, and fallback paths were pre-configured. The takeaway is that polished AI demos now reflect real enterprise architecture rather than aspirational prototypes. That's a meaningful shift in how these keynotes function as signals about where the industry actually is.

Shifting to the craft of software development, there's an Architecture Weekly piece by Oskar on vibing, harness and OODA loop. If you're not familiar, the OODA loop is John Boyd's military framework: Observe, Orient, Decide, Act. The argument here is that vibe coding — throwing together a proof of concept over a weekend with AI tools — is essentially skipping the Observe and Orient phases and going straight to Act. It feels fast, but it's only fast because you're not actually verifying anything. The OODA loop makes the point that the faster you Act, the faster you need to Observe. If your Act phase takes seconds but your Observe phase requires manually poking around a system, your feedback loop is broken. The solution is a harness — automated tests, observability, traces — that lets you actually observe what your code is doing. Oskar walks through building an observability setup with Grafana, Prometheus, and Docker Compose for his Emmett project, showing how you start with a vibed config, then automate the startup and verification steps so the whole thing becomes reproducible. The punchline is that LLMs have made Act fast, but Observe still takes as long as it always did. Without a harness, you're not going faster — you're just making more stuff you haven't checked.
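For anyone reading along, here is a minimal sketch of what automating the Observe step can look like. It is not Oskar's setup: the function name wait_until_healthy and its parameters are made up for illustration, and the check is passed in as a plain callable so the polling logic can be exercised without a running stack.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` elapses.

    `check` is a hypothetical probe (for example an HTTP readiness call).
    Exceptions raised by the probe are treated as "not ready yet".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if check():
                return True
        except Exception:
            pass  # service is probably still starting; retry
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

In a local Docker Compose setup you might pass a probe that requests Prometheus's /-/ready endpoint or Grafana's /api/health endpoint and returns True on a 200 response. The ports and URLs depend on your own compose file, so treat those as assumptions. The point is that the startup check becomes a repeatable command instead of manual poking.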

Also on the theme of developer practices, there's a JetBrains blog post on AI in DevOps: Why Adoption Lags in CI/CD. The data shows that ninety percent of developers use AI tools daily, but seventy-three percent of organizations don't use AI in CI/CD at all. The reason isn't technical — it's about trust. Pipelines are evidence systems. They exist to give teams confidence that a change is safe to ship. AI introduces more variability into that system, which makes the trust problem worse, not better. The article lays out a maturity model: first you use AI just for failure diagnosis — parsing logs, finding root causes. Then AI-assisted proposals — suggesting fixes, opening pull requests, but humans still review everything. Finally, agentic workflows where AI can trigger actions within pipelines, but with strict governance and audit trails. Most teams are still in the first two stages. The conclusion is that as AI-generated code becomes more common, CI/CD becomes more critical, and the unanswered questions are about how to govern autonomous agents at scale. What does human oversight look like when a pipeline is processing hundreds of AI-generated changes per day?
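To make the three stages concrete, here is a rough sketch of how a pipeline might gate AI actions by maturity stage and keep an audit trail. Nothing here comes from the JetBrains post: the Stage names, the action names, and the Gate class are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Stage(IntEnum):
    DIAGNOSE = 1  # AI parses logs and suggests root causes
    PROPOSE = 2   # AI suggests fixes and opens pull requests; humans review
    AGENTIC = 3   # AI may trigger pipeline actions under governance


# Hypothetical mapping from action to the minimum stage that permits it.
REQUIRED_STAGE = {
    "summarize_failure": Stage.DIAGNOSE,
    "open_pull_request": Stage.PROPOSE,
    "rerun_job": Stage.AGENTIC,
}


@dataclass
class Gate:
    stage: Stage
    audit_log: list = field(default_factory=list)

    def authorize(self, actor: str, action: str) -> bool:
        """Allow an action only if the team's stage covers it; log every request."""
        required = REQUIRED_STAGE.get(action)
        allowed = required is not None and self.stage >= required
        self.audit_log.append((actor, action, allowed))
        return allowed
```

Unknown actions are denied by default, and denied requests are logged alongside approved ones, which is the part that makes the pipeline usable as evidence.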

On the developer career side, there's a ShiftMag article in which four engineers talk about staying relevant in the AI era. There's broad agreement that the fundamentals matter more than ever, and that M-shaped or T-shaped skill profiles are both viable paths. A few points stood out. Mario argues that AI can generate code but humans still own it, and the responsibility problem doesn't go away just because writing the code was easy. He also thinks you can't skip writing code by hand, because that's how you build the intuition you need to evaluate what AI generates. Denis makes the point that AI tools have made coding skills almost irrelevant, but other skills (trunk-based development, TDD, continuous delivery, modularity, DDD) are more valuable than ever. Marina notes that younger developers need to learn with AI, not just watch AI work, and that means questioning outputs and understanding how changes affect the system. The consensus is that the fundamentals haven't gone away; they've just become the differentiator.

On a more technical note, there's a post from Tim Kellogg on Agent Memory Patterns. He identifies three common kinds of mutable memory: files, memory blocks, and skills. Files are for data and knowledge with hierarchical paths. Memory blocks are a flat key-value store that gets included inline in the system or user prompt — useful for behavior, preferences, and anything you want guaranteed visibility into. Skills are essentially indexed files with a description field that goes into the system prompt as a trigger, which encourages the agent to use them at the right time. One interesting idea is using skills as an experience cache — at the end of a long task, have the agent record what it learned in a skill so it can pick up faster next time. He also recommends versioning files and memory blocks with git so you have checkpoints and can rollback bad changes. His rule of thumb is to keep memory blocks under five thousand characters; when they get too big, they confuse the agent.
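Here is a small sketch of how memory blocks and skills might fit together, under my own assumptions rather than Tim Kellogg's actual code. The class names, the prompt layout, and the choice to reject oversized blocks outright are all invented for illustration. Only the five-thousand-character rule of thumb comes from the post.

```python
from dataclasses import dataclass, field

MAX_BLOCK_CHARS = 5000  # rule of thumb from the post


@dataclass
class Skill:
    name: str
    description: str  # short trigger text that goes into the system prompt
    body: str         # full instructions, read only when the skill is used


@dataclass
class AgentMemory:
    blocks: dict[str, str] = field(default_factory=dict)   # flat key-value store
    skills: dict[str, Skill] = field(default_factory=dict)

    def set_block(self, key: str, value: str) -> None:
        """Store a memory block, refusing ones that exceed the size limit."""
        if len(value) > MAX_BLOCK_CHARS:
            raise ValueError(f"block {key!r} exceeds {MAX_BLOCK_CHARS} characters")
        self.blocks[key] = value

    def add_skill(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def system_prompt(self, base: str) -> str:
        """Inline every block in full, but list skills by description only."""
        lines = [base, "", "Memory blocks:"]
        lines += [f"[{key}] {value}" for key, value in self.blocks.items()]
        lines += ["", "Skills:"]
        lines += [f"- {s.name}: {s.description}" for s in self.skills.values()]
        return "\n".join(lines)
```

The design choice worth noticing is the asymmetry: blocks are always visible in full, while a skill contributes only its description until the agent decides to open it. An experience cache would just be an add_skill call at the end of a long task.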

On the infrastructure side, Google published a deep dive on building real-world on-device AI with LiteRT and NPU. LiteRT is their cross-platform framework for running models on mobile, desktop, and IoT, and the key message is that NPUs, or neural processing units, deliver dramatically better performance for AI workloads than general-purpose GPUs while using less power. Google Meet uses it to run a segmentation model twenty-five times larger than previous versions without draining battery during a typical call. Epic Games uses it for real-time MetaHuman facial animation at thirty frames per second. Argmax used it to get a speedup of more than two times by moving speech recognition from GPU to NPU. The framework abstracts away vendor-specific SDK complexity so developers can target different hardware with one API. This is the kind of infrastructure story that doesn't get as much attention as frontier model releases but is quietly enabling a lot of on-device AI experiences.

Google also announced they're donating their Agent Payments Protocol, known as AP2, to the FIDO Alliance. The idea is that for AI agents to scale in commerce, the payment layer needs to be open and platform-agnostic. Version 0.2 introduces what they call Human Not Present payments: the ability for agents to execute purchases autonomously based on pre-authorized user instructions, like buying limited-run tickets the moment they go on sale. They're also contributing Verifiable Intent, a tamper-proof log standard co-developed with Mastercard, to ensure accountability for agent actions. This is exactly the kind of plumbing that works best as an open standard, and it's good to see it getting formalized rather than staying proprietary.
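To picture what "pre-authorized user instructions" could mean in practice, here is a toy sketch. It is explicitly not the AP2 data model: every type and field name below is invented, and the real protocol defines its own credentials and signing. It only shows the basic shape of checking an agent's purchase against limits the user set in advance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PreAuthorization:
    """Hypothetical limits a user sets before the agent acts alone."""
    merchant: str
    item: str
    max_price_cents: int
    expires_at: datetime


@dataclass(frozen=True)
class PurchaseAttempt:
    """Hypothetical purchase the agent wants to make."""
    merchant: str
    item: str
    price_cents: int
    at: datetime


def is_within_authorization(auth: PreAuthorization, attempt: PurchaseAttempt) -> bool:
    """Return True only if the attempt matches every pre-authorized limit."""
    return (
        attempt.merchant == auth.merchant
        and attempt.item == auth.item
        and attempt.price_cents <= auth.max_price_cents
        and attempt.at <= auth.expires_at
    )
```

In the ticket example, the user would authorize one merchant, one event, a price ceiling, and an expiry, and the agent's purchase would go through only if it stays inside all four.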

On the enterprise software side, Harvard Business Review has a piece on the End of One-Size-Fits-All Enterprise Software. The argument is that generative AI is dissolving the economic logic that made standardized software the only practical choice. When AI can generate and adapt workflows on the fly, the value shifts to owning the right processes and knowing which workflows you actually need to control versus buy. It's a leadership question as much as a technical one.

Finally, there's an InfoWorld piece on Google putting the guardrails on agentic AI, which makes the point that the agent is the least interesting part of the architecture. The real work is identity, permissions, workflow boundaries, data quality, retrieval, memory, evaluation, audit trails, and cost controls. The scary statistic is that more than a third of organizations aren't confident they can stop a misbehaving agent quickly. That's a real operational risk that flashy demos don't address.

That's a lot of ground covered. The thread I'd pull out is the tension between speed and rigor — whether it's vibe coding without a harness, CI/CD pipelines that can't keep up with AI-generated changes, or the governance questions around agentic payments and permissions. AI makes it easier to build things, but the discipline required to operate those things safely is as demanding as ever. Thanks for listening, and we'll see you next time.