Seroter's Daily Reading — #763 (April 14, 2026)
Listen: https://blossom.nostr.xyz/957b7f152c9fee638a87ccd86c752b6eb5eefc052b613a7d552f1c37d49acc57.mpga
Source: Seroter's Original Post
Seroter's Daily Reading, episode 763. April 14, 2026.
A mostly fun and frantic day. But the opposite is no good, so we'll take it.
Let's start with an article that gave me a lot to think about. Addy Osmani has been writing about what he's calling Agentic Engine Optimization — AEO for short. (article) The premise is straightforward but the implications are worth sitting with. AI coding agents like Claude Code and Cursor are now a significant portion of the traffic hitting developer documentation. But they don't read those docs the way humans do. They issue a single HTTP request, receive the full page, and move on. No scroll depth. No time-on-page. No click-throughs. Traditional analytics shows nothing, even when the agent read every word.
Osmani pulls from a recent research paper that studied HTTP traffic from nine major AI coding agents. The behavioral fingerprints are fascinating. Aider uses headless Chromium with a full Mozilla user-agent. Claude Code uses Axios. Cline uses curl and does this interesting OpenAPI sweep behavior. Cursor does a HEAD probe before GET. The agents compress what would be a multi-minute human browsing session into one or two requests. Your carefully designed user journey just became a single server-side event, and your analytics logged nothing useful.
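If you want to see this traffic in your own logs, a rough first pass might look like the sketch below. It buckets requests by HTTP client family and flags the HEAD-then-GET pattern. The function names and bucket labels are made up for illustration, and a client family is only a hint: plenty of non-agent tools also use Axios and curl, so this does not identify a specific agent.

```python
def client_family(user_agent: str) -> str:
    """Bucket a User-Agent header by HTTP client family (a hint, not an identity)."""
    ua = user_agent.lower()
    if ua.startswith("axios/"):
        return "axios"
    if ua.startswith("curl/"):
        return "curl"
    if "headlesschrome" in ua:
        return "headless-chromium"
    if ua.startswith("mozilla/"):
        return "browser-like"
    return "other"


def head_then_get(requests):
    """Return paths where a HEAD request is immediately followed by a GET for the same path.

    `requests` is an ordered list of (method, path) tuples from a single client.
    """
    probed = []
    for (first_method, first_path), (next_method, next_path) in zip(requests, requests[1:]):
        if first_method == "HEAD" and next_method == "GET" and first_path == next_path:
            probed.append(next_path)
    return probed
```

Counting requests per family, per documentation page, gives you at least a server-side signal where client-side analytics shows nothing.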
There's also a token problem that I think is underappreciated. The paper cites Cisco's Secure Firewall Management Center REST API Quick Start Guide at nearly 193,000 tokens. That's close to 720,000 characters in a single document. Most agents have practical context limits between 100 and 200 thousand tokens. When an agent hits that document, it truncates silently, skips it entirely, or falls back to hallucinating, and you can't tell which. Since no analytics event fires, you never learn that anything went wrong.
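To make the size problem concrete, here is a minimal sketch that estimates whether a document fits an agent's context window. The characters-per-token ratio is simply derived from the two figures quoted above, so treat it as a heuristic. Real tokenizers differ, and the 25% reserve for the agent's own prompt and output is an assumption.

```python
# Ratio implied by the figures quoted above: about 720,000 characters
# for about 193,000 tokens. A heuristic only; real tokenizers vary.
CHARS_PER_TOKEN = 720_000 / 193_000


def estimate_tokens(char_count: int) -> int:
    """Estimate a token count from a character count."""
    return round(char_count / CHARS_PER_TOKEN)


def fits_context(char_count: int, context_limit: int, reserve: float = 0.25) -> bool:
    """Check whether a document fits after reserving part of the window.

    `reserve` is the fraction of the window held back for the agent's own
    instructions and output (an assumed value, not a measured one).
    """
    budget = int(context_limit * (1 - reserve))
    return estimate_tokens(char_count) <= budget


if __name__ == "__main__":
    for limit in (100_000, 200_000):
        print(limit, fits_context(720_000, limit))
```

Under these assumptions, the 720,000-character guide does not fit at either end of the 100 to 200 thousand token range.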
So AEO is the discipline of structuring documentation so agents can actually use it. Osmani lays out a layered approach. Layer one is robots.txt — agents check this first to see if they're allowed in. A misconfigured block on known AI crawlers silently locks agents out. Ten minutes of auditing could fix this for a lot of teams. Layer two is llms.txt — think of it as a sitemap for agents. It's a flat Markdown file at the root that provides a structured index with descriptions and even token counts per page, so agents can decide whether to read something based on whether it fits their context window. Layer three is skill.md — this tells agents what your API can actually do, not just how to call it. It's a declarative mapping of capabilities to endpoints.
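The robots.txt audit from layer one can be sketched with Python's standard `urllib.robotparser`. The robots.txt content, the user-agent tokens, and the example.com URL below are illustrative assumptions. Swap in your own file and whichever agent tokens you care about.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one AI crawler token is blocked, everything else is allowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def audit_robots(robots_txt: str, user_agents, url: str) -> dict:
    """Return {user_agent: allowed} for one URL, parsed from robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {ua: parser.can_fetch(ua, url) for ua in user_agents}


if __name__ == "__main__":
    report = audit_robots(
        SAMPLE_ROBOTS_TXT,
        ["GPTBot", "ClaudeBot"],  # example tokens; check each vendor's docs for the real ones
        "https://example.com/docs/quickstart",
    )
    for agent, allowed in report.items():
        print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

Parsing the text locally, rather than fetching it, keeps the check easy to run in CI against the file you are about to deploy.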
Layer four is content formatting for agent parsing. Serve Markdown, not just HTML. Strip the navigation noise. Structure headings consistently. Lead each page with the outcome in the first 200 words. Layer five is token surfacing — actually expose token counts as page metadata so agents can make informed decisions about what to include. Layer six is the copy-for-AI button, which copies clean Markdown instead of rendered HTML with all its chrome. Anthropic and Cloudflare have already shipped variants of this.
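Here is one way an agent might use published token counts, as a sketch. The `PageEntry` structure and the sample index are hypothetical and not taken from any llms.txt format. The point is just that per-page counts let an agent skip what won't fit instead of truncating blindly.

```python
from dataclasses import dataclass


@dataclass
class PageEntry:
    path: str
    tokens: int  # token count the site publishes for this page (hypothetical metadata)


def select_pages(index, budget: int):
    """Greedily pick pages, in index order, whose published token counts fit the budget."""
    chosen, used = [], 0
    for page in index:
        if used + page.tokens <= budget:
            chosen.append(page)
            used += page.tokens
    return chosen


if __name__ == "__main__":
    index = [
        PageEntry("/docs/quickstart.md", 1_200),
        PageEntry("/docs/auth.md", 3_400),
        PageEntry("/docs/api-reference.md", 193_000),  # too large for the budget below
        PageEntry("/docs/webhooks.md", 2_100),
    ]
    for page in select_pages(index, budget=100_000):
        print(page.path, page.tokens)
```

Greedy selection in index order is the simplest policy. A real agent would more likely rank pages by relevance first and then apply the budget.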
Osmani also flags AGENTS.md as an emerging default. Just as README.md became the entry point for human developers, AGENTS.md is becoming the entry point for AI agents exploring a repository. Cisco DevNet has already adopted it as standard in their GitHub templates, pre-populated with project-specific content and links to sandboxes. The audit checklist at the end of the article is worth working through for anyone maintaining a developer portal.
The broader point is that most documentation was designed for human cognitive patterns — visual hierarchy, progressive disclosure, interactive examples. In an agent-heavy world, those assumptions break down. The best docs going forward will need to serve both audiences at once. Scannable for humans, machine-readable and token-efficient for agents. SEO taught us that great content isn't enough if nobody can find it in the format that matters for the traffic of the era. AEO is the same lesson for a different consumer.
There's a survey worth discussing next. VentureBeat covered a study from Lightrun's 2026 State of AI-Powered Engineering Report — 200 senior SRE and DevOps leaders at large enterprises across the US, UK, and EU. (article) The headline number: 43% of AI-generated code changes require manual debugging in production environments even after passing QA and staging. Not a single respondent could verify an AI-suggested fix with just one redeploy cycle. 88% needed two to three cycles, 11% needed four to six.
Let that sink in. The validation pipeline built for human-scale engineering is being asked to handle code volumes that human engineers didn't write and in many cases can't fully reason about. And it's breaking. Google DORA's 2025 report found AI adoption correlates with an almost 10% increase in code instability. 30% of developers report little or no trust in AI-generated code. And here's the number I keep coming back to: developers now spend an average of 38% of their work week — roughly two full days — on debugging, verification, and environment-specific troubleshooting. That's not the productivity dividend enterprise leaders expected.
The Amazon outages from early March illustrate the real-world consequences. On March 2nd, Amazon.com went down for nearly six hours, with 120,000 lost orders and 1.6 million website errors. Three days later, a more severe outage caused a 99% drop in US order volume, or approximately 6.3 million lost orders. Both were traced to AI-assisted code changes deployed without proper approval. Amazon's response was a 90-day code safety reset across 335 critical systems, with senior engineer approval now required for all AI-assisted changes.
The deeper structural problem the report identifies is what they call the runtime visibility gap. 60% of respondents said lack of visibility into live system behavior is the primary bottleneck in resolving production incidents. 97% of engineering leaders said their AI SRE agents operate without significant visibility into what's happening in production. Only 1% reported extensive visibility, and not a single respondent claimed full visibility. The observability tools built for human-speed engineering weren't designed for this.
In finance, 74% of engineering teams trust human intuition over AI diagnostics during serious incidents. Perhaps most telling: not a single organization surveyed has moved AI SRE tools into actual production workflows. 90% remain in experimental or pilot mode. The remaining 10% evaluated them and chose not to adopt. Enterprises are spending aggressively on AI for IT operations, but quarantining the tools from the environments where they'd deliver the most value. The machines learned to write the code. Nobody taught them to watch it run.
Let's shift gears. Rich Clayton and Adam Groves have a piece on the SVPG blog about Commercial versus Internal Products. (article) This is a good reminder about the different challenges and urgency for each type. The argument is that internal products have it easier in several ways. Users can't choose a competitor, because the company is literally paying them to use the tool. The usability bar is lower because you can train or document your way through rough edges. Feasibility is easier because scale and performance demands are usually minor. Viability is simpler because the tool operates in a narrow part of the business, largely insulated from the external world.
But commercial products have to compete. And in the AI era, competitors are emerging faster than ever. The PM's job is fundamentally different. It requires deep immersion in marketing, sales, funding, monetization, legal, compliance. The product needs to be so much better than alternatives that customers are willing to switch from whatever solution they're using today. With internal products, the goal is to solve the problem. With commercial products, the goal is to win. And product discovery — actually getting out and understanding what customers need — is typically the difference between success and failure.
The piece also notes something interesting: internal users are increasingly free to vibe-code their own internal tools. So that's a form of competition that internal products now face. The insight isn't that internal products are easy, it's that commercial products are even harder, and the success of the business depends on getting them right.
Max van IJsselmuiden has a blog post on productive procrastination. (article) This one is about the gap between doing productive work and doing the work you actually need to do. He describes finishing a video for his YouTube channel, feeling good about it, and then realizing he was avoiding the older videos he should have been prioritizing. The productivity matrix from Casey Neistat shows four quadrants: things you want to do but shouldn't, things you want to do and should but avoid, things you don't want to do but will do to avoid the main thing, and things you don't want to do but need to. Max's problem falls into the second quadrant: productive work that's not the right work.
He digs into the neuroscience. Procrastination comes from a collision between the limbic system — specifically the amygdala which processes threats and negative emotions — and the prefrontal cortex which handles planning and impulse control. When a task triggers significant negative emotion, anxiety or fear of failure, the amygdala takes over and we avoid the task. Our brains are protecting us from negative emotions.
But there's also the novelty factor. Research from Bunzeck and Düzel shows our brains respond specifically to stimulus novelty. Starting a new project feels exciting because novel stimuli trigger a hippocampal to ventral tegmental area loop. The data from Max's own video editing confirms this — his output per session drops as the time between recording and editing increases. The newer the project, the higher the engagement.
The piece covers several psychological mechanisms. Moral licensing — past good behavior gives us permission to start bad behavior. Completing several productive tasks can trick us into thinking we've accomplished enough. Zeigarnik Effect — unfinished tasks persist in working memory, creating cognitive tension. The solution involves affect labeling — making the negative emotion explicit activates the prefrontal brake. Self-forgiveness — a study showed students who forgave themselves for procrastinating significantly reduced it on the next exam. And making the task a habit, combining a cue with a task to get started.
The insight is that to work on an older project, you have to introduce stimuli to make it feel new. Trick your own brain. And understanding the psychological principles helps with the guilt — your brain is wired to make you feel this way.
Google announced Gemini Robotics-ER 1.6, an upgrade to their reasoning-first model for robots. (article) The key improvement is enhanced spatial logic and multi-view understanding, bringing new levels of autonomy to physical agents. The model specializes in visual and spatial understanding, task planning, and success detection. A new capability called instrument reading lets robots read complex gauges and sight glasses — something developed through collaboration with Boston Dynamics. Importantly, this is Google's safest robotics model to date, demonstrating superior compliance with safety policies on adversarial spatial reasoning tasks. Available now via the Gemini API and Google AI Studio.
InfoWorld had a hands-on review of Google's Agent Development Kit. (article) The reviewer came at it as a longtime consultant with experience across programming languages and databases. His take was fairly balanced — he walked through the core concepts and some practical examples. He noted it was getting better every month. I'm increasingly bullish on this framework myself. The approach Google is taking with ADK — breaking agents down into components like models, tools, and memory — makes it approachable without being constraining. Worth checking out if you're evaluating agent frameworks.
And a quick note on BigQuery Graph — Google announced this the week before their flagship conference. Graph capabilities in BigQuery let you work with connected data natively in the data warehouse rather than having to export to a separate graph database. The announcement post is on the Google Cloud blog if you want the full picture.
That's episode 763. A couple of themes running through this week. The AI code quality problem is becoming impossible to ignore — the gap between what agents produce and what actually works in production is significant, and the tooling to close that gap isn't there yet. On the documentation side, AEO is a concept worth sitting with. Whether you have a developer portal or write code that others depend on, the question of how your content is consumed by increasingly capable agents is only going to grow. And on the personal side, understanding why your brain fights certain tasks doesn't make it easier, but it does make the guilt less acute.
I'm Richard Seroter, for episode 763. See you next time.
Articles covered in episode 763
- Agentic Engine Optimization (AEO)
- 43% of AI-generated code changes need debugging in production, survey finds
- Commercial versus Internal Products
- Productive procrastination
- Gemini Robotics ER-1.6 enhances reasoning to help robots navigate real-world tasks
- Hands-on with the Google Agent Development Kit