Seroter's Daily Reading — #817 (July 2, 2026) — Seroter's Daily Reading

Listen: https://blossom.buildtall.systems/5fb58dc0b5253242f52b0d18df74288c534850e6d2e314aef8997127dcbaf3c3.mpga

Seroter's Daily Reading, episode 817, July 2, 2026.

Today is the day after a holiday here in the US, so I hope everyone had a good one. I'm recording this from a slightly different vantage point than usual, having spent part of the day in the mountains chasing a taco truck down a beach road. There is no better feeling than a completely aimless drive through somewhere beautiful with good food at the end of it. This is why I will never, ever want a self-driving car.

Let's get into the reads. This batch has a strong theme running through it: the infrastructure and culture needed to actually ship things with AI, not just talk about it.

First up, a piece on 7 reasons experienced engineering managers get stuck. This one hits because it applies beyond engineering management. The author walks through traps like mistaking your title for your actual impact, confusing team size with influence, and doing things the way your last company did them. My favorite trap here is number four, about absconding rather than delegating. The author shares feedback they got from an engineer: "when I ping you with a question and you say 'whatever you think, I trust you,' that's not actually helpful. I pinged you for a reason." There is a version of trust that means staying involved, not disappearing entirely. Good stuff for anyone trying to grow as a leader.

Next, Werner Vogels writes about a return to two-pizza culture at Amazon, but with a twist. The classic two-pizza rule was about keeping teams small enough that everyone knew what everyone else was working on without needing meetings. But Vogels makes the case that writing-first product development, the famous working backwards process, needs an update. When a small team of scientists realized they all wanted to build an agentic operating system back in January, one of them spent a single night using a coding agent to build the first prototype. Within a week they had executive support. Within two weeks they had a proper team. Vogels' point is that you learn more in one evening of building than in two weeks of writing about what you think will happen. So the new working backwards starts with a prototype, not a PRFAQ. You build it, use it, find the gaps, and then write the document. The writing is still important, but it's now grounded in something real rather than assumptions. Two pizzas were always about ownership culture, and Vogels argues the tools have finally caught up.

Speaking of learning loops and agents, Andrew Brogdon wrote about how he used Google's Antigravity to build Flutter frontends for agents built with the Agent Development Kit, which he had never touched before. He came up with a five-step iterative loop: execute the current skill, evaluate the output, identify gaps, update the central skill file, and then wipe everything and try again. It took thirteen iterations to get something worth sharing. The interesting part is that he used two agents in parallel, an author agent to refine the skill and a coder agent to use it. The skill isn't just instructions for the coder agent, it's a record of what he was learning. Each iteration produced notes files that structured how the next coder agent thought about the task. This is a practical example of the learning-in-public pattern done right.

On the topic of AI coding tools, Z.ai launched ZCode, a free desktop IDE built around their GLM-5.2 model. This is interesting for a few reasons. GLM-5.2 is a 744 billion parameter mixture-of-experts model trained entirely on Huawei silicon, costing an estimated 25 million dollars. It ranked second globally on Code Arena as of mid-June, trailing only Anthropic's Claude Fable 5. The timing was deliberate. On the same day the US government suspended foreign access to Anthropic's most advanced models, Zhipu open-sourced GLM-5.2 under the MIT license. The market responded: Zhipu's market cap crossed 128 billion dollars. ZCode itself is free, with revenue coming from a subscription plan that undercuts Claude Code by a significant margin. It's also the only major AI IDE with WeChat and Feishu integration for remote control, which speaks directly to the Chinese developer market. Whether a Chinese company can build trust with Western enterprise buyers remains an open question, but the competitive dynamic has definitely shifted.

On the Google side, they announced two new models for the Gemini Enterprise Agent Platform: Nano Banana 2 Lite and Gemini Omni Flash. These are aimed at creative teams that need to move fast and reduce regeneration time.

Over on the security side, Nordic APIs published a guide to the OWASP MCP Top 10. Model Context Protocol has over 10,000 active servers and 97 million monthly SDK downloads, so this is important. The ten vulnerabilities range from token mismanagement and secret exposure to tool poisoning, where an attacker corrupts the metadata of an MCP tool that the LLM then interprets as legitimate. There is a supply chain attack case where a compromised Postmark MCP server harvested user emails. Another interesting one is intent flow subversion, where an attacker embeds harmful instructions inside an agent's context to make it exfiltrate data. The guidance recommends treating all tool descriptions and outputs as untrusted input, using short-lived tokens, and running MCP tools in sandboxed environments. This is the security posture you need to be thinking about if you're integrating MCP at scale.

Google also wrote up AlloyDB AI Functions with a new performance capability called Smart Batching. The problem they were solving is that running an LLM call for every row in a massive database is expensive and slow. Smart Batching deduplicates prompt overhead, sending the LLM's boilerplate instructions once per batch rather than repeating them for every row. AlloyDB intelligently calculates the right batch size automatically. Internal testing showed up to a 2,400 times performance boost, processing 10,000 rows per second. This is currently available for the ai.if and ai.rank functions.

Moving to something completely different, Coté summarized a peer-reviewed study on what actually performs well on LinkedIn. The headline finding is an inversion: the content people post most, like self-promotion and thought leadership, performs worst. The content people post least, like praising others and marking observances, performs best. Interpersonal posts, where you thank or congratulate a specific person, won more head-to-head engagement comparisons than any other category. The researchers' leading theory is mechanical: tags trigger notifications, which extends reach. The advice is simple, almost counterintuitive. Put the insight in the post itself, not behind the link. Link-heavy posts underperformed, presumably because readers click away before reacting. The study also notes that the parable format, the recruiter who taught you about resilience, is the single most mocked format on the platform.

Google also published a post on why they built ADK 2.0, focused on bringing deterministic execution to agents through something called workflows. The core argument is that LLMs are trained for variety and creativity, which is great, but business processes require exact execution. If you know that step B always follows step A, there is no reason to pay the LLM to infer that. ADK 2.0 lets you build a workflow graph where deterministic steps like tool calls and human-in-the-loop checkpoints run at programmatic speeds, and you only invoke the LLM for steps that actually require cognitive reasoning. Their example is a customer refund workflow: fetch purchase history, analyze the complaint with an LLM, issue the refund deterministically, draft an email with an LLM, and update the CRM. By confining the LLM to just two nodes, they saw roughly 50% token savings and 20% latency reduction in benchmarks. They also argue this approach is more secure because even if an LLM node is manipulated with a prompt injection, the workflow graph lacks the pathways to execute unauthorized actions.

Finally, Harvard Business Review on the urgency trap in AI strategy. A MIT report found that 95% of generative AI projects fail. A large-scale survey of over 6,000 senior executives across the US, UK, Germany, and Australia found that roughly 90% reported no measurable productivity improvement from AI over the last three years. The problem is not that AI doesn't work. The problem is how leaders think about it. Rushing headlong in the wrong direction rarely pays off. Purpose matters more than anything else.

That's the batch for today. A few recurring themes worth noting: the tension between moving fast with AI tools and maintaining the rigor needed for reliable outcomes, the growing importance of governance and security as these systems scale, and the ongoing evolution of how teams should actually work when agents are in the loop. Thanks for listening, and I'll see you next time.