McKinsey has rolled out AI tools to over 40,000 consultants, automating research synthesis and first-draft writing. Productivity benchmarks show 20-30% time savings on research tasks, though client judgment and relationship work remain firmly human. Worth reading for the honest assessment of what automation can and cannot replace.
Google's official breakdown of what changed in the Gemini 2.0 family — Flash goes generally available with higher rate limits, Flash-Lite arrives as the most cost-efficient option yet, and Pro Experimental targets complex coding tasks. Useful for understanding how model tiers are designed and what "generally available" signals about production readiness versus experimental releases.
MIT Sloan's summary of the landmark BCG/Harvard field experiment on AI and knowledge worker productivity. Consultants using AI finished 12% more tasks, 25% faster, and produced 40% higher quality output. The nuance — AI helped most on tasks inside its capability frontier, and hurt performance on tasks outside it. Practical implications for how to decide which work to delegate to AI.
Brookings maps where EU and US AI regulation is converging — transparency requirements, risk tiering, liability frameworks — and where it's diverging, particularly on enforcement mechanisms and foundational model obligations. Essential reading for any organization operating across both jurisdictions as compliance timelines approach.
A thorough technical explainer on Mixture of Experts (MoE) architectures — the design powering Mixtral, DeepSeek, and Gemini 1.5. Covers how sparse routing enables larger parameter counts without proportional compute costs, load balancing challenges, and why MoE models behave differently from dense models at inference time. Assumes comfort with transformer fundamentals.
#moe #architecture #llm
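The core trick the explainer covers, sparse top-k routing, can be sketched in a few lines. This is an illustrative toy (all names, shapes, and the softmax-over-top-k choice are assumptions for the sketch, not any production model's code): a gate scores every expert, but only the k highest-scoring experts actually run, so per-token compute scales with k rather than with the total expert count.

```python
import numpy as np

def topk_route(x, gate_w, expert_ws, k=2):
    """Route input x to the top-k experts chosen by a linear gate."""
    logits = x @ gate_w                    # one score per expert
    topk = np.argsort(logits)[-k:]         # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()               # softmax over the selected k only
    # Only k expert matmuls execute; the other experts' parameters
    # exist but cost nothing for this token.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = topk_route(x, gate_w, experts, k=2)    # 2 of 16 experts run
```

Note how total parameters grow with `n_experts` while compute grows only with `k`; the load-balancing problem the article discusses arises because nothing here stops the gate from sending most tokens to the same few experts.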
Going Deeper
Optional reads for those who want more. (Some may be behind a paywall.)