All Posts
73 posts about AI, language models, and software engineering
Making Science Machine-Readable: The Epistemological Challenge of Verifying Knowledge at Scale
How do you verify scientific knowledge when there are 2.9 million papers on arXiv alone, with thousands more added every day? A new paper extracts nearly two million claims from 16,087 manuscripts and compares machine evaluation to human peer review, with 81% agreement.
When AI Writes the Code, Verification Becomes the Job
Over 80% of developers now use AI assistants for code generation, yet at least 62% of AI-generated code contains vulnerabilities. As AI writes code faster than humans can review it, the engineer's primary job shifts from writing code to verifying it through formal methods.
The Historical Accident That Split Drug Design in Two (And the Contrastive Model That Reunites It)
Structure-based and ligand-based drug design evolved as separate fields solving the same problem. ConGLUDe, a contrastive geometric learning model, unifies both approaches and outperforms specialist methods on realistic benchmarks without requiring pre-defined binding pockets.
How AlphaGenome Models Gene Regulation: 2D Embeddings, Splicing, and the Race to Read Non-Coding DNA
A technical look at AlphaGenome's architecture, its 2D pairwise embeddings for splicing prediction, and what the model means for clinical variant interpretation.
EDEN: 28 Billion Parameters for Programming Biology
Basecamp Research's EDEN model trains on proprietary environmental metagenomics to design gene-insertion enzymes, antimicrobial peptides, and synthetic microbiomes, all validated in the wet lab.
A Bioinformatician's Guide to Choosing Genomic Foundation Models
A practical guide to selecting genomic foundation models for bioinformatics tasks. Covers ESM-2, DNABERT-2, HyenaDNA, Nucleotide Transformer, scGPT, and Evo with specific recommendations for DNA sequence analysis, protein structure prediction, and single-cell analysis based on hardware requirements, inference speed, and task type.
When 62 Days of Compute Becomes 3: Diffusion Models as Fast Surrogates for Agent-Based Biological Simulations
How generative diffusion models can serve as fast surrogates for expensive biological simulations, achieving 22x speedup while preserving the stochastic diversity that makes these models scientifically useful.
When the Algorithm Can't Explain Itself: ML Interpretability in Precision Oncology
Machine learning models now outperform FDA-approved biomarkers in predicting treatment response, but the best-performing models often resist explanation. Here's how precision oncology is navigating the trade-off between performance and interpretability.
Project Silicon: What If We Could Do Gradient Descent on Assembly Code?
A deep dive into Project Silicon's proposal to build differentiable CPU simulators, enabling gradient-based optimization of assembly code and opening a new frontier in neural algorithm synthesis.
Biological World Models: The Projects You're Not Building (But Should Be)
Why computational biologists should stop building embeddings and start building simulators, with three tractable project ideas you can implement today using flow matching, Neural ODEs, and cell fate trajectory modeling.
Benchmarks vs RL Environments: Why the Distinction Actually Matters
Understanding when you're working with an environment versus a benchmark changes how you design experiments, interpret results, and communicate findings. This guide covers the practical differences every RL practitioner should know.
Why 1000-Layer Networks Finally Work for Reinforcement Learning
Recent research shows 1024-layer networks achieve 2x to 50x improvements in goal-conditioned RL. Here's why extreme depth works now, and when you should consider it for your own agents.
DiscoRL: When Algorithms Learn to Design Algorithms
DeepMind's DiscoRL discovers reinforcement learning algorithms that outperform hand-designed methods like PPO and DQN. By treating algorithm design as a meta-learning problem, it found alternatives to value functions and bootstrapping through optimization alone.
Do LLMs Construct World Models? A Cognitive Science Investigation
Are large language models merely stochastic parrots, or do they develop genuine internal representations of the world? This investigation examines evidence from Othello-GPT, spatial encoding in LLMs, and the symbol grounding problem to explore what cognitive science reveals about AI understanding.
Tensor Logic: One Equation to Rule Them All
Pedro Domingos proposes that neural networks and symbolic AI are the same mathematical operation: a logical rule can be equivalently written as a tensor equation in Einstein summation notation. If true, we've been building separate tools for problems that share identical structure.
When Machines Design Their Own Learning Algorithms
A machine trained on simple grid worlds beat every hand-designed RL algorithm on Atari. DeepMind's DiscoRL discovers algorithms through meta-learning that outperform DQN, PPO, and A3C, methods humans spent decades developing.
Why Your LLM Only Uses 10-20% of Its Context Window (And How TITANS Fixes It)
GPT-4's 128K context window? The model uses only about 10% of it effectively. Google's TITANS architecture introduces test-time memory learning that outperforms GPT-4 on long-context tasks with 70x fewer parameters.
Biology's Secret Weapon: Physics-Based Benchmarks for Training RL Agents
Why biological systems offer the ideal training ground for reinforcement learning: automated verification through physics, not human judgment. From protein design with AlphaFold to RNA folding with ViennaRNA, biology provides the verifiable inverse problems that RL needs at scale.
What Are World Models? The AI Architecture That Learns to Dream
World models enable AI agents to imagine futures and plan actions, achieving 10-100x better sample efficiency than traditional reinforcement learning. From DreamerV3 collecting diamonds in Minecraft to foundation models like Sora and Genie, world models represent AI's shift from pattern matching to simulating reality itself.
AI Designed Two New Antibiotics From Scratch. Here's Why That Changes Everything
MIT researchers used generative AI to create novel antibiotics from scratch, designing them rather than discovering them. The breakthrough matters less for what it creates than for what it proves.
Nested Learning: How Your Neural Network Already Learns at Multiple Timescales
A comprehensive guide to "Nested Learning: The Illusion of Deep Learning Architectures," the arXiv paper revealing how neural networks learn at multiple timescales through hierarchical optimization.
Kosmos: What a 12-Hour AI Research Session Actually Produces
An AI research system maintains coherent reasoning across 200+ agent steps over 12 hours, generating 42,000 lines of code while reviewing 1,500 papers. Kosmos achieves 79.4% accuracy through structured world models and parallel agents, but verification remains humanity's bottleneck.
Why Foundation Models in Pathology Are Failing: And What Comes Next
Why does it feel like our tools weren't designed by pathologists? Billions have been poured into AI models that compress whole slide images into tiny vectors, ignoring how pathologists actually examine tissue. The evidence reveals why scaling won't fix this disconnect.
How an AI System Independently Discovered a New Bacterial Survival Strategy (And Got It Right)
An AI system at Google DeepMind discovered how bacteria share genes across species barriers using 7 days of computational reasoning. When tested, it matched unpublished experimental observations exactly.
Foundation Models Are Rewriting the Rules of Biology
Foundation models trained on biological data are transforming protein structure prediction, genomics, drug discovery, and pathology. Learn how machine learning benchmarks in 2024 are revealing biology's dark matter through RNA analysis and metagenomic discovery.
Anthropic's Entry Into Life Sciences: A Platform Play, Not Just a Model
Why does it feel like our tools weren't designed by pathologists or researchers? Anthropic's Claude for Life Sciences attempts to bridge this gap by embedding AI directly into existing scientific workflows through Connectors and Agent Skills, building an operating system for R&D that actually respects how scientists work.
From Structure to Function: Leveraging AlphaFold's Evoformer Embeddings for Downstream AI
AlphaFold solved protein folding, but its hidden embeddings may be even more valuable, powering everything from drug design to disease prediction.
The Geometry of Grammar: An Investigation into Interpretable Dimensions in Word Embeddings
Does a simple 'tense axis' exist in word embeddings? Explore the geometric reality of grammatical structure in AI language models, from Word2Vec's elegant linear analogies to BERT's complex contextual transformations.
Teaching AI to Keep Buildings Standing: Reinforcement Learning and Physics-Informed Design
Exploring how Reinforcement Learning (RL) combined with Physics-Informed Machine Learning (PIML) can teach AI to design structurally sound and resilient buildings by learning from simulated physical environments.