#LLM

6 posts tagged with "LLM"

End-to-End Test-Time Training: Making Long Context Work Without the Memory Tax

2026-01-15T11:00:00Z•16 min read

#llm #long context #test-time training #machine learning #transformers #inference optimization

How TTT-E2E achieves constant inference latency regardless of context length by treating long context as a learning problem rather than an architecture problem.

Code World Models: Teaching LLMs to Simulate Execution

2025-12-18T21:00:00Z•15 min read

#code generation #world models #llm #agentic ai #software engineering #meta ai

How Meta's Code World Model applies the "dreaming car" insight from robotics to software engineering, achieving 65.8% on SWE-bench Verified by training on execution traces rather than static code.

The Stability-Plasticity Dilemma: How Memory Architectures Are Solving Continual Learning

2025-12-09T10:00:00+00:00•18 min read

#continual-learning #catastrophic-forgetting #memory-layers #machine-learning #llm

A comparative analysis of approaches to catastrophic forgetting in language models, from parameter regularization to sparse memory architectures that reduce forgetting from 89% to just 11%.

Mixture of Experts: The Efficiency Trick Behind Modern AI

2025-12-08•7 min read

#machine learning #moe #efficiency #llm #architecture #mixtral #deepseek #neural networks

Mixtral has 46.7B parameters but activates only about 13B per token. This architectural trick, called Mixture of Experts, powers Gemini 1.5, DeepSeek V3, and more. Learn how MoE works, its hidden costs, and when to use it.

Skills vs Slash Commands: A Developer's Guide to Claude Code Agents and Tools

2024-10-26T20:52:18Z•9 min read

#claude-code #ai-agents #anthropic #developer-tools #llm #agentic-systems

Understand the key differences between Skills and Slash Commands in Claude Code: when to use each and how they work together. Learn which abstraction fits your workflow with practical examples and decision frameworks.

Meta MCP: When AI Builds Its Own Bridge to the World

2024-07-08T12:54:48+01:00•3 min read

#ai #llm #software architecture #meta #agent #future tech

Discover Meta MCP: LLMs that generate both a UI and backend servers, so they can interact persistently beyond simple request-response cycles.