End-to-End Test-Time Training: Making Long Context Work Without the Memory Tax
2026-01-15 • 16 min read
#llm • #long context • #test-time training • #machine learning • #transformers • #inference optimization
How TTT-E2E achieves constant inference latency regardless of context length by treating long context as a learning problem rather than an architecture problem.

Why Your LLM Only Uses 10-20% of Its Context Window (And How TITANS Fixes It)
2025-12-08 • 15 min read
#ai • #machine learning • #transformers • #memory architectures • #long context • #titans • #miras • #neural networks
GPT-4's 128K context window? It only uses about 10% effectively. Google's TITANS architecture introduces test-time memory learning that outperforms GPT-4 on long-context tasks with 70x fewer parameters.