← Back to Home

Research Reports

Technical reports on AI, mathematics, and formal verification

ArXiv Math Semantic Search

January 12, 2026 • Tool Announcement

AI-powered semantic search and Q&A over 700,000+ arXiv mathematics papers. Ask natural language questions, search by concept, explore author connections. Free MVP available at igor.ngrok.pro.

Live MVP Free Semantic Search RAG

Polya-Szego Evaluation: LLM Reasoning vs Formal Verification

January 2, 2026 • Research Report

Comprehensive evaluation of frontier LLMs on 320 problems from Polya-Szego's "Problems and Theorems in Analysis I" (1925). Key finding: 95.4% informal reasoning accuracy but only 2.8% complete Lean4 proofs. The 30x gap reveals that formal verification, not understanding, is the bottleneck in AI mathematics.

Lean4 DeepSeek-Prover Mathematical AI Formal Verification Benchmarking