Research Reports | Igor Rivin

ArXiv Math Semantic Search

January 12, 2026 • Tool Announcement

AI-powered semantic search and Q&A over 700,000+ arXiv mathematics papers. Ask natural-language questions, search by concept, and explore author connections. A free MVP is available at igor.ngrok.pro.

Live MVP • Free • Semantic Search • RAG

Polya-Szego + Aristotle: From 2.8% to 100% Verified Proofs

January 12, 2026 • Research Update

Follow-up to our initial benchmark: we use Aristotle to automatically verify LLM-generated Lean proofs. All 80 submissions verified successfully (80/80, a 100% success rate). Aristotle corrects formalization errors and synthesizes complete, machine-checkable proofs. Data release: github.com/igorrivin/polya-szego-lean.

Aristotle • 100% Verified • Lean4 • Automated Proving

Polya-Szego Evaluation: LLM Reasoning vs Formal Verification

January 2, 2026 • Research Report

Comprehensive evaluation of frontier LLMs on 320 problems from Polya-Szego's "Problems and Theorems in Analysis I" (1925). Key finding: informal reasoning accuracy reaches 95.4%, but only 2.8% of attempts yield complete Lean4 proofs. The roughly 30x gap suggests that formal verification, not understanding, is the bottleneck in AI mathematics.

Lean4 • DeepSeek-Prover • Mathematical AI • Formal Verification • Benchmarking
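
For readers who want to check the headline gap in the Polya-Szego evaluation, the arithmetic can be reproduced in a few lines. This is a minimal sketch that uses only the two percentages quoted in the summary; the function name `gap_ratio` and the variable names are illustrative and do not come from the report or its data release.

```python
# Figures quoted in the report summary (percentages, not raw counts).
informal_accuracy_pct = 95.4  # informal reasoning accuracy
formal_proof_pct = 2.8        # complete Lean4 proofs


def gap_ratio(informal_pct: float, formal_pct: float) -> float:
    """Return how many times larger the informal rate is than the formal rate."""
    if formal_pct <= 0:
        raise ValueError("formal_pct must be positive")
    return informal_pct / formal_pct


ratio = gap_ratio(informal_accuracy_pct, formal_proof_pct)
print(f"informal / formal = {ratio:.1f}x")  # prints "informal / formal = 34.1x"
```

The exact ratio is about 34, which the summary quotes in round terms as 30x.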