They say in the Lean Zulip thread that this is actually intentionally a "low quality" release (https://leanprover.zulipchat.com/#narrow/channel/583336-Auto...); the paper notes that the quality is "inferior to that of expert-written Lean code". Then again, "Our results suggest that formalizing the core textbook infrastructure of modern mathematics is within reach".
mccoyb 22 hours ago [-]
In summary, we present a system which actually doesn’t work because real experts looked through the thing our system produced and found that it was full of reward hacking, but of course we’re honest about this and we’ll adjust our metrics, and so we think it’s okay to just slop this out there because we’re honest and we will make promises to fix everything, and we also hope you like our attempt to gain notoriety for Meta AI.
ubercore 21 hours ago [-]
Didn't read the article, sorry, but this reminds me we _really_ need more cool acronyms back in our industry.
measurablefunc 23 hours ago [-]
> However, our system relies on frontier LLMs, whose compute costs still remain nontrivial. Moreover, each textbook was formalized in isolation, without the careful planning needed to make it maximally compatible with existing mathlib infrastructure. Human involvement currently remains necessary to handle such organisational concerns — selecting and ordering books in a dependency-aware manner, bridging mismatched conventions across sources, etc
Rendered at 17:22:34 GMT+0000 (Coordinated Universal Time) with Vercel.
They say in the Lean Zulip thread that this is actually intentionally a "low quality" release (https://leanprover.zulipchat.com/#narrow/channel/583336-Auto...); the paper notes that the quality is "inferior to that of expert-written Lean code". Then again, "Our results suggest that formalizing the core textbook infrastructure of modern mathematics is within reach".