Strict-LLM: 2-step Autoformalization

Roadmap

In the following discussion we use $N$ to denote natural language space, $PR$ to denote the space of Python pseudo code for representing LEAN, and $L$ the space of LEAN statement.

Representation: constructing $PR$

LEAN represented in Python: hand-written examples
Automation rules: what can be a human-readable, machine interpretable, automated way of constructing such representation?

Main difficulties:

- Functional Language vs Procedural Language
- Annotations/Comments for readability
- How to preserve faithfulness?
- Specialized tokenizer for math language / change of LEAN writing style?
- Eval: is there a way to crosscheck / calculate relatedness?
- Breaks down: do for entire input seq, or first separate each component?
- Is it even representable?

Transformation: $N \rightarrow PR \rightarrow L$

$N \rightarrow PR$: faithful and sound mapping: utilize code-LLM
$PR \rightarrow L$: utilize IR and one-on-one correspondence(not necessarily LLM).

Main difficulties: - faithful and sound: hard to restrict, how to restrict? - co-design of IR and transformation - LLM guided IR - best-case-scenario: Python interpreter of LEAN - bad writing habit of LEAN library - Human-in-the-loop update for $N \rightarrow PR$ training - Accuracy of $PR \rightarrow L$

Feedback-Update

idea:

- Extract LEAN compiler output after formalization
- preprocessor: interpret the error message / informalization?
- preprocessor': goal extraction
- preprocessor: RAG-like search for similar examples
- feedback strategy: CoT-like prompting? Direct modification? Code-refactor-like prompting?
- update strategy: global feedback-update vs local feedback-update?

Related Work

Specify related work and reproduce their result. Executable script/steps should be put in /reproduce.

General Recommendations

Create issue / comment existing issue / assign the issue to yourself before attempting a listed task.
Create a new branch with a descriptive name for the task.
Files should be well commented and documented.
Leave the finishing date when marking a task as done.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
RAG		RAG
__pycache__		__pycache__
datasets		datasets
mathlib4 @ 24ecc2a		mathlib4 @ 24ecc2a
mathlib_runtime		mathlib_runtime
output		output
representation		representation
test		test
test_result/stepped_test_proof_trace		test_result/stepped_test_proof_trace
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
__init__.py		__init__.py
fl_minif2f		fl_minif2f
lean_cmd_exec.py		lean_cmd_exec.py
response.json		response.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Strict-LLM: 2-step Autoformalization

Roadmap

Representation: constructing $PR$

Transformation: $N \rightarrow PR \rightarrow L$

Feedback-Update

Related Work

General Recommendations

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Strict-LLM: 2-step Autoformalization

Roadmap

Representation: constructing $PR$

Transformation: $N \rightarrow PR \rightarrow L$

Feedback-Update

Related Work

General Recommendations

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages