In the following discussion we use
- LEAN represented in Python: hand-written examples
- Automation rules: what can be a human-readable, machine interpretable, automated way of constructing such representation?
Main difficulties:
- Functional Language vs Procedural Language
- Annotations/Comments for readability
- How to preserve faithfulness?
- Specialized tokenizer for math language / change of LEAN writing style?
- Eval: is there a way to crosscheck / calculate relatedness?
- Breaks down: do for entire input seq, or first separate each component?
- Is it even representable?
-
$N \rightarrow PR$ : faithful and sound mapping: utilize code-LLM -
$PR \rightarrow L$ : utilize IR and one-on-one correspondence(not necessarily LLM).
Main difficulties:
- faithful and sound: hard to restrict, how to restrict?
- co-design of IR and transformation
- LLM guided IR
- best-case-scenario: Python interpreter of LEAN
- bad writing habit of LEAN library
- Human-in-the-loop update for
idea:
- Extract LEAN compiler output after formalization
- preprocessor: interpret the error message / informalization?
- preprocessor': goal extraction
- preprocessor: RAG-like search for similar examples
- feedback strategy: CoT-like prompting? Direct modification? Code-refactor-like prompting?
- update strategy: global feedback-update vs local feedback-update?
Specify related work and reproduce their result. Executable script/steps should be put in /reproduce.
-
Create issue / comment existing issue / assign the issue to yourself before attempting a listed task.
-
Create a new branch with a descriptive name for the task.
-
Files should be well commented and documented.
-
Leave the finishing date when marking a task as done.