PDGGK/LLM-Research-Internship-2025

🚀 LLM Research Internship 2025

2025 AI Algorithm Internship: Multi-Table Join Reasoning for Text-to-SQL over Large-Scale Schemas

Research Focus: Multi-table Join Reasoning for Text-to-SQL in Large-Scale Schemas


📋 Overview

This repository documents my work as an AI Algorithm Intern (Dec 2024 - Present), focusing on Text-to-SQL research over large-scale database schemas.

🎯 Research Direction

  • Problem: LLM accuracy drops from 86% to 5% once a database exceeds 100 tables (the "Scale Wall" problem)
  • Focus: Multi-table join reasoning, Schema Linking optimization
  • Methods: Graph algorithms, semantic retrieval enhancement, query rewriting
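To make the multi-table join problem concrete, here is a minimal sketch of how a schema can be treated as a graph and searched for a join path. The table names and the `FK_GRAPH` structure are hypothetical and are not taken from the internship database; breadth-first search is just one simple way to find the shortest chain of joins between two tables.

```python
from collections import deque

# Hypothetical foreign-key graph: each key is a table, each value lists the
# tables it can be joined to directly. Names are illustrative only.
FK_GRAPH = {
    "orders": ["customers", "order_items"],
    "customers": ["orders", "regions"],
    "order_items": ["orders", "products"],
    "products": ["order_items", "suppliers"],
    "regions": ["customers"],
    "suppliers": ["products"],
}


def shortest_join_path(graph, start, goal):
    """Return the shortest chain of tables linking start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        table = path[-1]
        if table == goal:
            return path
        for neighbor in graph[table]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_join_path(FK_GRAPH, "customers", "suppliers"))
```

The intermediate tables on the returned path are the "bridge" tables a generated SQL query would need to join through, even though the user's question never mentions them.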

📁 Repository Structure

LLM-Research-Internship-2025/
├── README.md                                    # This file
├── 实习日志_第一周_12月2日至12月11日.md          # Weekly log (Chinese)
└── Internship_Log_Week1_Dec2_to_Dec11.md        # Weekly log (English)

📚 Papers Studied

| Paper | Venue | Core Contribution |
| --- | --- | --- |
| SteinerSQL | arXiv 2509.19623 | Schema Linking as a Steiner Tree problem, 40.04% SOTA |
| LinkAlign | arXiv 2503.18596 | Multi-round semantic enhanced retrieval |
| UNJOIN | arXiv 2505.18122 | Schema simplification via virtual wide table |
| SchemaGraphSQL | arXiv 2505.18363 | Pathfinding graph algorithms for Schema Linking |
| CHESS | arXiv 2405.16755 | Multi-agent framework for Text-to-SQL |
| Multi-hop Reasoning | arXiv 2405.09593 | LLM-based multi-hop reasoning |
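The Steiner Tree framing listed above can be illustrated with a small sketch: given the tables a question needs (the terminals), find a small connected set of tables that links them all. The code below is a generic greedy heuristic on an unweighted graph, not the algorithm from the SteinerSQL paper, and it is not guaranteed to be optimal. The schema and the `connect_tables` function are hypothetical.

```python
from collections import deque

# Hypothetical join graph; "audit_log" is deliberately isolated.
SCHEMA = {
    "customers": ["orders", "regions", "reviews"],
    "orders": ["customers", "order_items"],
    "order_items": ["orders", "products"],
    "products": ["order_items", "suppliers", "warehouses"],
    "regions": ["customers"],
    "reviews": ["customers"],
    "suppliers": ["products"],
    "warehouses": ["products"],
    "audit_log": [],
}


def connect_tables(graph, terminals):
    """Greedy Steiner-tree heuristic: attach the nearest required table each round.

    Returns the set of tables in the connecting tree, or None if some
    required table cannot be reached.
    """
    terminals = list(terminals)
    if not terminals:
        return set()
    tree = {terminals[0]}
    remaining = set(terminals[1:]) - tree
    while remaining:
        # Multi-source BFS starting from every table already in the tree.
        parent = {node: None for node in tree}
        queue = deque(sorted(tree))
        found = None
        while queue:
            node = queue.popleft()
            if node in remaining:
                found = node
                break
            for neighbor in graph.get(node, []):
                if neighbor not in parent:
                    parent[neighbor] = node
                    queue.append(neighbor)
        if found is None:
            return None
        # Walk back toward the tree, adding the intermediate bridge tables.
        while found is not None:
            tree.add(found)
            found = parent[found]
        remaining -= tree
    return tree


print(sorted(connect_tables(SCHEMA, ["regions", "suppliers"])))
```

Tables such as `reviews` and `warehouses` stay out of the result because no required table depends on them, which is the pruning effect that makes this framing attractive for large schemas.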

🛠️ Technical Stack

  • LLM: Qwen3-14B-awq (Local Deployment)
  • Embedding: BGE-large-en-v1.5
  • Framework: LlamaIndex, sentence-transformers
  • Database: MySQL (344 tables, 3353 columns)
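With hundreds of tables, a common first step is to retrieve only the most relevant tables by embedding similarity before prompting the LLM. The sketch below shows that ranking step with plain Python and hand-written toy vectors. In a real pipeline the vectors would come from an embedding model such as the one listed above; the table names, vectors, and `top_k_tables` helper are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_tables(query_vec, table_vecs, k=2):
    """Return the names of the k tables most similar to the query vector."""
    ranked = sorted(
        table_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; real ones would have many more dimensions.
TABLE_VECS = {
    "orders": [0.9, 0.1, 0.0],
    "customers": [0.7, 0.6, 0.1],
    "suppliers": [0.0, 0.2, 0.9],
}
QUERY_VEC = [1.0, 0.2, 0.0]

print(top_k_tables(QUERY_VEC, TABLE_VECS, k=2))
```

Only the retrieved tables would then be serialized into the prompt, which keeps the schema context small even when the database has hundreds of tables.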

📝 Weekly Logs

Week 1 (Dec 2-11, 2024)

  • 📖 Paper survey on Text-to-SQL multi-table join reasoning
  • 🎤 Team presentation on research findings
  • 🔧 LinkAlign implementation and optimization
  • ✅ Query expansion for Chinese-English mixed retrieval
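One simple way to approach query expansion for mixed Chinese-English retrieval is a glossary lookup: Chinese business terms found in the question are supplemented with English keywords that are likely to appear in table and column names. The sketch below assumes this glossary approach; the log entry above does not specify the actual method, and `GLOSSARY` and `expand_query` are hypothetical names.

```python
# Hypothetical glossary mapping Chinese terms to English schema keywords.
GLOSSARY = {
    "订单": ["order", "orders"],        # "order"
    "客户": ["customer", "client"],     # "customer"
    "供应商": ["supplier", "vendor"],   # "supplier"
}


def expand_query(query, glossary):
    """Append English keywords for every glossary term found in the query."""
    extra = []
    for term, translations in glossary.items():
        if term in query:
            for word in translations:
                if word not in extra:
                    extra.append(word)
    if not extra:
        return query
    return query + " " + " ".join(extra)


print(expand_query("查询每个客户的订单 total amount", GLOSSARY))
```

The expanded string can then be embedded in place of the original question, so that an English-only embedding model still sees terms that overlap with the schema.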

💡 Key Insights

"Academia and industry are different—papers look elegant, but the real challenges begin when you try to implement them."

  • Graph theory concepts (learned at university) apply directly to AI algorithm implementation
  • AI assistance makes paper reading and debugging much more accessible
  • The gap between academic benchmarks and real-world business scenarios requires creative engineering solutions

📞 Contact

Feel free to reach out for discussions on Text-to-SQL research or AI internship experiences!


Last Updated: December 2024
