
Hebrew Coreference Resolution System

Python 3.8+ License: MIT

A comprehensive toolkit for Hebrew coreference resolution that integrates multiple components into a unified pipeline. This system provides mention detection, web-based annotation, neural coreference models, and LLM evaluation capabilities.

🚀 Features

Core Components

  • Mention Detection: Advanced NP chunking with multiple parser backends (Stanza, Trankit, Gold)
  • Web Annotation: Interactive web interface for coreference and NP-relation annotation
  • Neural Models: State-of-the-art neural coreference resolution (LingMess-Coref, WL-Coref)
  • LLM Evaluation: Comprehensive evaluation of Large Language Models for coreference tasks

Key Capabilities

  • ✅ Multi-Parser Support: Stanza, Trankit, and Gold standard parsing
  • ✅ Interactive Annotation: Web-based interface for manual annotation
  • ✅ Neural Training: End-to-end neural model training and evaluation
  • ✅ LLM Integration: Zero-shot and few-shot evaluation of LLMs
  • ✅ Comprehensive Evaluation: MUC, B³, and CEAF metrics
  • ✅ Production Ready: Clean CLI interface and modular architecture
  • ✅ Extensible: Easy to add new models and components
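
The MUC, B³, and CEAF metrics mentioned above are standard cluster-overlap scores. As a rough illustration only (a minimal sketch, not this repository's evaluator), B³ averages a per-mention precision and recall over the overlap between each mention's gold and predicted cluster:

```python
def b_cubed(gold_clusters, pred_clusters):
    """Minimal B-cubed precision/recall/F1 over mention clusters.

    Clusters are collections of hashable mention ids, e.g. (start, end)
    spans. Assumes both sides cover the same mentions, for simplicity.
    """
    gold_of = {m: frozenset(c) for c in gold_clusters for m in c}
    pred_of = {m: frozenset(c) for c in pred_clusters for m in c}
    mentions = set(gold_of) & set(pred_of)

    # Per-mention overlap ratios, averaged over all mentions.
    prec = sum(len(gold_of[m] & pred_of[m]) / len(pred_of[m]) for m in mentions)
    rec = sum(len(gold_of[m] & pred_of[m]) / len(gold_of[m]) for m in mentions)
    p = prec / len(mentions) if mentions else 0.0
    r = rec / len(mentions) if mentions else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For real evaluation, use the CorefEvaluator shown in the Examples section below.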

📋 Table of Contents

  • Features
  • Installation
  • Quick Start
  • System Architecture
  • Usage
  • Configuration
  • Examples
  • Testing
  • Error Analysis
  • Contributing
  • License
  • Support

πŸ› οΈ Installation

Prerequisites

  • Python 3.8+
  • Git

Setup

# Clone the repository
git clone https://github.com/shaked571/hebrew_coreference.git
cd hebrew_coreference

# Clone the data submodule
git submodule update --init --recursive

# Install dependencies
pip install -r requirements.txt

🚀 Quick Start

1. Mention Detection

# Run NP chunking with Stanza parser
python src/mention_detection/stanza_parser/stanza_chunker.py

# Run NP chunking with Trankit parser
python src/mention_detection/trankit_parser/trankit2spacy.py

2. Neural Coreference

# Train LingMess-Coref model
cd src/neural_models/neural_coref/src/lingmess-coref
python training.py

# Evaluate model
python eval.py

3. LLM Evaluation

# Compare LLM outputs
cd error_analysis/llm_comparison
python compare_multi_system_outputs.py --approaches raw gold_tokenized sota_tokenized

4. Web Annotation

# Start annotation server
cd src/annotation/tne_ui
python annotationServer.py

πŸ—οΈ System Architecture

hebrew_coreference/
├── src/
│   ├── mention_detection/         # NP chunking and parsing
│   ├── neural_models/             # Neural coreference models
│   ├── llm_evaluation/            # LLM evaluation framework
│   └── annotation/                # Web-based annotation tools
├── error_analysis/                # Error analysis and comparison tools
├── statistics/                    # Statistical analysis tools
├── tests/                         # Test suite
├── data/                          # Data submodule (separate repo)
└── scripts/                       # Utility scripts

📖 Usage

Mention Detection

The system supports multiple parsing backends for robust mention detection:

  • Stanza: Stanford's NLP toolkit with Hebrew support
  • Trankit: Unified multilingual NLP toolkit
  • Gold: Manual annotation for high-quality reference
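
Whatever the backend, NP chunking reduces to grouping contiguous nominal tokens into candidate mention spans. The snippet below is an illustrative simplification over UPOS tags, not the repository's chunkers (their entry points are listed under Quick Start):

```python
# Simplified NP grouping over (token, UPOS) pairs. Illustrative only:
# the real chunkers in src/mention_detection/ work on full parser output.
NOMINAL = {"NOUN", "PROPN", "PRON", "NUM", "ADJ", "DET"}

def naive_np_spans(tagged_tokens):
    """Return (start, end) index spans of maximal runs of nominal tags."""
    spans, start = [], None
    for i, (_, upos) in enumerate(tagged_tokens):
        if upos in NOMINAL and start is None:
            start = i                       # open a new span
        elif upos not in NOMINAL and start is not None:
            spans.append((start, i - 1))    # close the current span
            start = None
    if start is not None:
        spans.append((start, len(tagged_tokens) - 1))
    return spans
```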

Neural Models

Two state-of-the-art neural architectures:

  • LingMess-Coref: End-to-end neural coreference resolution
  • WL-Coref: Span-based coreference with BERT encoding

LLM Evaluation

Comprehensive evaluation framework for Large Language Models:

  • Raw Tokenization: Direct LLM output evaluation
  • Gold Tokenization: Aligned with reference tokenization
  • SOTA Tokenization: State-of-the-art tokenization alignment
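
Both the gold and SOTA modes require mapping free-form LLM output spans back onto a reference tokenization. A common way to do this, sketched below under the assumption of character-offset spans (this is not the repository's exact alignment code), is to recover each reference token's character offsets and collect the overlapping tokens:

```python
def align_to_reference(text, ref_tokens, llm_span):
    """Map a character span (start, end) from LLM output onto reference
    token indices, returning the indices of all overlapping tokens."""
    # Recover character offsets of each reference token by scanning the text.
    offsets, cursor = [], 0
    for tok in ref_tokens:
        start = text.index(tok, cursor)
        offsets.append((start, start + len(tok)))
        cursor = start + len(tok)

    # A token overlaps the span iff their character ranges intersect.
    s, e = llm_span
    return [i for i, (ts, te) in enumerate(offsets) if ts < e and s < te]
```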

Web Annotation

Interactive annotation interface for:

  • Coreference annotation
  • NP-relation annotation
  • Mention boundary correction
  • Quality control and validation

βš™οΈ Configuration

Environment Variables

export PYTHONPATH="${PYTHONPATH}:$(pwd)/src"
export CUDA_VISIBLE_DEVICES=0  # For GPU training

Model Configuration

Models can be configured through YAML files in their respective directories:

  • src/neural_models/neural_coref/src/lingmess-coref/config.yaml
  • src/neural_models/neural_coref/src/wl-coref/config.toml

📚 Examples

Basic NP Chunking

from src.mention_detection.np_chunker.chunker import NPChunker

chunker = NPChunker()
chunks = chunker.chunk_text("הטקסט בעברית עם שמות עצם")  # "The Hebrew text with nouns"
print(chunks)

Coreference Evaluation

from src.neural_models.neural_coref.src.lingmess_coref.metrics import CorefEvaluator

evaluator = CorefEvaluator()
scores = evaluator.evaluate(predictions, gold)
print(f"MUC: {scores['muc']}, B³: {scores['b3']}, CEAF: {scores['ceaf']}")
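
For intuition about what the MUC number reports, here is a minimal link-based sketch (a simplification, not the CorefEvaluator used above): recall counts, per gold cluster, how many coreference links survive when the cluster is partitioned by the predicted clusters; precision is the same computation with the roles swapped.

```python
def muc(gold_clusters, pred_clusters):
    """Minimal MUC precision/recall; link-based, ignores singletons by design."""
    def link_recall(keys, responses):
        num = den = 0
        for c in keys:
            # Partition the key cluster by response clusters; mentions absent
            # from every response cluster each form their own part.
            parts = set()
            for m in c:
                owner = next((i for i, r in enumerate(responses) if m in r), None)
                parts.add(owner if owner is not None else ("singleton", m))
            num += len(c) - len(parts)
            den += len(c) - 1
        return num / den if den else 0.0

    r = link_recall(gold_clusters, pred_clusters)
    p = link_recall(pred_clusters, gold_clusters)
    return p, r
```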

LLM Comparison

from error_analysis.llm_comparison.compare_multi_system_outputs import MultiSystemComparisonRunner

runner = MultiSystemComparisonRunner()
results = runner.run_comprehensive_analysis()

🧪 Testing

Run the comprehensive test suite:

# Run all tests
python -m pytest tests/

# Run specific test categories
python -m pytest tests/test_mention_detection.py
python -m pytest tests/test_neural_models.py
python -m pytest tests/test_llm_evaluation.py

📊 Error Analysis

The system includes comprehensive error analysis tools:

  • Cluster-level Analysis: Detailed breakdown of correct/missed/extra clusters
  • Multi-system Comparison: Compare different approaches side-by-side
  • Visualization: Generate charts and graphs for analysis
  • Statistical Testing: Significance testing for performance differences
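
Significance testing of this kind is commonly done with paired bootstrap resampling over per-document scores. The following is a generic self-contained sketch, not necessarily the exact procedure used by these tools:

```python
import random

def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Fraction of document-level resamples in which system A does NOT beat
    system B on average; a small value suggests A's advantage is robust."""
    rng = random.Random(seed)  # seeded for reproducibility
    n = len(scores_a)
    losses = 0
    for _ in range(n_resamples):
        # Resample document indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(scores_a[i] - scores_b[i] for i in idx) <= 0:
            losses += 1
    return losses / n_resamples
```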

🤝 Contributing

We welcome contributions! Please see our contributing guidelines:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Hebrew NLP community for language resources
  • Stanford NLP Group for Stanza
  • University of Oregon NLP Group for Trankit
  • Contributors to LingMess-Coref and WL-Coref

📞 Support

For questions and support:

  • Open an issue on GitHub
  • Check the documentation in each component directory
  • Review the error analysis outputs for troubleshooting

Note: This system is designed specifically for Hebrew text and includes Hebrew-specific optimizations and linguistic features.
