ZotEye

Plagiarism detection for Zotero reference libraries
Report Bug · Request Feature

About The Project

ZotEye compares a document with the documents in the local Zotero reference library, automatically identifying similar text and overlaps. It helps authors avoid article rejections due to potential plagiarism and assists editors in verifying the originality of submitted manuscripts. The ultimate goal is to enhance transparency and reliability in scientific research.

Note

All similarity analyses are performed locally. No texts, articles, plagiarism results, or reports are uploaded online.

(back to top)

Built With

Python

Main Features

  • 🚀 Graphical interface
  • 🌎 Multilingual (EN, IT)
  • 📚 PDF text extraction
  • 🔍 Automatic similarity analysis between a document and the local Zotero reference library
  • 🔪 Option to exclude sections (e.g., "references") and quoted sentences from the analysis
  • 🧠 NLP based on word n‑grams
  • 🔢 Calculation of overall and per-document similarity percentages
  • 🧾 Similarity report
  • 💾 Local cache to speed up subsequent analyses
  • 📅 Automatic updates
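
To illustrate the exclusion feature listed above, here is a minimal sketch of how excluded sections and quoted sentences could be stripped before analysis. It is not ZotEye's actual implementation: the function name `strip_excluded`, the heading list, and the simplification of dropping everything after the first excluded heading are all assumptions made for illustration.

```python
import re

# Hypothetical heading names; the real list of excludable sections is not given in this README.
SECTION_HEADING = re.compile(
    r"^\s*(references|bibliography)\s*$", re.IGNORECASE | re.MULTILINE
)
# Text enclosed in straight or curly double quotes, limited to a single line.
QUOTED = re.compile(r'"[^"\n]*"|“[^”\n]*”')


def strip_excluded(text: str) -> str:
    """Drop everything from the first excluded heading onward, then remove quoted passages."""
    match = SECTION_HEADING.search(text)
    if match:
        # Simplification: assumes the excluded section runs to the end of the document.
        text = text[: match.start()]
    return QUOTED.sub(" ", text)
```

A real implementation would need to detect where an excluded section ends, since other sections (for example, appendices) may follow it.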

(back to top)

Getting Started

Project Structure

The repository structure can be found here: tree.md

(back to top)

Prerequisites

  • Windows 11 (should also work on Windows 10)
  • Python 3.11–3.14 (Python 3.14.2-amd64 was used to build this app; higher versions were not tested)
  • Required Python libraries (see requirements.txt)
  • NSIS installer (version 3.11 with INetC plug-in was used for distributing this app when building from source code)

(back to top)

Installation

Latest Release

If you are only interested in the executable, go to releases, download, and install the latest version (ZotEye_Installer.exe).
In this case, if Python 3.11-3.14 is not found on the system, Python 3.14.2 (64-bit) and the necessary libraries will be downloaded and installed by the installer.

Warning

ZotEye is a free, open-source application. However, some browsers or security systems may show warnings when downloading or opening ZotEye_Installer.exe:

  • In MS Edge, you might see: "ZotEye_Installer.exe is not commonly downloaded. Make sure it is safe before opening it." Click the three dots in the top-right corner and select Keep, then click the three dots in the bottom-right corner of the next window and select Keep Anyway.
  • Tools such as Microsoft Defender SmartScreen may also block the installer. Right-click the file, choose Properties, click Unblock, and then OK. Once unblocked, the installer can be run normally.

Source Code

If you want to view and work on the source code:

  1. Download and install Python 3.11-3.14 from https://www.python.org/downloads/ if it is not already present. During installation, select the option "Add python.exe to PATH" for quick access to Python commands in the command prompt or PowerShell.
  2. Clone the repository
    git clone https://github.com/abonfiglio73/zoteye.git
  3. Create a Python virtual environment inside the zoteye folder
    cd zoteye
    python -m venv venv
  4. Activate the virtual environment
    .\venv\Scripts\activate
  5. Install dependencies
    pip install -r requirements.txt

(back to top)

Execution

Latest Release

After installation, open the app ZotEye.

Source Code

To run the application from the source code, use:

python .\main.py

(back to top)

Distribution

To distribute the application from the source code as an executable (.exe), run the following script:

.\install\build_installer.bat

Make sure that NSIS is installed and that the NSIS path set in the script build_installer.bat is correct:

NSIS_PATH=C:\Program Files (x86)\NSIS\makensis.exe

The script will create the installer file ZotEye_Installer.exe inside the build folder.

(back to top)

Usage

Comparison Modes

Three comparison modes are available:

  1. target document → single document
  2. target document → documents in a folder
  3. target document → local Zotero reference library

The comparison is based on word n‑gram matching.
The target document can be PDF or DOCX; if DOCX, it will be converted to PDF.
Documents to compare must be PDFs.
In the third mode, you need to specify the path to the Zotero database (zotero.sqlite). The path can be entered manually or located via an automatic search.
Once specified, the collections present in the reference library will be shown. The user may choose one or more collections.
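
As a rough illustration of the third mode, the sketch below lists the collections stored in a Zotero database. It is not taken from ZotEye's source. The function name `list_collections` is hypothetical, and the table and column names (`collections`, `collectionID`, `collectionName`) are an assumption about Zotero's SQLite schema, which may change between Zotero versions. The database is opened read-only so that the library is never modified.

```python
import sqlite3
from pathlib import Path


def list_collections(db_path: str) -> list[tuple[int, str]]:
    """Return (collectionID, collectionName) pairs, sorted by name.

    Assumes a `collections` table with these two columns (Zotero schema assumption).
    """
    # Build a file: URI and request read-only access.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT collectionID, collectionName FROM collections "
            "ORDER BY collectionName"
        ).fetchall()
    finally:
        conn.close()
    return [(int(cid), str(name)) for cid, name in rows]
```

Zotero may hold a lock on zotero.sqlite while it is running, so a query like this can fail unless Zotero is closed or the query runs against a copy of the file.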

Note

Choosing the n-gram size

ZotEye uses 4-word n-grams by default for text similarity in scientific papers. Different sizes can be chosen:

  • Small (1–2 words): detects short overlaps, may give false positives.
  • Medium (3–5 words): good balance, recommended for general use.
  • Large (6+ words): finds exact matches, best for detecting significant reuse.

Start with the default 4-word n-grams; adjust if you want more sensitivity (smaller) or specificity (larger).
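
The effect of the n-gram size can be seen in a small sketch. This is an illustration of word n-gram matching in general, not ZotEye's code: the function names `word_ngrams` and `shared_ngrams` and the simple lowercase, `\w+`-based tokenization are assumptions.

```python
import re


def word_ngrams(text: str, n: int = 4) -> set[tuple[str, ...]]:
    """Return the set of n-word sequences in text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    # If the text has fewer than n words, the range is empty and so is the set.
    return {tuple(words[i : i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(target: str, other: str, n: int = 4) -> set[tuple[str, ...]]:
    """Return the n-grams that occur in both texts."""
    return word_ngrams(target, n) & word_ngrams(other, n)


target = "the quick brown fox jumps over the lazy dog"
other = "a quick brown fox leaps over the lazy dog"
for size in (2, 4, 6):
    print(size, len(shared_ngrams(target, other, size)))
```

For these two sentences, which differ in only two words, the number of shared n-grams drops from 5 (2-word) to 1 (4-word) to 0 (6-word). Smaller sizes flag more overlaps, while larger sizes only match longer runs of identical wording.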

Local cache

To speed up subsequent analyses, ZotEye saves n‑grams locally in a database by default. The saved n‑grams depend on the selected options (n‑gram size, excluded sections and quoted sentences). Users can modify this behavior.
Database management can be done in two ways:

  • Incremental (default): All databases with n‑grams saved using the active options are loaded. Only n‑grams from new documents that are not already in the databases are stored.
  • Non‑incremental: Only one database is loaded, if it exists: the one saved with the same set of options and containing the n‑grams for the same documents. N‑grams from documents already present in other databases may still be saved again if those documents belong to a group of documents not yet indexed.

Tip

A non‑incremental approach is useful when you want to limit similarity analysis to specific folders or collections.
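
One way to picture why cached n‑grams depend on the selected options is to derive a cache identifier from those options. The sketch below is purely illustrative: the function names `options_key` and `documents_to_index`, the hashing scheme, and the use of file names as document identifiers are assumptions, not ZotEye's actual storage format.

```python
import hashlib
import json


def options_key(ngram_size: int, excluded_sections: list[str], exclude_quotes: bool) -> str:
    """Derive a short, stable identifier from the analysis options (hypothetical scheme)."""
    payload = json.dumps(
        {
            "ngram_size": ngram_size,
            # Normalize so that order and letter case do not change the key.
            "excluded_sections": sorted(s.lower() for s in excluded_sections),
            "exclude_quotes": exclude_quotes,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def documents_to_index(cached: set[str], requested: set[str]) -> set[str]:
    """Incremental mode: only documents not already cached need n-gram extraction."""
    return requested - cached
```

With such a scheme, changing the n‑gram size or the exclusion settings produces a different key, so n‑grams saved under one set of options are never mixed with those saved under another.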

Report

The generated report (PDF version):

  • 📑 Highlights the n-grams found in the compared documents

  • 🖌️ Uses different colors to distinguish documents

  • 🔗 Includes a link to the document (and page) within the text

  • 🔢 Shows overall similarity percentage and cumulative similarity percentages per paragraph

  • ✍️ Lists the sentences with the highest similarity, sorted by number of n-grams

  • 🔖 Lists documents sorted by similarity percentage


Important

Overall Similarity Percentage | Interpretation
<10% | Generally acceptable
10–25% | Requires review, especially if concentrated in a few sections and/or longer sentences
>25–30% | Almost always considered problematic

(back to top)

Roadmap

Possible future developments:

  • 🎯 Compatibility with other operating systems (e.g., Linux)
  • 🎯 Graphical highlighting of similar sentences in the original DOCX
  • 🎯 Similarity calculation via:
    • Embeddings with local LLM models
    • Cosine similarity

(back to top)

Contributing

If you have a suggestion for improving this app, you can:

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feat/myfeature)
  3. Commit your Changes (git commit -m "Add myfeature")
  4. Push to the Branch (git push origin feat/myfeature)
  5. Open a Pull Request

Alternatively, you can simply request a feature or report a bug.

(back to top)

License

Distributed under the MIT license.

(back to top)

Author

Andrea Bonfiglio, © Copyright 2026

(back to top)