ZotEye compares a document with the documents in the local Zotero reference library, automatically identifying similar text and overlaps. It helps authors avoid article rejections due to potential plagiarism and assists editors in verifying the originality of submitted manuscripts. The ultimate goal is to enhance transparency and reliability in scientific research.
Note
All similarity analyses are performed locally. No texts, articles, plagiarism results, or reports are uploaded online.
- 🚀 Graphical interface
- 🌎 Multilingual (EN, IT)
- 📚 PDF text extraction
- 🔍 Automatic similarity analysis between a document and the local Zotero reference library
- 🔪 Possibility of excluding sections (e.g., "references") and quoted sentences from analysis
- 🧠 NLP based on n‑gram (words)
- 🔢 Calculation of overall and per-document similarity percentages
- 🧾 Similarity report
- 💾 Local cache to speed up subsequent analyses
- 📅 Automatic updates
The repository structure can be found here:
tree.md
- Windows 11 (should also work on Windows 10)
- Python 3.11–3.14 (Python 3.14.2-amd64 was used to build this app; higher versions were not tested)
- Required Python libraries (see requirements.txt)
- NSIS installer (version 3.11 with INetC plug-in was used for distributing this app when building from source code)
If you are only interested in the executable, go to releases, download, and install the latest version (ZotEye_Installer.exe).
In this case, if Python 3.11-3.14 is not found on the system, Python 3.14.2 (64-bit) and the necessary libraries will be downloaded and installed by the installer.
Warning
ZotEye is a free, open-source application. However, some browsers or security systems may show warnings when downloading or opening ZotEye_Installer.exe:
- In MS Edge, you might see: "ZotEye_Installer.exe is not commonly downloaded. Make sure it is safe before opening it." Click the three dots in the top-right corner and select Keep, then click the three dots in the bottom-right corner of the next window and select Keep Anyway.
- Tools such as Microsoft Defender SmartScreen may also block the installer. Right-click the file, choose Properties, click Unblock, and then OK. Once unblocked, the installer can be run normally.
If you want to view and work on the source code:
- Download and install Python 3.11-3.14 from https://www.python.org/downloads/ (if not already present; during installation, select the option "Add python.exe to PATH" for quick access to Python commands in the command prompt or PowerShell.)
- Clone the repository
git clone https://github.com/abonfiglio73/zoteye.git
- Create a Python virtual environment inside the zoteye folder
cd zoteye
python -m venv venv
- Activate the virtual environment
.\venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
After installation, open the app ZotEye.
To run the application from the source code, use:
python .\main.py
To distribute the application from the source code as an executable (.exe), run the following script:
.\install\build_installer.bat
Make sure that NSIS is installed and that the NSIS path inside the script build_installer.bat is correct:
NSIS_PATH=C:\Program Files (x86)\NSIS\makensis.exe
The script will create the installer file ZotEye_Installer.exe inside the build folder.
Three comparison modes are available:
- target document → single document
- target document → documents in a folder
- target document → local Zotero reference library
The comparison is based on n‑gram (word) matching.
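As a rough illustration of word n-gram matching, the sketch below splits two texts into lowercase word n-grams and reports the share of the target's n-grams that also occur in the reference. The function names (`word_ngrams`, `similarity_percent`), the tokenization regex, and the way the percentage is defined are assumptions made for this example, not ZotEye's actual implementation.

```python
import re


def word_ngrams(text: str, n: int = 4) -> set[tuple[str, ...]]:
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similarity_percent(target: str, reference: str, n: int = 4) -> float:
    """Percentage of the target's n-grams that also appear in the reference."""
    target_grams = word_ngrams(target, n)
    if not target_grams:
        return 0.0  # target shorter than n words: nothing to compare
    shared = target_grams & word_ngrams(reference, n)
    return 100.0 * len(shared) / len(target_grams)


target = "the quick brown fox jumps over the lazy dog"
reference = "a quick brown fox jumps over a sleeping cat"
print(round(similarity_percent(target, reference), 1))  # 33.3
```

Here two of the six 4-grams in the target also occur in the reference, which gives 33.3%.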
The target document can be PDF or DOCX; if DOCX, it will be converted to PDF.
Documents to compare must be PDFs.
In the third mode you need to specify the path to the Zotero database (zotero.sqlite). The path can be found manually or via an automatic search.
Once specified, the collections present in the reference library will be shown. The user may choose one or more collections.
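For readers curious how collections could be read from zotero.sqlite, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the database exposes a `collections` table with `collectionID` and `collectionName` columns and that the file sits in the default `Zotero` folder under the user's home directory; both are assumptions for illustration, and the helper name `list_collections` is hypothetical rather than part of ZotEye.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_collections(db_path: Path) -> list[tuple[int, str]]:
    """Return (collectionID, collectionName) pairs, sorted by name."""
    # Open in read-only mode so the sketch can never modify the library.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT collectionID, collectionName "
            "FROM collections ORDER BY collectionName"
        ).fetchall()
    return [(int(cid), str(name)) for cid, name in rows]


# Assumed default location; adjust if the Zotero data directory was moved.
default_db = Path.home() / "Zotero" / "zotero.sqlite"
if default_db.exists():
    for collection_id, name in list_collections(default_db):
        print(collection_id, name)
```

If Zotero is running, the database may be locked; working on a copy of the file is one way around that.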
Note
Choosing the n-gram size
ZotEye uses 4-word n-grams by default for text similarity in scientific papers. Different sizes can be chosen:
- Small (1–2 words): detects short overlaps, may give false positives.
- Medium (3–5 words): good balance, recommended for general use.
- Large (6+ words): finds exact matches, best for detecting significant reuse.
Start with the default 4-word n-grams; adjust if you want more sensitivity (smaller) or specificity (larger).
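The effect of the n-gram size can be seen with a small experiment. The sketch below counts the distinct word n-grams shared by two sentences for several values of n; the helper `shared_ngram_count` and the sample sentences are made up for illustration and do not reflect ZotEye's internal code.

```python
import re


def shared_ngram_count(a: str, b: str, n: int) -> int:
    """Count distinct word n-grams that occur in both texts."""
    def grams(text: str) -> set[tuple[str, ...]]:
        words = re.findall(r"\w+", text.lower())
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    return len(grams(a) & grams(b))


a = "we measured the effect of temperature on growth rate in the lab"
b = "they measured the effect of humidity on growth rate in the field"

for n in (1, 2, 4, 6):
    print(n, shared_ngram_count(a, b, n))
# 1 8
# 2 7
# 4 3
# 6 0
```

With single words almost everything matches, including common words such as "the" and "of". At six words nothing matches, because the longest shared run in these sentences is five words.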
To speed up subsequent analyses, ZotEye saves n‑grams locally in a database by default. The saved n‑grams depend on the selected options (n‑gram size, excluded sections and quoted sentences). Users can modify this behavior.
Database management can be done in two ways:
- Incremental (default): All databases with n‑grams saved using the active options are loaded. Only n‑grams from new documents that are not already in the databases are stored.
- Non‑incremental: Only a single database is loaded (if it exists): the one saved with the same set of options and containing the n‑grams for the same set of documents. N‑grams from documents already present in other databases may be saved again if those documents belong to a group not yet indexed.
Tip
A non‑incremental approach is useful when you want to limit similarity analysis to specific folders or collections.
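One way to picture why saved n-grams depend on the selected options is a cache identifier derived from those options. The sketch below is purely hypothetical: the names `cache_key` and `documents_to_index`, the option fields, and the hashing scheme are assumptions for illustration and say nothing about how ZotEye actually names or stores its databases.

```python
import hashlib
import json


def cache_key(ngram_size: int, excluded_sections: list[str], exclude_quotes: bool) -> str:
    """Derive a stable identifier for one combination of analysis options."""
    options = {
        "ngram_size": ngram_size,
        # Normalize so that order and letter case do not change the key.
        "excluded_sections": sorted(s.lower() for s in excluded_sections),
        "exclude_quotes": exclude_quotes,
    }
    payload = json.dumps(options, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def documents_to_index(requested: set[str], already_cached: set[str]) -> set[str]:
    """Incremental idea: only documents missing from the cache need indexing."""
    return requested - already_cached


same = cache_key(4, ["References"], True) == cache_key(4, ["references"], True)
different = cache_key(4, ["references"], True) != cache_key(5, ["references"], True)
print(same, different)
print(documents_to_index({"a.pdf", "b.pdf"}, {"a.pdf"}))
```

Changing any option yields a different key, so n-grams cached under one configuration are never mixed with those of another.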
The generated report (PDF version):
- 📑 Highlights the n-grams found in the compared documents
- 🖌️ Uses different colors to distinguish documents
- 🔗 Includes a link to the document (and page) within the text
- 🔢 Shows overall similarity percentage and cumulative similarity percentages per paragraph
- ✍️ Lists sentences with higher similarity sorted by number of n-grams
- 🔖 Lists documents sorted by similarity percentage
Important
| Overall Similarity Percentage | Interpretation |
|---|---|
| <10% | Generally acceptable |
| 10–25% | Requires review, especially if concentrated in a few sections and/or longer sentences |
| >25–30% | Almost always considered problematic |
Possible future developments:
- 🎯 Compatibility with other operating systems (e.g., Linux)
- 🎯 Graphical highlighting of similar sentences in the original DOCX
- 🎯 Similarity calculation via:
- Embeddings with local LLM models
- Cosine similarity
If you have a suggestion for improving this app, you can:
- Fork the Project
- Create your Feature Branch (git checkout -b feat/myfeature)
- Commit your Changes (git commit -m "Add myfeature")
- Push to the Branch (git push origin feat/myfeature)
- Open a Pull Request
Alternatively, you can simply request a feature or report a bug.
Distributed under the MIT license.
Andrea Bonfiglio, © Copyright 2026
