obrad1984/dspace-v510-metadata-simplifier

dspace-v510-metadata-simplifier

(GitHub repository)

DSpace Repository Metadata Entry Tools

Intro

This Python script is designed to simplify and streamline metadata entry for DSpace v5.10 repositories. Manual metadata entry in DSpace can be complex and error-prone; this tool automates common tasks[...]

Although DSpace v5.10 is now deprecated, many institutions still rely on it for their digital collections. This script is dedicated solely to DSpace v5.10, addressing its unique requirements and makin[...]

We invite the community to fork this code, use it for your own needs, and contribute improvements – together, we can make metadata entry for DSpace v5.10 repositories effortless and efficient!

Description

This Python script is designed to facilitate the preparation of bibliographic metadata for import into the institutional repository by transforming and enriching data from a CSV file exported from Sco[...]

  1. User Interactivity via GUI: The script uses a graphical file dialog (via Tkinter) to let the user select the input CSV file (containing publication metadata, including DOIs) and specify the output file location for the processed[...]

  2. Duplicate Checking: For each row in the input CSV, the script checks if the DOI (Digital Object Identifier) already exists in the VinaR repository by sending a POST request to the VinaR REST API. If the DOI already exist[...]

  3. Metadata Enrichment: For every DOI that is not a duplicate, the script queries the Scopus API (via Elsevier’s REST API) to retrieve additional metadata, such as authors, page numbers, publication date, abstract, ISSN/IS[...]

  4. Data Transformation and Output: The retrieved and processed metadata is formatted according to the requirements of the VinaR repository and written as a new row in the output CSV file. The output CSV includes specific headers needed[...]

  5. Error Handling and Logging: The script prints informative messages to the console, such as when DOIs are found to be duplicates, when rows are skipped, or if there are network/API errors.
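The steps above can be sketched as a single row-processing loop. This is a minimal illustration, not the script's actual code: the function name `process_csv`, the callables `is_duplicate` and `build_record`, the `DOI` column name, and the output headers are all assumptions made for the example. The duplicate check and the Scopus lookup are passed in as plain functions, so the sketch makes no claims about the real VinaR or Scopus request formats.

```python
import csv
import io
from typing import Callable, Dict, Sequence, TextIO


def process_csv(
    source: TextIO,
    target: TextIO,
    is_duplicate: Callable[[str], bool],
    build_record: Callable[[Dict[str, str]], Dict[str, str]],
    output_headers: Sequence[str],
    doi_column: str = "DOI",  # assumed column name in the exported CSV
) -> Dict[str, int]:
    """Copy non-duplicate rows from source to target, enriching each one.

    is_duplicate and build_record are hypothetical hooks standing in for the
    repository lookup and the Scopus enrichment described in the list above.
    """
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        target, fieldnames=list(output_headers), extrasaction="ignore", restval=""
    )
    writer.writeheader()
    counts = {"written": 0, "duplicates": 0, "skipped": 0}

    for row in reader:
        doi = (row.get(doi_column) or "").strip()
        if not doi:
            print("Skipped: row has no DOI")
            counts["skipped"] += 1
            continue
        try:
            if is_duplicate(doi):
                print(f"Duplicate: {doi} already exists in the repository")
                counts["duplicates"] += 1
                continue
            record = build_record(row)
        except Exception as exc:  # e.g. a network or API error raised by a hook
            print(f"Skipped: {doi} ({exc})")
            counts["skipped"] += 1
            continue
        writer.writerow(record)
        counts["written"] += 1

    return counts


if __name__ == "__main__":
    # Tiny in-memory demo with stand-in hooks; no network access is needed.
    demo_in = io.StringIO("DOI,Title\n10.1000/x,Example\n")
    demo_out = io.StringIO()
    process_csv(
        demo_in,
        demo_out,
        is_duplicate=lambda doi: False,
        build_record=lambda r: {"dc.identifier.doi": r["DOI"], "dc.title": r["Title"]},
        output_headers=["dc.identifier.doi", "dc.title"],
    )
    print(demo_out.getvalue())
```

Keeping the two network-facing steps behind callables means the loop can be exercised with stand-in functions before any API key or endpoint is configured.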

.env.example File

The .env.example file contains template environment variable definitions required to run the script, such as API keys and endpoint URLs.
Do not store real credentials in .env.example. Instead, copy .env.example to a new file named .env and fill in your actual credentials and settings there.
The script automatically loads variables from your .env file to securely provide access to APIs and services.

Instructions:

  1. Copy .env.example to a new file named .env in the root project directory.
  2. Edit the .env file and enter your API keys, access tokens, and other configuration details as indicated.
  3. Keep your .env file private and do not commit it to version control (e.g., GitHub).
  4. The script will use the variables from your .env file during execution.
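For illustration, loading a .env file can be sketched with the standard library alone. This README does not say which loader the script uses, so the code below is an assumption-laden stand-in, not the script's implementation: the helper names `parse_env_text`, `load_env_file`, and `require_env` are hypothetical, as is the variable name in the usage comment. The variable names your installation needs are the ones listed in .env.example.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        key = key.strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        if key:
            values[key] = value
    return values


def load_env_file(path: str = ".env", override: bool = False) -> Dict[str, str]:
    """Load variables from path into os.environ; return what was parsed."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values


def require_env(name: str) -> str:
    """Return a required variable or fail early with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required setting {name}; check your .env file")
    return value


# Usage (variable name is hypothetical; see .env.example for the real ones):
#   load_env_file(".env")
#   api_key = require_env("EXAMPLE_API_KEY")
```

Failing early on a missing variable keeps a configuration mistake from surfacing later as an opaque API error.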

Approved

The script was tested and approved on VinaR, the institutional repository of the VINCA Institute of Nuclear Sciences - University of Belgrade.

Copyright 2025 Obrad Vučkovac

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

About

This Python script enriches a Scopus-exported CSV file: it skips duplicates via the DSpace API, fetches additional data from the Scopus API, formats the result for repository import, and logs its actions. A user-friendly GUI and an automated workflow streamline the preparation of accurate, duplicate-free metadata.
