sommergeo/roceeh2wiki

roceeh2wiki

This repo contains tools to publish geodata from the ROCEEH Out of Africa Database (ROAD) to Wikipedia maps. The tools help to query data from ROAD's R interface and convert the results to the JSON schema of Wikipedia's map extension Kartographer. The JSON files can be uploaded to Wikimedia Commons and referenced within Wikipedia.

Interactive maps derived from ROAD are reviewed by colleagues with scientific expertise and, once approved, published on Wikipedia. The criterion here is not absolute completeness, as the ROAD database and the state of scientific knowledge are constantly changing, but rather a “representative picture” of the current state of research. The data is updated regularly.

Workflow of the roceeh2wiki package

Results

The following Wikis are currently provided:

ROAD Content | Wikimedia | Wikipedia | Status
Early Stone Age | Link | en de fr it es pt |
Middle Stone Age | Link | en de fr it es pt |
Later Stone Age | Link | en de fr it es pt |
Lower Paleolithic | Link | en de fr it es pt |
Middle Paleolithic | Link | en de fr it es pt |
Upper Paleolithic | Link | en de fr it es pt |
Acheulean | Link | de fr it es pt |
Ahmarian | Link | |
Aterian | Link | en de fr it es pt |
Aurignacian | Link | en de fr it es pt |
Chatelperronian | Link | en de fr it es pt |
Early Upper Paleolithic | Link | en de fr it es pt | not on Wikipedia
Fauresmith | Link | en de fr it es pt |
Gravettian | Link | en de fr it es pt |
Howiesonspoort | Link | en de fr it es pt |
Initial Upper Paleolithic | Link | |
Levantine Aurignacian | Link | |
Micoquian | Link | en de fr it es pt |
Mousterian | Link | en de fr it es pt |
Proto-Aurignacian | Link | en de fr it es pt | not on Wikipedia
Solutrean | Link | en de fr it es pt |
Still Bay | Link | en de fr it es pt |
Uluzzian | Link | en de fr it es pt |
Early fire use | Link | en |
Ochre use | Link | en |
Eyed needle use | Link | en en |
Paranthropus | Link | en |
Paranthropus aethiopicus | Link | | In review
Paranthropus boisei | Link | | In review
Paranthropus robustus | Link | | In review
Sahelanthropus tchadensis | Link | |
Ardipithecus ramidus and kadaba | Link | |
Australopithecus afarensis | Link | |
Australopithecus africanus | Link | |
Homo rudolfensis | Link | |
Homo habilis | Link | |
Homo erectus | Link | |
Homo ergaster | Link | |
Homo heidelbergensis | Link | |
Homo sapiens neanderthalensis | Link | |
Homo sapiens | Link | |

We would like to thank the expert reviewers who helped us by checking the representativeness of the data, suggesting improvements and deciding on publication. In alphabetical order: Abay Namen, Andrew Kandel, Armando Falcucci, Giulia Marciani, Guido Bataille, Liane Giemsch, Rimtautas Dapschauskas, Ron Shimelmitz, and others!

Use

roceeh2wiki.R provides three functions that create geodata in a JSON schema compatible with Kartographer, and one function that appends the relevant metadata:

  • road_query_culture retrieves cultures and technocomplexes from ROAD (e.g. 'Uluzzian')
  • road_query_period retrieves cultural periods from ROAD (e.g. 'Upper Paleolithic')
  • road_query_table processes data in R's native data.table format
  • wiki_json adds metadata compliant with the Kartographer schema to the geodata

The repository is organized as follows:

.
├── scripts
│   ├── roceeh2wiki.R       # Query data from ROAD and custom tables and export to JSON
│   └── roceeh2wiki.py      # Deprecated
├── data
│   └── wiki_cultures.xlsx  # List of ROAD cultures and corresponding wikis
├── input
│   ├── Homo sapiens.xlsx   # Custom list
│   └── ...
└── output
    ├── Acheulean.json      # Wiki-style JSON file
    ├── Aterian.json
    ├── Homo sapiens.json
    └── ...                 # Many more results

Background

Maps in Wikipedia

Web maps are embedded in Wikipedia using a <mapframe> element. The element's text attribute is shown as the map caption and contains a name in the appropriate language, the license, and the source name. The mapframe points to a Wikimedia Commons file referenced by the "title" key.

Screenshot from Wikimedia Commons

<mapframe text="Selected Uluzzian sites from the [https://www.roceeh.uni-tuebingen.de/roadweb ROAD database] (CC BY-SA 4.0 ROCEEH)" width="450" height="350">
{
  "type": "ExternalData",
  "service": "page",
  "title": "ROCEEH/Uluzzian.map"
}
</mapframe>
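
The same element can be generated for any ROAD map. The following is a minimal Python sketch, not part of roceeh2wiki; the function name build_mapframe and its default size are assumptions chosen to mirror the example above.

```python
import json


def build_mapframe(title, caption, width=450, height=350):
    """Return <mapframe> wikitext pointing at a Commons map page.

    Hypothetical helper: `title` is the Commons reference, e.g.
    "ROCEEH/Uluzzian.map"; `caption` becomes the text attribute.
    """
    if '"' in caption:
        # A double quote would terminate the text="..." attribute early.
        raise ValueError("caption must not contain double quotes")
    external = {"type": "ExternalData", "service": "page", "title": title}
    body = json.dumps(external, indent=2)
    return (
        f'<mapframe text="{caption}" width="{width}" height="{height}">\n'
        f"{body}\n"
        "</mapframe>"
    )


if __name__ == "__main__":
    print(build_mapframe("ROCEEH/Uluzzian.map", "Selected Uluzzian sites"))
```

Attributes are separated by spaces only, and the JSON body carries nothing but the reference to the Commons page, so the geodata itself never has to be pasted into the article.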

Geodata in Wikimedia Commons

Geospatial data for Wikipedia is collected in Wikimedia Commons for two reasons. First, depending on the content, GeoJSON files can be too long to be readable in the Wikipedia text editor. Second, content in Wikimedia Commons can be accessed by wikis in any language, without the need for cross-posting.

Screenshot from Wikimedia Commons

URL

All ROAD content follows the URL scheme https://commons.wikimedia.org/wiki/Data:ROCEEH/*.map, where * is the content title. The resulting file is accessible within Wikipedia as ROCEEH/*.map.
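
As an illustration of this scheme, the sketch below derives both forms from a content title. The helper names are hypothetical, and the handling of spaces (underscores in the URL, as MediaWiki does for page titles) is an assumption about titles such as "Homo sapiens".

```python
from urllib.parse import quote

COMMONS_BASE = "https://commons.wikimedia.org/wiki/Data:ROCEEH/"


def commons_url(content_title):
    """URL of the map page on Wikimedia Commons (hypothetical helper)."""
    # MediaWiki URLs use underscores in place of spaces in page titles.
    page = content_title.strip().replace(" ", "_")
    return f"{COMMONS_BASE}{quote(page)}.map"


def wikipedia_reference(content_title):
    """Value for the "title" key of a mapframe (hypothetical helper)."""
    return f"ROCEEH/{content_title.strip()}.map"


if __name__ == "__main__":
    print(commons_url("Uluzzian"))
    print(wikipedia_reference("Uluzzian"))
```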

JSON

The following code is an example from the Uluzzian culture, truncated to a single site, "Uluzzo C", for demonstration purposes. The Kartographer schema uses a JSON file that can be split into general map information and geodata.

  • General map information:
    • The "license" for all ROAD data is CC BY-SA 4.0 and therefore complies with Wikipedia's terms of use.
    • The "description" is displayed as a subheading in Wikimedia Commons. It can be given in several languages to match the title of the target wiki, e.g. English "Uluzzian" vs. German "Uluzzien".
    • The "sources" tag is a default text. The export date is updated automatically.
    • The "zoom", "latitude" and "longitude" tags are optional and can be used to set the initial extent of the map. However, the map engine is smart enough to set an appropriate extent automatically.
  • Geodata:
    • The "data" tag is the heart of Wikipedia's JSON schema and contains a standard GeoJSON file. Most of its content is standardized. The appearance of the popup is defined in the "properties" of the features.
      • The "title" contains the name of the site as exported from ROAD. It is planned to optionally link to other wikis where available.
      • The "description" always contains a link to the site's Summary Data Sheet, a PDF generated with the URL https://www.roceeh.uni-tuebingen.de/roadweb/tcpdf/localityInfoPDF/localityInfoPDF.php?locality=*, where * is the site name. It is planned to optionally include existing Wikipedia images and other content where available.
{
    "license": "CC-BY-SA-4.0",
    "description": {
        "de": "Fundstellen des Uluzzien",
        "en": "Uluzzian sites"
    },
    "sources": "Data retrieved from the [https://www.roceeh.uni-tuebingen.de/roadweb ROCEEH Out Of Africa Database (ROAD)].",
    "zoom": 5,
    "latitude": 41.5,
    "longitude": 16.3,
    "data": {
        "type": "FeatureCollection",
        "name": "uluzzian_road",
        "crs": {
            "type": "name",
            "properties": {
                "name": "urn:ogc:def:crs:OGC:1.3:CRS84"
            }
        },
        "features": [
            {
                "type": "Feature",
                "properties": {
                    "title": "Uluzzo C",
                    "description": "[[File:Grotta di Uluzzo C 4.jpg|150px|alt=Grotta di Uluzzo C]]<br/>[https://www.roceeh.uni-tuebingen.de/roadweb/tcpdf/localityInfoPDF/localityInfoPDF.php?locality=Uluzzo%20C Summary Data Sheet]"
                },
                "geometry": {
                    "type": "Point",
                    "coordinates": [
                        17.96,
                        40.15
                    ]
                }
            }
        ]
    }
}
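
The structure above can be assembled programmatically. The following Python sketch is an illustration only and is not the implementation in roceeh2wiki.R; the function names are hypothetical, and it omits the optional "name" and "crs" members and the image markup for brevity.

```python
import json
from urllib.parse import quote

SHEET_URL = (
    "https://www.roceeh.uni-tuebingen.de/roadweb/tcpdf/"
    "localityInfoPDF/localityInfoPDF.php?locality="
)
SOURCES = (
    "Data retrieved from the [https://www.roceeh.uni-tuebingen.de/roadweb "
    "ROCEEH Out Of Africa Database (ROAD)]."
)


def site_feature(name, lon, lat):
    """One GeoJSON point feature with a link to the Summary Data Sheet."""
    # The site name is percent-encoded, e.g. "Uluzzo C" -> "Uluzzo%20C".
    link = f"[{SHEET_URL}{quote(name)} Summary Data Sheet]"
    return {
        "type": "Feature",
        "properties": {"title": name, "description": link},
        # GeoJSON order is longitude first, then latitude.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


def build_map_json(sites, descriptions, zoom, latitude, longitude):
    """Wrap (name, lon, lat) tuples in the general map information."""
    return {
        "license": "CC-BY-SA-4.0",
        "description": descriptions,
        "sources": SOURCES,
        "zoom": zoom,
        "latitude": latitude,
        "longitude": longitude,
        "data": {
            "type": "FeatureCollection",
            "features": [site_feature(*site) for site in sites],
        },
    }


if __name__ == "__main__":
    doc = build_map_json(
        [("Uluzzo C", 17.96, 40.15)],
        {"de": "Fundstellen des Uluzzien", "en": "Uluzzian sites"},
        zoom=5, latitude=41.5, longitude=16.3,
    )
    print(json.dumps(doc, indent=4, ensure_ascii=False))
```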

ROAD SPARQL endpoint

The ROAD database is implemented as a relational SQL database that can be accessed through a web portal with many tools for querying, analysis and visualization. The database is also regularly exported to RDF files, which can be queried via ROAD's SPARQL endpoint at https://www.roceeh.uni-tuebingen.de/road/.

Screenshot from Wikimedia Commons

SPARQL queries can be submitted through a web interface (http://www.roceeh.uni-tuebingen.de/roadweb/smarty_sparql_select.php), which can export results as an HTML table, JSON, XML, or CSV file. The following example shows a query for archaeological sites associated with the Uluzzian culture, returning their names and geocoordinates. Roceeh2wiki uses the Python library sparql-dataframe to query data directly.

PREFIX road: <https://www.roceeh.uni-tuebingen.de/road/>
PREFIX wgs84_pos: <https://www.w3.org/2003/01/geo/wgs84_pos#>

SELECT DISTINCT ?title ?lon ?lat
WHERE {
  ?x a road:ArchaeologicalLayer.
  ?x road:ArchaeologicalLayer\#archstratigraphyIdArchstrat "Uluzzian".
  ?x road:ArchaeologicalLayer\#localityId ?title.
  ?y a road:Locality.
  ?y road:Locality\#id ?title.
  ?y wgs84_pos:long ?lon.
  ?y wgs84_pos:lat ?lat.
} ORDER BY ?title
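
To reuse this pattern for other cultures, the query text can be parameterized and the exported JSON turned into site tuples. The Python sketch below is an illustration under two assumptions: the helper names are hypothetical, and the JSON export is assumed to follow the W3C SPARQL 1.1 Query Results JSON Format. It selects only the variables that the pattern binds and does not contact the endpoint.

```python
import json

# Same triple pattern as the query above; {culture} is filled in per request.
QUERY_TEMPLATE = """\
PREFIX road: <https://www.roceeh.uni-tuebingen.de/road/>
PREFIX wgs84_pos: <https://www.w3.org/2003/01/geo/wgs84_pos#>

SELECT DISTINCT ?title ?lon ?lat
WHERE {{
  ?x a road:ArchaeologicalLayer.
  ?x road:ArchaeologicalLayer\\#archstratigraphyIdArchstrat "{culture}".
  ?x road:ArchaeologicalLayer\\#localityId ?title.
  ?y a road:Locality.
  ?y road:Locality\\#id ?title.
  ?y wgs84_pos:long ?lon.
  ?y wgs84_pos:lat ?lat.
}} ORDER BY ?title
"""


def build_query(culture):
    """Return the SPARQL text for one culture (hypothetical helper)."""
    # Reject characters that could break out of the string literal.
    if any(ch in culture for ch in ('"', "\\", "\n")):
        raise ValueError("culture name contains unsupported characters")
    return QUERY_TEMPLATE.format(culture=culture)


def parse_sites(results_json):
    """Convert a SPARQL JSON result document into (title, lon, lat) tuples.

    Assumes the standard layout: results -> bindings -> variable -> value.
    """
    doc = json.loads(results_json)
    return [
        (
            row["title"]["value"],
            float(row["lon"]["value"]),
            float(row["lat"]["value"]),
        )
        for row in doc["results"]["bindings"]
    ]


if __name__ == "__main__":
    print(build_query("Uluzzian"))
```

The query text returned by build_query can be pasted into the web interface or passed to a SPARQL client library.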
