RDF Data Cube
Note: I hope some of the content below may be useful as background information for Dan Brickley's Proposal for representing Aggregate Statistical Data (for Schema.org): tweet; schema.org mailing list post; GitHub issue.
2014 The RDF Data Cube Vocabulary W3C Recommendation 16 January 2014 http://www.w3.org/TR/vocab-data-cube/
Notes:
- some key concepts "inherited" from SDMX (e.g. the concept of a dimension) are not fully re-explained in the RDF Data Cube specification: the authors assumed that readers have prior knowledge of SDMX (recommended reading: the 2009 SDMX User guide and the Guidelines for DSDs)
- relationship between QB and the SCOVO / SDMX vocabularies used in the example: the reader does not need to learn the SCOVO parts (which are only there to explain how the working group got there). SCOVO is an earlier attempt to convert SDMX into an RDF vocabulary http://vocab.deri.ie/scovo
- the reader should also know that the use of SDMX vocabularies is optional
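Since the notes above point out that the notion of a dimension is assumed rather than explained, here is a minimal sketch of the idea in plain Python. It is a toy model, not the QB vocabulary or SDMX itself: the dimension names, the measure name, the values and the `select` helper are all made up for illustration. Dimensions identify an observation; the measure is the value observed at that point.

```python
from typing import Dict, Tuple

# Hypothetical dimensions: together they identify one observation.
DIMENSIONS = ("refArea", "refPeriod", "sex")
# Hypothetical measure: what is actually being observed.
MEASURE = "lifeExpectancy"

Key = Tuple[str, ...]

# One entry per observation, keyed by one value per dimension.
# The numbers are invented for this sketch.
observations: Dict[Key, float] = {
    ("Cardiff", "2004-2006", "Male"): 78.0,
    ("Cardiff", "2004-2006", "Female"): 83.0,
    ("Newport", "2004-2006", "Male"): 77.0,
    ("Newport", "2004-2006", "Female"): 81.0,
}


def select(obs: Dict[Key, float], **fixed: str) -> Dict[Key, float]:
    """Return the observations whose dimension values match `fixed`."""
    unknown = set(fixed) - set(DIMENSIONS)
    if unknown:
        raise KeyError(f"not a dimension: {sorted(unknown)}")
    # Map each fixed dimension name to its position in the key tuple.
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        key: value
        for key, value in obs.items()
        if all(key[i] == v for i, v in positions.items())
    }


# Fixing one dimension yields a sub-cube over the remaining dimensions.
males = select(observations, sex="Male")
```

Fixing some dimensions and letting the others vary is roughly what the specification calls a slice; a data structure definition plays the role of the `DIMENSIONS` / `MEASURE` declarations here.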
Discussion on W3C mailing list:
- Why this kind of Data Structure Definition http://lists.w3.org/Archives/Public/public-gld-comments/2012Aug/0001.html
Richard Cyganiak, Simon Field, Arofan Gregory, Wolfgang Halb, Jeni Tennison (2010) Semantic Statistics: Bringing Together SDMX and SCOVO http://ceur-ws.org/Vol-628/ldow2010_paper03.pdf and http://events.linkeddata.org/ldow2010/slides/ldow2010-slides-cyganiak.pdf via https://twitter.com/cygri/status/1143793438850277378
Hausenblas et al (2009) SCOVO: Using Statistics on the Web of Data https://mhausenblas.info/pubs/eswc09-inuse-scovo.pdf
Draft specification, SDMX vocabularies: https://github.com/UKGovLD/publishing-statistical-data content exported from code.google.com/p/publishing-statistical-data
Vrandecic et al (2010) Semantics of Governmental Statistics Data https://pdfs.semanticscholar.org/ef95/49c863db64c43fc4d12d71d37fe5b85f77a3.pdf
See also the SDMX standards from which RDF Data Cube is derived (below) and the SDMX standards and ontologies page.
2013 Validation: requirements and approach: https://www.w3.org/2001/sw/wiki/images/9/92/RDFVal_Reynolds.pdf
Jena-based RDF Data Cube Validator (dependency on jena-libs 2.1.2) https://github.com/yyz1989/NoSPA-RDF-Data-Cube-Validator
Not for the code, but with some good examples provided: https://github.com/weso/Computex-RDF-Data-Cube-Validator (in the ABS environment, requires Scala Play 2.1); see also the attachments in https://lists.w3.org/Archives/Public/public-gld-comments/2013Jul/0001.html
http://rdfvalidator-rdfvalidation.rhcloud.com/rdfvalidator/ related to (?) this GeoKnow report http://svn.aksw.org/projects/GeoKnow/Public/D4.6.1_Quality_assessment_services_for_ESTA-LD.pdf
2009 SDMX User guide https://sdmx.org/wp-content/uploads/sdmx-userguide-version2009-1-71.pdf Document introducing the concepts of data cube, dimensions/measures, DSD ...
Simon Field (2010) SDMX and the semantic web: implications for publishers of statistical data http://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.50/2010/wp.27.e.pdf a snapshot of the ideas behind RDF Data Cube at the very beginning of its development (only useful if interested in the linkage to what's in SDMX)
2013 SDMX guidelines for DSDs https://sdmx.org/wp-content/uploads/SDMX_Guidelines_for_DSDs_1.0.pdf
2016: Linked Data Cubes: research results so far http://semstats.org/2016/content/linked-data-cubes-research-results-so-far/article.pdf a good recap by OpenCube people
Epimorphics, the company of Dave Reynolds (one of the authors of the RDF Data Cube specification), has shared lots of information on their UK Bathing Waters project http://environment.data.gov.uk/index.html
- Technical doc: http://environment.data.gov.uk/bwq/doc/api-reference.html best practice used as guidance for CSIRO's work on the Linked Sensor Data Cube / ACORN-SAT project
- Payment Data Cube https://data.gov.uk/resources/payments also by Epimorphics, is another good introductory example (the Ontology Hub view of the ontology is available: http://tsdcoderepository/repos/abs-repos/MDMD/EDM/trunk/ontologies/tutorial/payment/payment-doc.html )
A mix of XSLT-based transformations and bash scripts to access and transform SDMX 2.0 content hosted by .Stat Web Services.
- Linked SDMX Data http://csarven.ca/linked-sdmx-data and Statistical LOD cloud https://270a.info/
- Sarven Capadisli, Sören Auer, Reinhard Riedl Towards Linked Statistical Data Analysis: paper - http://ceur-ws.org/Vol-1549/article-06.pdf and slides - http://csarven.ca/presentations/lsd-analysis/ Best paper award at 1st Semantic Statistics workshop ISWC 2013, Sydney
- SWJ paper: Sarven Capadisli, Sören Auer, Axel-Cyrille Ngonga Ngomo, Linked SDMX Data http://semantic-web-journal.net/content/linked-sdmx-data page with reviews
Multiple GitHub sub-repositories (first wave circa 2013, 2nd wave circa 2015)
- https://github.com/csarven/linked-sdmx (2013) The original 270a.info demo (.xsl and data)
- worldbank-linkeddata ; ecb-linked-data ; abs-linked-data; 270a.info.interlinks linksets created using LIMES; stats.270a.info ...
- Linked Statistical Data (LSD) Cube Designer; Semantic Similarity and Correlation of Linked Statistical Data Analysis; LSD-analysis R: Shiny/Shiny server
Example of documentation about the .Stat (SDMX) web services used by Csarven to access data
More recent documents from the ".Stat community" (if you want to catch up with the current status of the .Stat stack)
- .Stat suite (new Beta open source - June 2019) https://siscc.org/ @sisccommunity dotstatsuite-documentation
- Eurostat SDMX-RI web services ; sdmx-it-tools
- SDMX Technical Standards Working Group https://github.com/sdmx-twg
- OECD Digital Practices and Solutions https://github.com/cis-itn-oecd
- SDMX-JSON https://github.com/TerriaJS/terriajs/issues/1514 (TerriaJS = National Geospatial Data catalogue/map project)
- Lefort, L., Bobruk, J., Haller, A., Taylor, K. and Woolf, A., 2012, November. A linked sensor data cube for a 100 year homogenised daily temperature dataset. ISWC Semantic Sensor Network workshop (SSN 2012) paper: http://ceur-ws.org/Vol-904/paper10.pdf and slides: https://www.slideshare.net/140er/linked-sensor-data-cube Best paper award SSN 2012
- Lefort, L., Haller, A., Taylor, K., Squire, G., Taylor, P., Percival, D. and Woolf, A., 2017. The ACORN-SAT linked climate dataset. http://www.semantic-web-journal.net/content/acorn-sat-linked-climate-dataset-0 SWJ paper (reviews)
- https://data.gov.au/dataset/ds-dga-900143f6-6582-49c5-bfd4-0838901d99c8/details
Leroux, H. and Lefort, L., 2015. Semantic enrichment of longitudinal clinical study data using the CDISC standards and the semantic statistics vocabularies. Journal of biomedical semantics, 6(1), p.16. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-015-0012-6
Lefort, L. and Leroux, H., 2013. Design and generation of Linked Clinical Data Cubes. In SemStats@ ISWC http://ceur-ws.org/Vol-1549/article-05.pdf and slides https://www.slideshare.net/140er/semantic-stats2013
LATC: LOD Around-The-Clock: EU research project (2010-2012)
- Publishing EuroStat as Linked Data on the Web https://github.com/LATC/EU-data-cloud/tree/master/institutions/Eurostat
2016 PhUSE CS Semantic Technology Working Group RDF Data Cube Structure Technical Guidance September 2016 http://www.phusewiki.org/wiki/images/e/e0/ARM-CubeStructureTechSpec-V-1-0.pdf
- points to Csarven's URI patterns http://csarven.ca/linked-sdmx-data#uri-patterns
- R-RDF Data Cube for Clinical Research & Development https://github.com/phuse-org/rrdfqbcrnd rrdfqb RDF data cube R interface (depends on knitr)
Digital Economy and Society Index http://digital-agenda-data.eu/datasets/desi#download
Part I - The approach of DG CONNECT's digital-agenda-data repository and visualization tool https://circabc.europa.eu/sd/a/3c11e025-ed24-4627-9e87-a79ed8b82c9d/Session1_2_DG_CONNECT_Abruzzini-Melis_DAD_presentation_text.docx.
Data: http://semantic.digital-agenda-data.eu/dataset/DESI (via ELDA API http://semantic.digital-agenda-data.eu/meta/ ) https://github.com/digital-agenda-data/rdf
PWC report (also architecture) https://joinup.ec.europa.eu/sites/default/files/digital_agenda_data_tool_on_your_desktop_-_how_to.pdf Technical docs by Eau de web (2016) http://digital-agenda-data.eu/documentation/technical-report-m6-2016 and http://digital-agenda-data.eu/documentation/deployment-manual
Evangelos Kalampokis, Bill Roberts, Areti Karamanou, Efthimios Tambouris, Konstantinos Tarabanis (2015) Challenges on Developing Tools for Exploiting Linked Open Data Cubes http://ceur-ws.org/Vol-1551/article-07.pdf and http://semstats.org/2015/content/challenges-on-developing-tools-for-exploiting-linked-open-data-cubes/slideshow.pdf
OpenCube project http://opencube-project.eu/ + http://opencube-toolkit.eu/ (and now the OpenGovIntelligence project) has developed lots of tools based on the RDF Data Cube specification
- https://github.com/opencube-toolkit: r2rm-qb aggegation
- https://github.com/OpenGovIntelligence json-qb-api-implementation
OpenGovIntelligence https://github.com/OpenGovIntelligence
Swirrl https://github.com/swirrl
Developments on top of the RDF Data Cube vocabulary
- W3C Data Quality vocabulary https://www.w3.org/TR/vocab-dqv/
- QB4OLAP (see below)
- QB4ST https://w3c.github.io/sdw/qb4st/
Etcheverry and Vaisman (2012) QB4OLAP: a Vocabulary for Business Intelligence over the Semantic Web http://ceur-ws.org/Vol-905/EtcheverryAndVaisman_COLD2012.pdf
- Extends QB with levels and aggregate functions.
- QB4OLAP https://github.com/lorenae/qb4olap/wiki
Kämpgen and Harth (2011) Transforming Statistical Linked Data for Use in OLAP Systems http://www.aifb.kit.edu/images/2/28/Kaempgen_harth_isem11_olap.pdf I-SEMANTICS 2011
Implementation: https://github.com/OpenGovIntelligence/json-qb-api-implementation API-Spec (work in progress) https://github.com/OpenGovIntelligence/json-qb/blob/master/spec/old-api-spec.md Table format https://github.com/OpenGovIntelligence/json-qb/blob/master/spec/table-format.md
Corresponding report from OpenGovIntelligence project (November 2016) http://www.opengovintelligence.eu/downloads/deliverables/OGI_D3.2_Report_on_OpenGovIntelligence_ICT_tools_-_first_release_v1.0.pdf dependency on json-stat
JSON-Stat https://json-stat.org/format/
- https://github.com/badosa/JSON-stat
- http://bl.ocks.org/badosa/8c1d7f29b5a6f6ee03eb and https://www.slideshare.net/badosa/consuming-nordic-statbank-data-with-jsonstat
- rjstat https://cran.r-project.org/web/packages/rjstat/index.html
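For readers new to JSON-stat, the following is a minimal sketch of how a cell can be looked up in a JSON-stat 2.0 dataset, where the `value` array is laid out in row-major order following the `id` and `size` arrays. The sample document and the `category_position` / `lookup` helpers are made up for illustration; the sketch assumes `value` is a dense array (the format also allows a sparse object form) and that every dimension carries an explicit category `index`.

```python
import json

# Hypothetical two-dimensional dataset (2 areas x 3 years), invented for this sketch.
doc = json.loads("""
{
  "version": "2.0",
  "class": "dataset",
  "id": ["area", "year"],
  "size": [2, 3],
  "dimension": {
    "area": {"category": {"index": ["A", "B"]}},
    "year": {"category": {"index": {"2014": 0, "2015": 1, "2016": 2}}}
  },
  "value": [10, 11, 12, 20, 21, 22]
}
""")


def category_position(dataset, dim_id, category_id):
    """Position of a category within its dimension."""
    index = dataset["dimension"][dim_id]["category"]["index"]
    # "index" may be an array of ids or an object mapping id -> position.
    if isinstance(index, dict):
        return index[category_id]
    return index.index(category_id)


def lookup(dataset, **coords):
    """Return the value at the given coordinates (one category id per dimension)."""
    offset = 0
    # Row-major order: the last dimension in "id" varies fastest.
    for dim_id, size in zip(dataset["id"], dataset["size"]):
        offset = offset * size + category_position(dataset, dim_id, coords[dim_id])
    return dataset["value"][offset]


print(lookup(doc, area="B", year="2015"))  # prints 21
```

The same dimension/measure split as in RDF Data Cube is visible here: the `id` entries are the dimensions, and `value` holds the observed figures.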
Noting some aspects of it are aligned with the Nanopublication Guidelines http://nanopub.org/guidelines/working_draft/
- Leigh Dodds work on an ONS JSON LD Context (containing qb:Dimension and qb:codeList) https://github.com/ldodds/ons-metadata-examples
"number of examples demonstrating how metadata about ONS datasets can be described using standard formats and vocabularies"