Decentralized Assessment of FAIR datasets.
Assessing the quality and reliability of FAIR datasets through crowdsourcing of expertise.
The pilot is based on DEIP’s own deep-tech innovation, the Decentralized Assessment System (DAS). DAS is a peer-review system that uses an incentive model with reputation rewards and produces a quantifiable metric of the quality and reliability of any data set being assessed. DAS is designed specifically for the assessment of assets in expertise-intensive areas, such as scientific research. DAS introduces a comprehensive and robust assessment model:
- it sources consensus on the quality of data sets among domain experts through continuous two-level peer review;
- it ensures fair rewards for contributions and curation efforts;
- it formalizes the result of assessment into explicit metrics/indicators useful for non-experts.
A core element of the assessment system is continuous two-level peer review. This novel peer-review model addresses the weaknesses of existing peer review, enabling more accurate curation of data sets and assessment of their quality and reliability. The peer-review process includes two levels of assessment: a review at the first level, and review curation (support) at the second level.
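The two-level model described above can be sketched in code. The following is an illustrative sketch only, not the actual DAS implementation: all names, weights, and the aggregation formula are assumptions. Level-1 reviews carry a quality score; level-2 curations support or dispute a review; a data set's metric is a reputation-weighted aggregate of the review scores.

```python
# Illustrative sketch of two-level peer review (names and formula are
# assumptions, not the actual DAS API).
from dataclasses import dataclass, field

@dataclass
class Curation:          # level 2: an expert weighs in on a review
    curator: str
    weight: float        # > 0 supports the review, < 0 disputes it

@dataclass
class Review:            # level 1: an expert review of a data set
    reviewer: str
    score: float         # quality judgement in 0.0..1.0
    curations: list = field(default_factory=list)

def dataset_metric(reviews, reputation):
    """Aggregate reviews into one quality metric, weighting each review
    by its author's reputation plus reputation-weighted curation support."""
    num = den = 0.0
    for r in reviews:
        w = reputation.get(r.reviewer, 1.0)
        w += sum(reputation.get(c.curator, 1.0) * c.weight
                 for c in r.curations)
        w = max(w, 0.0)  # a heavily disputed review can lose all weight
        num += w * r.score
        den += w
    return num / den if den else None

reviews = [
    Review("alice", 0.9, [Curation("carol", +1.0)]),
    Review("bob", 0.4, [Curation("dave", -0.5)]),
]
rep = {"alice": 2.0, "bob": 1.0, "carol": 1.5, "dave": 1.0}
print(dataset_metric(reviews, rep))  # reputation-weighted consensus
```

The resulting single number is the kind of explicit metric that, per the model above, remains interpretable for non-experts.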
In expertise-intensive domains, such as scientific research, a researcher’s reputation is what matters most. DAS introduces reputational rewards to promote unbiased, high-quality reviews.
The incentives system is built upon the following assumptions:
1) the scientific community will eventually reach a consensus about the quality and reliability of any data set;
2) pioneer supporters should get more rewards and recognition.
Given these assumptions, the incentives system is designed to encourage researchers to perform comprehensive, high-quality, and unbiased reviews that reveal the actual strengths and weaknesses of the knowledge and technology under review, and to be pioneers in assessing knowledge and technology that has not yet been assessed.
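The two assumptions above translate naturally into a reward rule. The sketch below is a hypothetical illustration, not the actual DAS formula: a review earns a reputational reward only if the eventual community consensus confirms its judgement (assumption 1), and the reward decays with review order so pioneers earn the most (assumption 2). The parameter names and values are assumptions.

```python
# Hypothetical incentive rule (illustrative parameters, not the DAS formula).
def reputation_reward(order, reviewer_score, consensus_score,
                      base=10.0, decay=0.5, tolerance=0.2):
    """Reward for the `order`-th reviewer of an item (0 = first).

    Pays nothing if the review disagrees with the eventual consensus
    beyond `tolerance`; otherwise pays `base`, halved for each later
    position in the review order."""
    if abs(reviewer_score - consensus_score) > tolerance:
        return 0.0  # biased or outlier review: no reputational reward
    return base * (decay ** order)

print(reputation_reward(0, 0.9, 0.85))  # pioneer, confirmed by consensus
print(reputation_reward(2, 0.9, 0.85))  # later reviewer, smaller reward
print(reputation_reward(0, 0.2, 0.85))  # contradicted by consensus
```

Under such a rule, rushing out a careless score is unprofitable: a review that the consensus later contradicts earns nothing, while an accurate early review earns the largest share.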
The pilot use case of the DAS will provide a potential step towards implementing a peer-review infrastructure for FAIR data sets. The expected benefit of this pilot is aligned with the EOSC Interest Groups on Researcher Engagement and Use Cases. We will implement and launch a collaborative platform for reviewing FAIR data sets, which uses DAS to judge the reliability of data sets and enables a novel incentive system for review. The pilot will test reputation-based incentives for reviewers: how they affect the quality and objectivity of reviews, and how efficiently they prevent various review-gaming techniques. We will work closely with a thematic EOSC community that has FAIRified data sets (e.g. ENVRI FAIR) and engage researchers from this community in the testing.
The first step is to survey research communities to discover which criteria are important when assessing data set reliability, and what the assessment process looks like within these communities. The survey results will be used to adjust and configure the system for the pilot.
The next step is to customize and deploy the DAS platform. This includes deployment and configuration of the DAS, development of a separate portal for data assessment, and running agent-based simulations to determine the optimal parameters of the DAS model depending on the population of the selected discipline(s) and the number of participating projects.
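To make the simulation step concrete, here is a deliberately minimal sketch of the kind of agent-based simulation mentioned above. Everything in it is an assumption for illustration: reviewer agents produce noisy judgements of a simulated "true" data set quality, and we sweep one parameter (the reviewer-population size) to see how estimation error responds. A real parameter study would sweep the actual DAS model parameters.

```python
# Minimal agent-based simulation sketch (illustrative, not the DAS model):
# noisy reviewer agents score data sets; we measure how far the simple
# consensus estimate lands from the simulated ground truth.
import random

def simulate(n_reviewers, noise, n_datasets=50, seed=0):
    """Mean absolute error of the consensus estimate over many data sets."""
    rng = random.Random(seed)
    total_err = 0.0
    for _ in range(n_datasets):
        truth = rng.random()  # simulated true data set quality in 0..1
        scores = [min(1.0, max(0.0, truth + rng.gauss(0, noise)))
                  for _ in range(n_reviewers)]
        estimate = sum(scores) / len(scores)  # simple consensus: the mean
        total_err += abs(estimate - truth)
    return total_err / n_datasets

# Sweep the reviewer-population size to find a workable minimum.
for n in (1, 3, 10, 30):
    print(n, round(simulate(n, noise=0.2), 3))
```

Even this toy version shows the shape of the trade-off the pilot needs to quantify: error falls as the reviewer population grows, so the simulation can suggest how many participants per discipline the model requires to be reliable.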
The following step is to onboard researchers into the pilot’s data creation and assessment processes. This includes reaching out to scientific communities and providing education and training on how to use the DAS (via webinars). We plan to work closely with communities that already have FAIRified data sets (e.g. ENVRI FAIR).
Finally, we will perform a number of activities to adjust the system and support its adoption and dissemination. These activities include analyzing the efficiency, future possibilities, and scalability of DAS as applied to FAIR data assessment; evaluating how effectively it mitigates different gaming techniques; and creating a visualization of activity metrics within the system during the pilot. Our team will also analyze the assessment results, discuss with scientific communities what additional metadata may be useful, and determine ways to establish a sustainable process for extracting this information and enriching metadata in accordance with domain-specific community standards.