Artificial Intelligence for rare disease diagnosis.

Assessing the probability that Gaucher disease patients develop further diseases.

The Spanish Foundation for the Study and Treatment of Gaucher Disease and other Lysosomal Diseases (FEETEG) promotes scientific research into Gaucher disease and its treatments. The Foundation is interested in predicting the probability that patients with Gaucher disease develop other diseases, such as neoplasms or Parkinson’s disease (correlations between diseases). For this purpose, FEETEG contacted Kampal Data Solutions to develop an advanced analytical model based on Artificial Intelligence, using the information available in the Gaucher Spanish Disease Registry.


Because Gaucher disease is a rare disease with few national registries, the collected data set is small, and a local computer had enough computational power to study the correlations with other diseases.

The challenge now is to build a new model able to estimate the probability that a person will develop Gaucher disease. In this case, the AI model must include data not only from current Gaucher disease patients but also from healthy individuals. Extending the sample universe to healthy individuals increases the sample size by several orders of magnitude (from hundreds to millions) and potentially the model’s complexity. This calls for advanced computational resources such as the cloud platform provided by EOSC.

Although this proof of concept focuses on Gaucher disease, the solution could be adapted in the future to databases for other diseases. Kampal Data Solutions will exploit the resulting general-purpose solution in the medium term.

Work Plan

In the context of the EOSC-hub project, Kampal Data Solutions will carry out the following tasks:

  • Import the registries of healthy individuals and patients into a database on the EOSC infrastructure.
  • Statistically analyse the data and develop a classification model based on machine learning techniques.
  • Optimize the machine learning algorithm for a cloud-based environment.
  • Validate the model performance and produce plots and charts of diverse KPIs.
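
The validation step can be illustrated with a minimal sketch. The work plan does not say which KPIs will be reported, so the metrics below (accuracy, precision, recall, specificity, F1) are assumptions, as are the function name classification_kpis and the toy data. The sketch shows why accuracy alone would mislead once healthy individuals vastly outnumber patients: a model that labels everyone as healthy scores almost perfectly on accuracy while detecting no patients.

```python
from dataclasses import dataclass


@dataclass
class Kpis:
    accuracy: float
    precision: float
    recall: float        # also called sensitivity
    specificity: float
    f1: float


def _ratio(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_kpis(y_true, y_pred) -> Kpis:
    """Compute binary-classification KPIs (label 1 = patient, 0 = healthy).

    Hypothetical helper for illustration; not the project's actual code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return Kpis(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        specificity=_ratio(tn, tn + fp),
        f1=_ratio(2 * precision * recall, precision + recall),
    )


# Hypothetical toy data: 2 patients among 1,000 records, scored against
# a trivial model that predicts "healthy" for every record.
y_true = [1, 1] + [0] * 998
always_healthy = [0] * 1000
baseline = classification_kpis(y_true, always_healthy)
print(baseline)
```

On this toy data the trivial model is right on 998 of 1,000 records yet has a recall of zero, which is why recall and precision on the patient class are likely to be more informative KPIs than accuracy for a rare disease.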