Validation of the SAFAN-ISP technology for in silico profiling of small molecules and peptides, and comparison with other technologies.
About the Pilot
S.A.F.AN. BIOINFORMATICS is a small bioinformatics company based in Torino, Italy, with the mission of reducing the cost and increasing the efficiency of bringing new drugs to market. It was founded in 2004 after a business plan competition organized by the Politecnico of Turin. Since then it has collaborated with most Italian pharmaceutical companies. In recent years we have devoted our time to internal research in order to develop SAFAN-ISP, a new fragment-based in silico screening and profiling technology. SAFAN-ISP now works on small molecules and on peptides. It belongs to the ligand-based family of prediction technologies, which rest on the assumption that “Chemical compounds with similar structures may have similar activities”.
It uses an in-house developed molecular similarity algorithm and predicts target affinity by selecting and scoring compounds from an affinity database. It runs with the time and cost of an in silico method, yet its reliability is comparable to that of an experiment. Our customers can use SAFAN-ISP for:
1. Drug repositioning, using the target:disease database.
2. Side effect prediction, using the target:side effect database.
3. Target identification in phenotypic screening outputs.
4. Potential therapeutic indications for natural compounds.
In drug repositioning projects, rare diseases are specifically highlighted. SAFAN-ISP technology can be complemented by the company’s structural bioinformatics know-how in order to understand the molecular details of ligand binding. S.A.F.AN. BIOINFORMATICS has been part of the H2020 CaSR Biomedicine European Training Network and participated in the Newprot FP7 European project.
For the 193328 compounds analyzed, 231206 experimental affinity data points are available from BindingDB (https://www.bindingdb.org). SAFAN-ISP was able to predict 137722 compound:target interactions, i.e. about 60%. Like all similarity methods, SAFAN-ISP needs similar compounds in order to produce a reliable prediction. For the remaining 40% of the affinity data, there were no similar compounds in the SAFAN-ISP affinity database.
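The coverage figure quoted above follows directly from the two counts in the text. A minimal sketch of that arithmetic (the variable names are illustrative and not part of SAFAN-ISP):

```python
# Counts reported in the text above.
n_affinity_data = 231206  # experimental affinity data available from BindingDB
n_predicted = 137722      # compound:target interactions SAFAN-ISP could predict

# Fraction of the experimental data for which a prediction was obtained.
coverage = n_predicted / n_affinity_data

# Data points left without a prediction (no similar compound in the database).
n_unpredicted = n_affinity_data - n_predicted

print(f"coverage: {coverage:.1%}")
print(f"without prediction: {n_unpredicted} ({1 - coverage:.1%})")
```

The exact ratio is slightly below the rounded 60% given in the text.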
For 2599 affinity data points no prediction could be obtained from compound similarity, so a more sophisticated method was applied to obtain predictions from molecular fragments. The root mean square error (RMSE) for this dataset was 1.341. The table shows that for more than 90% of the sample analyzed the similarity is less than 0.7, so the datasets are substantially different. It is also very interesting that for similarities between 0.5 and 0.7 the RMSE is close to 1: even when the similarity is very low, the error on the affinity prediction is still within one order of magnitude of the available experimental data. How does the prediction error compare with the dispersion of the experimental data? For about 10% of the compound:target affinities BindingDB reports more than one value. In those cases it is possible to evaluate how much experimental results from different teams or setups can differ. The average difference, expressed as RMSE, is 0.4, and in about 10% of the cases it is more than 1. The following graph compares the dispersion of the experimental data with the errors of the predictions.
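The two error measures discussed above can be sketched as follows. This is only an illustration under stated assumptions: affinities are assumed to be expressed in log10 units (so that an RMSE of 1 corresponds to one order of magnitude), the dispersion of repeated measurements is assumed to be computed from pairwise differences within each compound:target pair, and all numeric values are hypothetical. The source does not specify how either figure was actually computed.

```python
import math
from itertools import combinations


def rmse(predicted, experimental):
    """Root mean square error between paired values in the same log10 units."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def replicate_rmse(replicates):
    """RMS of pairwise differences among repeated measurements.

    `replicates` holds one list of measured values per compound:target pair
    (an assumed way of quantifying experimental dispersion).
    """
    squared = [(a - b) ** 2
               for values in replicates
               for a, b in combinations(values, 2)]
    if not squared:
        raise ValueError("need at least one pair with two or more measurements")
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted vs. experimental affinities (log10 units).
predicted = [6.0, 7.5, 5.0, 8.0]
experimental = [6.5, 7.0, 6.0, 8.0]
prediction_error = rmse(predicted, experimental)

# Hypothetical repeated measurements for three compound:target pairs.
replicates = [[6.5, 6.9], [7.0, 7.2, 6.8], [8.0, 8.3]]
experimental_dispersion = replicate_rmse(replicates)

print(f"prediction RMSE: {prediction_error:.2f}")
print(f"experimental dispersion (RMSE): {experimental_dispersion:.2f}")
```

Putting both numbers on the same scale is what allows the prediction error to be read against the spread between independent experiments.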
HOW DID S.A.F.AN. BIOINFORMATICS GET INVOLVED WITH EOSC DIH?
We were introduced to EOSC and EOSC DIH by our partner CINECA.
WHAT SERVICES WERE USED?
We entered the project because we needed computational time for an extensive validation of our core product, SAFAN-ISP. EOSC allowed us to run it on CINECA computational resources (DICE).
ABOUT THE EXPERIENCE OF WORKING WITH EOSC DIH AND THE VALUE OF EOSC FOR THE PILOT:
Beyond the achievements mentioned above, which were very important for validating our technology, participating in an EOSC DIH pilot was very stimulating for us.
The computational time granted to us let us perform the BindingDB validation, which was very important for outlining the strengths and limits of the SAFAN-ISP technology.
The validation on the BindingDB dataset will be used to promote the SAFAN-ISP technology among biotech and pharmaceutical companies.