Various studies have applied semantic techniques to facilitate data integration and electronic data interchange in biomedicine. However, applications to biobanks have been limited so far. Owing to the growing amount of data and research collaboration in this area, there is an increasing demand for data integration and harmonization at both the semantic and the technical level, which is reflected in ongoing research activities. Ontologies and standards have been implemented to share and harmonize biobank data in common IT platforms, enabling biobank administrators to make their biobanks available to the public and thereby improving the availability of relevant samples and transnational research collaboration.
This thesis investigates computer-assisted, semantic approaches for comparing, integrating and sharing biobank data across heterogeneous IT systems and databases, and provides several IT-based solutions, which are described in the studies presented in this thesis. The first research question was how to identify, standardize and share biobank data resources from a distributed, hospital-wide biobank in a common research infrastructure. To this end, an existing international standard data model and a healthcare IT analysis method were applied to this context. Based on these methods, we implemented an extended standard data model, which was used in a common hospital-wide biobank registry for sharing sample collections stored in different databases and information systems. Secondly, we explored the applicability of natural language processing and query expansion techniques to the evaluation of (bio-)medical ontologies for the biobanking domain. This resulted in a semi-automated approach for evaluating (bio-)medical ontologies based on competency questions. A third study explored the applicability of standard medical terminology concepts for annotating free-text data within biobank platforms; here, we implemented a prototype of a graph-based, semi-automated concept recommendation. The fourth study addressed the challenges of harmonizing and electronically interchanging data from regional biobanks in common national and European research infrastructures.
In this thesis, we demonstrate that semantic data harmonization approaches from the field of bioinformatics can be transferred to the biobanking domain. Furthermore, we are confident that the approaches demonstrated in this thesis can be reused in biobanking and in other fields of medical research.