He received his master's degree in computer science from the University of Twente in 1992 and completed his PhD on formal operation definition in object-oriented databases in 1997. His research targets robustness in data science, focusing on two main threats to its reliability: data quality and undesirable machine learning behaviour. On data quality, he works on data integration, semi-structured data, natural language processing, and the data quality issues that arise in each. He co-developed MonetDB/XQuery, one of the most scalable XML database systems of its time. He also proposed Probabilistic Data Integration, a data integration approach that fundamentally incorporates the handling of uncertain and lower-quality data, and developed DuBio, a probabilistic database system for the scalable storage, manipulation, and management of such uncertain data. On undesirable machine learning behaviour, he focuses on Explainable AI; a notable result is ProtoTree, an intrinsically explainable deep learning approach. He is secretary of the executive board of the EDBT Association (Extending Database Technology). He is the (co-)author of about 200 publications, which have accumulated about 2000 citations.
Engineering & Materials Science
# Data Integration # Metadata # Ontology # Radiology # Semantics # Uncertainty # Xml
# Breast Cancer
Sohail, S. A., Bukhsh, F. A., van Keulen, M., Krabbe, J. G., & Hruby, P. (2022). Evaluating Clinical-Care Metadata Share and its FAIRification using the REA Ontology. In H. Weigand, T. Prince Sales, & P. Johanesson (Eds.), VMBO 2022, Value Modelling and Business Ontologies 2022: Proceedings of the 16th International Workshop on Value Modelling and Business Ontologies (VMBO 2022), held in conjunction with the 34th International Conference on Advanced Information Systems Engineering (CAiSE 2022), June 06–10, 2022, Leuven, Belgium (CEUR Workshop Proceedings; Vol. 3155). CEUR. http://ceur-ws.org/Vol-3155/paper3.pdf
Dijkstra, J-J. (2021). Zero-downtime schema changes. University of Twente.
Kippers, R., Koeva, M. N., van Keulen, M., & Oude Elberink, S. J. (2021). Automatic 3D building model generation using deep learning methods based on CityJSON and 2D floor plans. In L. Truong-Hong, E. Che, F. Jia, S. Emamgholian, D. Laefer, & A. V. Vo (Eds.), The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (Vol. XLVI-4-W4, pp. 49-54). (International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences). Copernicus. https://doi.org/10.5194/isprs-archives-XLVI-4-W4-2021-49-2021
Sohail, S. A., Bukhsh, F. A., & van Keulen, M. (2021). Multilevel privacy assurance evaluation of healthcare metadata. Proceedings (MDPI), 11(22). https://doi.org/10.3390/app112210686
Mauritz, R. R., Nijweide, F. P. J., Goseling, J., & van Keulen, M. (2021). Autoencoder-based cleaning in probabilistic databases. arXiv.org. https://arxiv.org/abs/2106.09764
Sohail, S. A., Bukhsh, F. A., van Keulen, M., & Krabbe, J. G. (2021). Identifying Materialized Privacy Claims of Clinical-Care Metadata Share using Process-Mining and REA ontology. 111-120. Paper presented at 15th International Workshop on Value Modelling and Business Ontologies, VMBO 2021, Virtual Workshop. http://ceur-ws.org/Vol-2835/paper12.pdf
Nguyen, E., Theodorakopoulos, D., Pathak, S., Geerdink, J., Vijlbrief, O., van Keulen, M., & Seifert, C. (2021). A Hybrid Text Classification and Language Generation Model for Automated Summarization of Dutch Breast Cancer Radiology Reports. In 2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI) (pp. 72-81). IEEE. https://doi.org/10.1109/CogMI50398.2020.00019
Provoost, J. C., Kamilaris, A., Wismans, L. J. J., van der Drift, S. J., & van Keulen, M. (2020). Predicting parking occupancy via machine learning in the web of things. Internet of Things, 12, 100301. https://doi.org/10.1016/j.iot.2020.100301
Bellatreche, L., Bentayeb, F., Bieliková, M., Boussaid, O., Catania, B., Ceravolo, P., Demidova, E., Halfeld Ferrari, M., Lopez, M. T. G., Hara, C. S., Kordić, S., Luković, I., Mannocci, A., Manghi, P., Osborne, F., Papatheodorou, C., Ristić, S., Sacharidis, D., Romero, O., ... Zumer, M. (2020). Databases and Information Systems in the AI Era: Contributions from ADBIS, TPDL and EDA 2020 Workshops and Doctoral Consortium. In L. Bellatreche, M. Bieliková, O. Boussaïd, J. Darmont, B. Catania, E. Demidova, F. Duchateau, M. Hall, T. Mercun, M. Žumer, B. Novikov, C. Papatheodorou, T. Risse, O. Romero, L. Sautot, G. Talens, & R. Wrembel (Eds.), ADBIS, TPDL and EDA 2020 Common Workshops and Doctoral Consortium - International Workshops: DOING, MADEISD, SKG, BBIGAP, SIMPDA, AIMinScience 2020 and Doctoral Consortium, Proceedings (pp. 3-20). (Communications in Computer and Information Science; Vol. 1260). Springer. https://doi.org/10.1007/978-3-030-55814-7_1
Ruis, F., Pathak, S., Geerdink, J., Hegeman, J. H., Seifert, C., & van Keulen, M. (2020). Human-in-the-loop Language-agnostic Extraction of Medication Data from Highly Unstructured Electronic Health Records. In 20th International Conference on Data Mining Workshops 2020 IEEE EDS.