Jan 2026

Abstract:
The complete blood count (CBC) is one of the most fundamental
clinical tests worldwide. It is low cost, globally standardised, and
widely used to diagnose haematological and non-haematological
conditions. Automated CBC instruments analyse blood by impedance and
flow cytometry, producing approximately 17 parameters. With 3.6 billion
tests performed annually, the CBC is central to routine healthcare, yet
most clinicians routinely consider only haemoglobin concentration and
total white cell and platelet counts, leaving the remaining parameters
to haematology specialists.
Blood cell parameters are remarkably stable within individuals, reflecting personalised setpoints largely determined by genetics. Genome-wide association studies have identified thousands of DNA variants associated with CBC parameters, most located in non-coding regions. Recent work has shown that raw flow-cytometric scattergrams contain far more information than the summary parameters stored in electronic health records, including signatures related to thrombotic risk and cardiovascular disease.
These insights, together with the discovery of SARS-CoV-2 signatures in CBC scattergrams, led to the founding of the BloodCounts! consortium in 2021. The consortium uses raw flow-cytometric data, over 300 intermediate CBC variables, and blood smear images to develop deep generative foundation models for blood, unlocking information that is routinely discarded.
Using proof-of-principle models (DeepCBC for CBC data and CytoDiffusion for blood smear images), BloodCounts! researchers have demonstrated substantial diagnostic potential. The models capture representations of biological sex, early pregnancy, blood group differences, iron-deficiency anaemia, cancer, and infection type, often outperforming existing CBC-based diagnostics.
To ensure global equity, the consortium developed FedMap, a federated learning platform enabling hospitals across Africa, India, Europe, and Singapore to co-develop local models while retaining data in trusted environments. Together, these advances show that foundation models of blood can markedly enhance the diagnostic and predictive power of the CBC at minimal cost and without disrupting clinical workflows.
Bio:
Willem H Ouwehand received his PhD and medical degree from the
University of Amsterdam in 1984. After a five-year stint at the Sanquin
laboratory in Amsterdam, he accepted an academic position at the
University of Cambridge. He is now emeritus Professor of Experimental
Haematology and holds Honorary Consultant Haematology positions at
Cambridge University Hospitals, University College London Hospitals and
NHS Blood and Transplant. He is a Fellow of the Academy of Medical
Sciences and emeritus NIHR Senior Investigator.
His research on rare diseases focussed on inherited haemostasis disorders. As one of the founders of the NIHR BioResource, Willem led the NIHR BioResource whole genome sequencing pilot study for Rare Diseases for the 100,000 Genomes Project. With Professor Nicole Soranzo, he unravelled the genetic architecture of blood cell formation through genome-wide association studies of complete blood count (CBC) parameters in large-scale population studies, including the blood donor health cohorts INTERVAL, COMPARE and STRIDES.
With Assistant Professor Nicholas Gleadall, he founded the Blood transfusion Genomics Consortium (www.bgc.io), with participating institutes in 14 countries, which has developed and validated an affordable array-based DNA test for high-throughput blood cell antigen typing, with the aim of improving the matching between donor and recipient. He also supports Nicholas Gleadall and Professor Michael Roberts in the BloodCounts! consortium project (www.bloodcounts.org), which develops foundation models for blood using the CBC and blood smear images from millions of patients in The Gambia, Ghana, The Netherlands, Singapore and the UK. Altogether, foundation models of blood will enhance the operational efficiency of the CBC test and amplify its diagnostic, predictive, and inferential power, without disrupting clinical pathways, and at minimal cost.
Willem lives in Cambridge and London and is married to Sally, with whom he has two daughters.