Postgraduate research

Predicting diabetes-related complications with machine learning techniques

Qualification: PhD

Application deadline: 31 May 2024

Start date: 23 September 2024 or 1 October 2024



  • Dr Sharmin Shabnam
  • Dr Francesco Zaccardi  
  • Professor Kamlesh Khunti

Project description

Diabetes mellitus is characterised by chronic hyperglycaemia, which is associated with a higher risk of cardiovascular complications. Regular monitoring, management, and control of risk factors such as glycated haemoglobin (HbA1c), blood pressure, and lipids, keeping them within the recommended range, are critical in helping individuals maintain overall health and well-being and in reducing the risk of further complications. Continuous monitoring generates a large number of intra-individual longitudinal observations of blood glucose levels, which can be used to track disease progression and predict diabetes-related complications.

Recently, the rapid development of machine learning methods has led to their application in many areas of healthcare-related research. This PhD project aims to apply a range of statistical models and machine learning (ML) algorithms (including classification and regression trees, support vector machines (SVM), k-nearest neighbours, gradient boosting machines, and supervised principal component analysis) to predict various diabetes-related complications and to develop a risk stratification system that categorises patients with diabetes into risk groups, enabling personalised interventions and treatment plans to mitigate complications. The student will undertake statistical analyses using the Clinical Practice Research Datalink (CPRD) database, which includes anonymised patient data from a network of GP practices across England, to identify key features (e.g., age, gender, ethnicity, and diabetes duration) that contribute to the risk of diabetes complications.

The student will be embedded within a team of experts in clinical diabetes, epidemiology, data science, and statistics, and will receive training in a broad range of ML and statistical methods for analysing cross-sectional and longitudinal real-world data, as well as in methods for prognostic research (the development and validation of predictive models).

The PhD project will be integrated into a vibrant postgraduate research community within the Real-World Evidence Unit and the Diabetes Research Centre, University of Leicester, and will help advance the aims of the National Institute for Health and Care Research (NIHR) Leicester Biomedical Research Centre (BRC) and the NIHR Applied Research Collaboration (ARC) East Midlands.





ARC-funded studentship available to UK and overseas* applicants.

The funding includes:

  • Home/UK tuition fees for 3 years
  • 3-year stipend at standard UKRI rates (for 2023/4, the annual stipend is £18,622)
  • Research Training Support Grant (RTSG) of £1,500 per year for 3 years

*International students can apply if they are able to fund the difference between UK and Overseas fees. For 2023/4 the difference in fees will be £17,138 per year.

Entry requirements

Applicants are required to hold, or expect to obtain, a UK Bachelor's degree at 2:1 or better in a relevant subject.

You will need a good honours degree and a Master’s degree in Statistics, Biostatistics or Data Science.

The University of Leicester English language requirements apply where applicable.

Informal enquiries

Project/funding enquiries to the project supervisor.

Application enquiries to

How to apply

To apply, please use the 'Apply' button at the bottom of this page and select May 2024.

We will amend the start date at the point of offer when a start date has been agreed.

With your application, please include:

  • CV
  • Personal statement explaining your interest in the project, your experience and why we should consider you
  • Degree certificates and transcripts for study already completed and, if possible, a transcript to date for any study currently being undertaken
  • Evidence of English language proficiency if applicable
  • In the reference section, please enter the contact details of your two academic referees in the boxes provided, or upload letters of reference if already available
  • In the funding section, please specify PHS ARC Shabnam
  • In the proposal section, please provide the names of the supervisors and the project title (a proposal is not required)



UK and overseas applicants can apply.

Applicants who hold EU Settled or Pre-Settled status may be eligible for UK fees. Please email us a share code so that we can verify your status (the share code we need starts with S).
