Interpretable Machine Learning for Predicting Multiple Sclerosis Conversion from Clinically Isolated Syndrome (2024)

Type of publication:
Journal article

Author(s):
Daniel E.C.; Tirunagari S.; Batth K.; Windridge D.; *Balla Y.

Citation:
medRxiv. (no pagination), 2024. Date of Publication: 19 Jul 2024. [preprint]

Abstract:
Background: Machine learning (ML) prediction of clinically isolated syndrome (CIS) conversion to multiple sclerosis (MS) could be used as a remote, preliminary tool by clinicians to identify high-risk patients that would benefit from early treatment. Objective(s): This study evaluates ML models to predict CIS to MS conversion and identifies key predictors. Method(s): Five supervised learning techniques (Naive Bayes, Logistic Regression, Decision Trees, Random Forests and Support Vector Machines) were applied to clinical data from 138 Lithuanian and 273 Mexican CIS patients. Seven different feature combinations were evaluated to determine the most effective models and predictors. Result(s): Key predictors common to both datasets included sex, presence of oligoclonal bands in CSF, MRI spinal lesions, abnormal visual evoked potentials and brainstem auditory evoked potentials. The Lithuanian dataset confirmed predictors identified by previous clinical research, while the Mexican dataset partially validated them. The highest F1 score of 1.0 was achieved using Random Forests on all features for the Mexican dataset and Logistic Regression with SMOTE Upsampling on all features for the Lithuanian dataset. Conclusion(s): Applying the identified high-performing ML models to the CIS patient datasets shows potential in assisting clinicians to identify high-risk patients.
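For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score is computed from true and predicted labels. It is not the authors' code: the function name `f1_score`, the `positive` parameter and the label values are assumptions chosen for illustration. An F1 score of 1.0 arises only when there are no false positives and no false negatives on the evaluated data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall).

    Illustrative helper only; names and label encoding are assumptions.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = converted to MS, 0 = did not convert.
perfect = f1_score([1, 1, 0, 0], [1, 1, 0, 0])
partial = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```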

Link to full-text [open access - no password required]

A stroke-like presentation due to Balo concentric sclerosis (2018)

Type of publication:
Conference abstract

Author(s):
*Albuidair A.

Citation:
European Stroke Journal; May 2018; vol. 3, Supp 1

Abstract:
Background and Aims: A young woman presented with a 'stroke-like' episode subsequently found to be due to a rare form of multiple sclerosis, Balo concentric sclerosis (BCS). Method: A literature search was conducted (5/1/2018) using the key words 'Balo concentric sclerosis' and 'Stroke', finding only 30 PubMed and 5 Medline references respectively. Few case reports exist of such a presentation. Results: The Hungarian neuropathologist Josef Balo published a case report in 1928 of a young man with a new hemiparesis who was found at autopsy to have lesions described as encephalitis periaxialis concentrica. With the advent of MRI, imaging characteristically shows an onion ring or whorled appearance. Recently it has been classified as lying within the spectrum of atypical idiopathic inflammatory demyelinating disorders, and in practice is considered a form of relapsing-remitting MS. It is more common in Chinese and Filipino populations, with an estimated 2:1 female predilection and ongoing uncertainty as to the relative role of genetic or environmental predispositions. We describe a 33-year-old lady presenting acutely with left arm heaviness, incoordination and paraesthesia. She had no vascular risk factors and no relevant past medical or family history. MRI confirmed a classical BCS ringed lesion within the white matter of the right frontal gyrus. Lumbar puncture showed raised lymphocytes and oligoclonal bands. Conclusion: Stroke-like presentations are not uncommonly found to be due to MS but rarely of the atypical BCS type. BCS shows a characteristic onion ring appearance on MRI.

Validating the portal population of the United Kingdom Multiple Sclerosis Register (2018)

Type of publication:
Journal article

Author(s):
Middleton R.M.; Rodgers W.J.; Akbari A.; Tuite-Dalton K.; Lockhart-Jones H.; Griffiths D.; Noble D.G.; Jones K.H.; Ford D.V.; Chataway J.; Schmierer K.; Rog D.; Galea I.; Al-Din A.; Craner M.; Evangelou N.; Harman P.; Harrower T.; Hobart J.; Husseyin H.; Kasti M.; Kipps C.; McDonnell G.; *Owen C.; Pearson O.; Rashid W.; Wilson H.

Citation:
Multiple Sclerosis and Related Disorders; Aug 2018; vol. 24 ; p. 3-10

Abstract:
The UK Multiple Sclerosis Register (UKMSR) is a large cohort study designed to capture 'real world' information about living with multiple sclerosis (MS) in the UK from diverse sources. The primary source of data is directly from people with Multiple Sclerosis (pwMS), captured by longitudinal questionnaires via an internet portal. This population's diagnosis of MS is self-reported and therefore unverified. The second data source is clinical data, which is captured from MS Specialist Treatment centres across the UK. This includes a clinically confirmed diagnosis of MS (by McDonald criteria) for consented patients. A proportion of the internet population have also been consented at their hospital, making comparisons possible. This dataset is called the 'linked dataset'. The purpose of this paper is to examine the characteristics of the three datasets: the self-reported portal data, clinical data and linked data, in order to assess the validity of the self-reported portal data. The internet (n = 11,021) and clinical (n = 3,003) populations were studied for key shared characteristics. We found them to be closely matched for mean age at diagnosis (clinical = 37.39, portal = 39.28) and gender ratio (female %, portal = 73.1, clinical = 75.2). The Two Sample Kolmogorov-Smirnov test was used for the continuous variables to examine whether they were drawn from the same distribution. The null hypothesis was rejected only for age at diagnosis (D = 0.078, p < 0.01). The populations, therefore, were drawn from different distributions, as there are more patients with relapsing disease in the clinical cohort. In all other analyses performed, the populations were shown to be drawn from the same distribution. Our analysis has shown that the UKMSR portal population is highly analogous to the entirely clinical (validated) population. This supports the validity of the self-reported diagnosis and therefore that the portal population can be utilised as a viable and valid cohort of people with Multiple Sclerosis for study.
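As an illustration of the D statistic quoted in the abstract above, here is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic: the largest absolute gap between the empirical cumulative distribution functions of two samples. The function name `ks_statistic` and the sample values are assumptions for illustration, not the Register's analysis code, and the sketch does not compute a p-value.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the two-sample KS statistic D = max |F_a(x) - F_b(x)|.

    Illustrative helper only; it returns the statistic, not a p-value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    d = 0.0
    # Both empirical CDFs are step functions that change only at observed
    # values, so the maximum gap is attained at one of those values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical ages at diagnosis for two small groups (not study data).
portal_ages = [31, 35, 38, 40, 44]
clinical_ages = [29, 33, 36, 39, 42]
d_value = ks_statistic(portal_ages, clinical_ages)
```

A value of D near 0 means the two empirical distributions almost coincide, while D = 1 means the samples do not overlap at all.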

Clinical Validation of the UKMS Register Minimal Dataset utilising Natural Language Processing (2016)

Type of publication:
Poster presentation

Author(s):
Rod Middleton, Ashley Akbari, Hazel Lockhart-Jones, Jemma Jones, *Charlotte Owen, Stella Hughes, Richard Gain, David Ford

Citation:
IPDLNC 2016

Abstract:
Objectives
The UK MS Register is a research project that aims to capture real world data about living with Multiple Sclerosis (MS) in the UK. When it launched in 2011, the identified data sources were: directly from People with MS (PwMS) via the internet, from NHS treatment centers via ‘traditional’ database capture, and by linkage to routine datasets from the SAIL databank. Data received from the NHS, though ‘gold standard’ in terms of diagnosis, is dependent on clinical staff finding both time and information to enter into a clinical system. System implementations across the NHS are variable, as is clinical time. Therefore, we looked to other complementary methodologies.

Approach
The Clix enrich natural language processing (NLP) software was chosen to see if it could capture a portion of the MS Register minimum clinical dataset. The software matches clinical phrases against SNOMED-CT. 40 letters, from 2 NHS Trusts, from 28 patients were loaded. The letters were a mix of MS patients with differing disease subtypes and were dictated by Neurologists, Specialist General Practitioners and MS Specialist Nurses. 20 of the letters were in docx format and 20 as PDF.
The letters were parsed by a domain expert for clinical content and scored by data item for sensitivity and specificity. Next, the output from the software was scored by another researcher to see if the 12 relevant clinical concepts from the Register dataset had been elicited. Lastly, a ruleset was created to look for particular clinical concepts and scored in the same way.
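The per-item scoring described above can be sketched as follows. This is a minimal illustration, not the study's scoring procedure: the function name `sensitivity_specificity` and the example flags are hypothetical. The expert's reading of each letter is treated as the reference, and the software output is compared against it for a single data item.

```python
def sensitivity_specificity(expert, software):
    """Compare software output with expert annotation for one data item.

    Each argument is a list of booleans, one per letter: True if the item
    was judged present. Returns (sensitivity, specificity); a value is None
    when its denominator is zero. Illustrative helper only.
    """
    if len(expert) != len(software):
        raise ValueError("inputs must cover the same letters")

    tp = fp = tn = fn = 0
    for truth, predicted in zip(expert, software):
        if truth and predicted:
            tp += 1  # item present and found by the software
        elif truth and not predicted:
            fn += 1  # item present but missed
        elif predicted:
            fp += 1  # item absent but reported
        else:
            tn += 1  # item absent and not reported

    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Hypothetical flags for one data item across five letters (not study data).
expert_flags = [True, True, True, False, False]
software_flags = [True, True, False, True, False]
sens, spec = sensitivity_specificity(expert_flags, software_flags)
```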

Results
Of the 40 letters, one failed to load; the rest were analysed for the specific data items. Date-related items were clearly challenging, with only 7% of appointment dates being matched and 22% for date of diagnosis. MS Type (93.3%) and EDSS score (93.75%) were well recognised. Additionally, symptoms of MS that would be poorly reported in traditional databases were recognised, with fatigue being well highlighted (78.5%), as were gait and walking issues (68.7%). Of concern were a number of false positive results for DMTs, with 15% of patients being identified as being on a DMT when this was just being ‘considered’.

Conclusion
The NLP pathway could be extremely useful for obtaining hard-to-capture clinical data for the Register. Further work is needed to reduce errors, but even with the current minimal configuration it is possible to ascertain MS Type, functional score of MS, current medication and potentially disabling symptomatology within the condition.