Cross-department collaboration at the Science and Technology Facilities Council (STFC) leads to the development of a machine learning-based solution for super-fast analysis of results, write Rebecca Humble and Shikha Gianchandani.

Where analysis of a materials science experiment previously took several years, it can now be done in a matter of days.

The confluence of different technologies is driving the fourth industrial revolution, with data-driven discovery at its heart. Machine learning (ML) is steadily changing the landscape of everything from day-to-day interactions to the digital, physical, and biological worlds. A novel idea conceived at the Science and Technology Facilities Council tested whether an ML technique could be applied to a materials science experiment.

Success through collaboration

A collaboration between Scientific Computing’s Keith Butler and the ISIS Neutron and Muon Source’s Toby Perring and Duc Le has led to the first-ever attempt to interpret Inelastic Neutron Scattering (INS) data using ML techniques, and to justify the reliability of those results.

INS measures the atomic and magnetic motions in a single crystal of a complex material. The material used in the experiment was a half-doped manganite - a type of perovskite material. Perovskite materials exhibit attractive physical and chemical features (such as mobility of oxide ions and electronic conductivity) that make them favourable for research and commercialisation in areas such as battery technologies.

The aim of this particular experiment was to determine which of two competing models for the attractions between the magnetic atoms in the material was the correct one. However, collecting and analysing INS data is a laborious process.

An experiment that originally took up to three years to interpret by combing through the data by hand can now be analysed with convolutional neural networks (CNNs), a popular ML technique that produces the same results in weeks while still retaining reliable and interpretable information.

So how does the CNN work?

Well, computational scientists create neural networks that update and improve their performance as they are exposed to more training data. CNNs are commonly used in applications such as the facial recognition that allows you to unlock your phone just by looking at it. And just as our human brains can recognise objects at a glance, CNNs allow a computer to do the same thing.
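As a rough illustration only (not the code used at STFC, whose architecture and data are different), a small image-classifying CNN might be sketched in Python with a library such as PyTorch like this, with convolutional layers that pick out local patterns and a final layer that scores each possible answer:

# Illustrative sketch only - a tiny CNN that sorts 2D "images" into two
# classes. The layer sizes and names here are hypothetical, not the STFC code.
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # Convolutional layers pick out local patterns (the "features")
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # A final layer turns those features into a score for each class
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# One 64x64 single-channel image in, two class scores out
scores = SimpleCNN()(torch.randn(1, 1, 64, 64))
print(scores.shape)  # torch.Size([1, 2])

During training, the network compares its scores with the right answers and nudges its internal weights so that it does better next time.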

When humans first look at an object, we don’t take the whole thing in. What we do take in are certain features that allow us to decide what the object is - CNNs do exactly the same thing. For example, if we were looking at a cat and a dog, we could instantly tell the difference.

So how do we distinguish between them? They both have similar features: one tail, four legs and two ears. Well, as we have developed, we have learnt to identify the features that allow us to tell these two things apart, and many more besides! That’s exactly what the computer algorithm has also had to learn.

Initially, the scientists had to tell the computer what it was looking at - in this case not cats or dogs, but samples of either type A or type B. They not only had to program it in a way that it could learn, but also so that it could explain why a sample was type A or type B. So, when it came to feeding in the real data, it could distinguish and classify the different materials, then feed the results back to the scientists, telling them which features of the collected data were most important for making that decision.
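To give a flavour of what that might look like in code, here is a hedged Python sketch of a training loop and one common way of asking a network which parts of the input mattered (a gradient-based saliency map). The model, data and interpretability method below are illustrative stand-ins, not the published STFC analysis:

# Illustrative sketch only - random tensors stand in for labelled examples
# of "type A" (0) and "type B" (1); the real work trains on simulated INS data.
import torch
import torch.nn as nn

model = nn.Sequential(                     # a stand-in classifier
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 2),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(64, 1, 64, 64)        # hypothetical training images
labels = torch.randint(0, 2, (64,))        # their type A / type B labels

for epoch in range(10):                    # learn from the labelled examples
    optimiser.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()                        # adjust weights to reduce mistakes
    optimiser.step()

# Ask which pixels most influenced the decision for one sample:
sample = images[:1].clone().requires_grad_(True)
scores = model(sample)
scores[0, scores.argmax()].backward()      # gradient of the winning score
saliency = sample.grad.abs().squeeze()     # 64x64 map of "importance"
print("Predicted type:", "A" if scores.argmax().item() == 0 else "B")

The idea is the same as in the experiment described here: the network not only classifies the data but also points to the regions that drove its decision, which is what makes the results interpretable.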

This meant that, instead of taking the team at ISIS years to classify their data, the computer could provide the answers in just a couple of weeks! As a result, they can go from doing the laboratory experiment to publishing a final paper within two months, rather than up to three years, and, importantly, it also allows them to understand the results.

The ML technique was developed on the STFC Scientific Computing platforms PEARL, SCARF and the STFC Cloud. The code is written in Python and has been made open access. The next steps are to develop the technique so that it can be applied to a range of experiments, at which point it can become a tool in any scientist’s analysis toolbox, benefiting the entire scientific community.