National Crystallography Service – Case Study

Working with a Research Software Engineer makes life so much easier for a scientist, says Simon Coles, director of the UK’s National Crystallography Service (NCS). 

The NCS serves chemistry researchers across the UK and receives more than 1,000 samples a year. These have to be managed, prioritised, analysed and returned to the right people with the information they need. The service’s Portal2 system, developed by the Southampton Research Software Group (SRSG), allows the NCS to do that smoothly and to collect the usage data generated. 

“We began working on this back around the advent of the internet, just seeing what was possible,” says Coles. 

“We were an exemplar project on an EPSRC, RCUK eScience project, and we explored a whole bunch of things to see what worked and what didn’t. By 2008, we had homed in on what the essence of the project was, and got funding to let us build something that worked. The staff here have to really focus on the science, so I needed to bring in at least one software developer.”

That original developer was part of the crystallography team. “He basically took the previous pilot projects and brought them together to give us something that worked. And so we kind of limped along with this collection of software that kind of worked for seven or eight years, though there were so many workarounds that it was difficult to operate,” Coles says. 

Eventually, however, Coles’ funders requested changes to the way things were done, and it became clear that the NCS could not continue with this ‘Frankensteined-together’ system. 

“I’ve worked with [commercial] software engineers, and when you have to spend half a day explaining what a molecule is, it’s quite tough.”

Prof. Simon Coles – Director, National Crystallography Service

“That was when we got in touch with the Southampton Research Software Group. I had been aware of them from our previous work with the Software Sustainability Institute [SRSG Co-Director Simon Hettrick is also Deputy Director of the SSI], and had known [SRSG Co-Director] John Robinson for years. They were able to readily pick up our specifications, work with us to understand exactly what we need to do, and then iterate with us on different parts of the software,” he says. 

“They took our Frankenstein’s monster and worked with us to build a system that worked for us. And working with the SRSG’s RSEs is ideal, in that we’re dependent on this technology and our needs can change quite dramatically and quickly. There’s no way I could fund someone on my own team to be on top of that, but with the SRSG I’m able to dip into a pool of expertise. I can just say ‘Can we add this new functionality, to make it do this new thing?’ when I have the funds. It lets me resource it and keep it current, on an as-required basis,” Coles says.

The RSE team at the SRSG mixes easily with Coles’ team at the NCS and quickly picks up what is needed, he says. 

“They embed themselves in the scientific research teams and really are able to get to grips with what we do on a fundamental level. That means the product you get is actually tailored very well to what you do. You don’t have those agonising conversations where you’re trying to translate each other. We discovered, over quite a long, painful period of time, that you can be saying the same thing in two different languages and kind of miss each other… That doesn’t happen with the RSEs,” he says. 

“With the SRSG I’m able to dip into a pool of expertise. I can just say ‘Can we add this new functionality, to make it do this new thing?’”

Prof. Simon Coles – Director, National Crystallography Service

The SRSG’s focus on sustainable software has also been a real benefit to the NCS, Coles says. 

“With the previous system, there was one guy on the planet who knew what to do with it. And he’s long-gone, into a well-paid job, and hasn’t really the time to talk to us! So to have the SRSG ensure everything is on a common development platform, using established methodologies and tools and languages – to have that rigour in the whole systems design is great. 

“It’s not just about maintaining the system on its current tracks, either. Making changes and additions is part of sustainability too, in keeping the product not just technically current but functionally current,” Coles says. 

Plans for the future of the software involve improved data management, and potentially offering the code to other people who could make use of it. 

The NCS gathers enormous amounts of data from its experiments and much of it is currently lost, or at least unused, says Coles. Researchers don’t have time to think about how it can be stored and shared as they go about their daily work. 

“We are one of the highest volume, highest throughput facilities in the world, and we can’t keep up with what we should be doing with the data. All this data is accumulating in clouds and on hard drives and all over the place, and 75% of outputs just aren’t getting out into the public domain. All that really gets out is what’s associated with a journal article, and I can assure you, that’s a fraction of what actually gets done. 

“So you need a system to manage that. We’ve started talking about it and we’re trying to get some money to work on this. The system puts us in a good position to do it, to take out data and make sure it’s in a standard, reusable format, and then make it available. We can either send it to specific organisations who use or collate that sort of data, or even just make it openly available,” he says. 

Finally, the NCS and the SRSG are looking at what can be done with the software that has been developed. It became clear during the development process that the product was actually quite generic and could be used by other facilities and organisations. 

“We worked on the generic functionality and made that the kernel of the thing – and then added the tailored parts that make it work for us. But the principle is that we could make this more widely available,” Coles says.

Chemistry departments in universities will soon be able to make use of at least part of the code, he says, and other sciences may have a use for it. 

Coles would be happy for the software to be widely used, he says. 

“I don’t claim any ownership. I’m quite happy we’ve got something that works, and happy as long as I can say ‘This started here’. I’d stress that it was developed, trialled and tested here at NCS, but after that, I’m keen for it to be exploited by others.”