HPC Case Study: Deep learning identification of Crannogs

Aerial view of a Scottish crannog.
Image source: Professor Fraser Sturt

There are hundreds of crannogs in the wilds of Scotland, but they’re not easy to find – even when you know what you’re looking for.

These elusive historical structures – which rest in bodies of water, connected to the shore by bridges or causeways – have been part of the Scottish landscape for thousands of years. The oldest known example, Eilean Dòmhnuill, is located in the distant reaches of the Outer Hebrides, and dates to between 3200 and 2800 BC.

Searching for crannogs might seem like an esoteric pursuit, but data on their number and whereabouts allows archaeologists to build a picture of human behaviour and demographics across a vast geographical area and timescale. 

This enormous scope is what makes them so difficult to locate.

Not only are crannogs tucked away in some of the remotest parts of the UK, but they also come in many different forms and states of repair. That is to be expected of a dwelling type that was invented before the construction of the Great Pyramid of Giza and remained in production until around the time that Newton was establishing the laws of motion.

This is how academics working with the University of Southampton are managing to track down Scotland’s crannogs using the Iridis 5 high-performance computing (HPC) cluster.

Background

Dr Alexandra Karamitrou

Dr Alexandra Karamitrou, a geophysicist specialising in remote sensing and archaeology, has been working with the University of Southampton’s Department of Archaeology to identify crannogs in Scotland using a combination of imagery, artificial intelligence, fieldwork, and HPC.

“The first – and biggest – part of the project,” begins Dr Karamitrou, “was to find out how many islets there are in Scotland”.

This was no mean feat. Scotland is home to more than 31,000 lochs, so the number of islets they collectively contain isn’t the sort of knowledge even the most industrious researcher can obtain in the field. Dr Karamitrou and her collaborators – Professor Fraser Sturt, Professor Duncan Garrow, Dr Stephanie Blankshein, and Mrs Angela Gannon – needed visual and topographical data to work from. 

It came in three different forms: aerial photography, which offers the sharpest resolution at 25cm; Colour-Infrared (CIR) imagery, which gives a resolution of 50cm; and radar imagery from the Sentinel-1 satellite, which can only manage a resolution of around 5m. 

Using these resources, along with freely available and extensive (though incomplete) coastline data, Dr Karamitrou was able to create a complete visual overview of Scotland, representing the entire country and its lochs at varying degrees of resolution.

Each type of visual data came with its own opportunities and challenges. While aerial photography produced the highest resolution imagery, it tended to also capture glinting sunlight, algae, and shallow water, leading to misidentifications by the algorithm. Radar imagery, on the other hand, was simply too low resolution to reliably capture objects smaller than 5m.
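To put those resolutions in perspective, the short calculation below works out how many pixels each source needs to cover one square kilometre, and how many pixels a 5m-wide object spans in each. The resolution figures come from the article; the dictionary keys and function names are illustrative, not taken from the project.

```python
# Ground resolution of each imagery source, in centimetres per pixel.
# Figures are from the article; the key names are illustrative only.
RESOLUTION_CM = {
    "aerial": 25,
    "cir": 50,
    "sentinel1_radar": 500,
}

CM_PER_KM = 100_000


def pixels_per_square_km(resolution_cm):
    """Number of pixels needed to cover one square kilometre."""
    pixels_per_side = CM_PER_KM // resolution_cm
    return pixels_per_side ** 2


def pixels_across(object_width_cm, resolution_cm):
    """Whole pixels spanned by an object of the given width."""
    return object_width_cm // resolution_cm


for name, resolution in RESOLUTION_CM.items():
    # Print pixels per square kilometre, then pixels across a 5m (500cm) object.
    print(name, pixels_per_square_km(resolution), pixels_across(500, resolution))
```

A 5m object is 20 pixels wide in the aerial photography but only a single pixel wide in the radar imagery, which is why anything smaller cannot be picked out reliably. The same arithmetic shows the cost of sharpness: the aerial imagery needs 400 times as many pixels per square kilometre as the radar.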

To distinguish water from land, meanwhile, the team had to use differences in colour to binarise the imagery. The test here was to binarise at the correct sensitivity so as to avoid visual noise while accurately distinguishing islets from the water around them.
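The idea can be sketched in a few lines. This is a minimal illustration, not the team's code: it assumes a single brightness value per pixel rather than full colour, uses a made-up toy grid, and the names `binarise`, `label_islets`, `find_islets` and `TOY_LOCH` are hypothetical. A real pipeline would work on full image tiles with an array library.

```python
from collections import deque


def binarise(grey, threshold):
    """Mark each pixel True (land) if it is brighter than the threshold."""
    return [[value > threshold for value in row] for row in grey]


def label_islets(land):
    """Group land pixels into 4-connected regions of (row, col) coordinates."""
    rows, cols = len(land), len(land[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not land[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited land pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and land[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def find_islets(grey, threshold, min_pixels):
    """Binarise, label regions, and drop regions smaller than min_pixels."""
    regions = label_islets(binarise(grey, threshold))
    return [region for region in regions if len(region) >= min_pixels]


# Made-up 5x6 "loch": low values are dark water, high values are land.
# The lone 90 in the top row stands in for a glint of sunlight.
TOY_LOCH = [
    [20, 22, 90, 21, 20, 19],
    [21, 20, 22, 20, 21, 20],
    [20, 150, 160, 20, 22, 21],
    [22, 155, 158, 21, 20, 20],
    [20, 21, 20, 22, 21, 20],
]

islets = find_islets(TOY_LOCH, threshold=50, min_pixels=2)
print(len(islets), sorted(islets[0]))
```

The threshold is the sensitivity the paragraph above describes. Set it too low (say 10) and the whole grid, water included, is classed as one block of land; set it too high (say 200) and the islet disappears. At 50 the glint pixel survives binarisation and has to be removed by the size filter, which is a crude stand-in for the misidentification problem.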

To cleanse the data of these misidentifications, Dr Karamitrou and her collaborators trained a machine learning algorithm called a Faster Region-based Convolutional Neural Network (Faster R-CNN, or RCNN for short) with negative and positive instances of islets based on existing data, eventually yielding an accuracy rate of 99%.
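For readers unfamiliar with the term, an accuracy rate is the share of labelled examples, positive and negative alike, that the algorithm classifies correctly. The sketch below shows that calculation on invented toy labels. The article does not say how the team measured accuracy, so the function names and the 200 example labels here are assumptions for illustration only, not the project's evaluation data.

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["tp"] += 1      # islet correctly identified
        elif not p and not a:
            counts["tn"] += 1      # non-islet correctly rejected
        elif p and not a:
            counts["fp"] += 1      # non-islet mistaken for an islet
        else:
            counts["fn"] += 1      # islet missed
    return counts


def accuracy(counts):
    """Fraction of all examples that were classified correctly."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no examples to score")
    return (counts["tp"] + counts["tn"]) / total


# Invented toy labels: 100 positive (islet) and 100 negative examples.
actual = [True] * 100 + [False] * 100
predicted = list(actual)
predicted[0] = False    # one islet missed
predicted[150] = True   # one patch of glint mistaken for an islet

counts = confusion_counts(predicted, actual)
print(counts, accuracy(counts))
```

With two mistakes in 200 toy examples the function returns 0.99. Keeping the four counts separate is useful because the same accuracy can hide very different mixes of missed islets and false alarms.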

This gave them a more or less complete list. “The algorithm identified around 7,000 islets in Scotland.” 

Deep learning identification of Scottish crannog sites.

Finally, the team trained the same RCNN algorithm with negative and positive instances of crannogs, allowing them to identify around 500 morphologically diverse candidates among the thousands of islets. 

All that remains is to verify them in the field. “You have to go there, visit the islet, to be sure that this is a crannog”.

At least, she adds, “that’s how it’s been done so far. But we are hoping through statistical analysis to see if there are any more patterns that these crannogs follow, so we can be able to identify them with artificial intelligence”.

How HPC helped

While HPC is often utilised for its prodigious processing power, Dr Karamitrou needed to exploit another of Iridis 5’s attributes: storage.

“It was a really big project. We had to use a lot of imagery, a lot of storage, and Iridis was amazing”.

Dr Alexandra Karamitrou

For reference, while a half-decent home computer might have between 500 gigabytes and 2 terabytes of storage, Iridis currently has 2.2 petabytes – 2,200 terabytes. And these numbers are set to be eclipsed when Iridis 6 comes online next January. 
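The gap is easy to quantify. The snippet below uses only the figures quoted above, converted to gigabytes with decimal units (1 terabyte = 1,000 gigabytes) so the arithmetic stays in whole numbers; the variable names are illustrative.

```python
GB_PER_TB = 1_000
TB_PER_PB = 1_000

# Figures quoted in the article, expressed in gigabytes.
home_small_gb = 500                    # 500 gigabytes
home_large_gb = 2 * GB_PER_TB          # 2 terabytes
iridis5_gb = 2_200 * GB_PER_TB         # 2.2 petabytes = 2,200 terabytes

# How many home computers' worth of storage Iridis 5 represents.
ratio_vs_small = iridis5_gb // home_small_gb
ratio_vs_large = iridis5_gb // home_large_gb

print(ratio_vs_small, ratio_vs_large)
```

On these figures, Iridis 5 holds as much as 1,100 of the larger home machines, or 4,400 of the smaller ones.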

But would the work even have been possible without access to an HPC cluster? In a word, no. “I tried to do it initially on my computer,” she says. 

“I ran out of disk space very quickly”.

In fact, even Iridis 5 struggled to keep up with the demands of the project, which involved storing and processing many terabytes of visual data. “I had two days of running time,” Dr Karamitrou explains. “That was my limit. And in many cases I had to break the data down into batches. I can’t imagine how long it would take on a laptop”. 
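Breaking the data into batches is a simple pattern: split the list of image tiles into fixed-size groups so that each group can be processed as a separate run inside the time limit. The sketch below shows one way to do that. The tile names and the batch size are hypothetical, and the article does not describe how jobs were submitted to Iridis 5, so that step is left out.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical tile names standing in for the real imagery files.
tiles = [f"tile_{i:04d}.tif" for i in range(10)]

for number, batch in enumerate(batched(tiles, 4)):
    # Each batch would be handed to one run that fits within the time limit.
    print(number, batch)
```

Ten tiles in batches of four give three runs of four, four and two tiles. In practice the batch size would be tuned so that the slowest batch still finishes comfortably inside the two-day limit.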

In the modern academic environment, where increasingly powerful HPC clusters are available to a growing number of researchers, it’s not worth finding out.

“You need to do it quickly and get the results”.

Accessing HPC

Like many University of Southampton academics who have made use of Iridis 5 in the past, Dr Karamitrou has a partial background in coding, having developed image processing algorithms for her PhD in geophysics. 

“I’m not a computer scientist,” she says, but thanks to a significant degree of relevant experience, a great deal of online research, and a one-day course laid on by the University of Southampton, she was able to implement the algorithms she needed for her research. 

Nevertheless, she acknowledges that a specialist coder might have made her research even faster and more effective.

“The code performed quite well, and it ran quickly, but I’m sure if I had the help of someone more expert it would have minimised the amount of time it took the code to run, and certainly I’d have had results more quickly, and maybe even better, if I had had help”.

Dr Alexandra Karamitrou

Crucially, this help now exists. 

In the last year the University of Southampton has acquired a dedicated team of HPC RSEs – Research Software Engineers whose role it is to enable academic researchers to take full advantage of HPC in their work, whatever their field of study and level of coding experience.
For more information, including details on how to get help from the HPC RSEs free of charge, see: https://rsg.southampton.ac.uk/hpc