Machine learning, and how it helps researchers make scientific discoveries much faster

What used to take months of painstaking research can now take mere hours, thanks to artificial intelligence that sifts through reams of complex data to reveal key information.

Jillian Buriak and her team spent years developing cost-effective plastic solar cells that can be printed like newspapers. Then she chatted with fellow chemistry researcher Arthur Mar, and in just a few weeks his machine learning team enabled her group to boost the efficiency of these solar cells by 30 per cent.

"That was a big wake-up call for us," said Buriak. "All kinds of scientific discoveries are starting to happen faster than they used to."

Common machine learning analyses

Regression: Building relationships between variables, like predicting crop yield based on rainfall.

Clustering: Grouping data points based on common features, like grouping people together based on whether or not they're wearing hats.

Feature extraction: Examining data and identifying the features that appear to affect an outcome, like identifying that people wearing a store's uniform are more likely to help you find a product.

Unsupervised: Letting the machine find structure in unlabelled data on its own, such as allowing it to decide whether to cluster people by uniforms or by hats.

Supervised: Training the machine on examples with known answers so it can predict those answers for new data, such as showing it people already labelled as staff or customers so it learns that uniforms, not hats, are what matter.
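
As a minimal sketch of the regression entry above, the following fits a straight line relating rainfall to crop yield and uses it to predict yield for a new rainfall value. The numbers are invented purely for illustration, and the helper name fit_line is hypothetical.

```python
# Hypothetical data, invented for illustration only:
# seasonal rainfall (mm) and crop yield (tonnes per hectare).
rainfall = [300.0, 400.0, 500.0, 600.0, 700.0]
crop_yield = [2.0, 2.5, 3.0, 3.5, 4.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (unnormalized; the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(rainfall, crop_yield)

# Predict the yield for a rainfall value that was not in the data.
predicted_yield = slope * 550.0 + intercept
print(f"Predicted yield at 550 mm: {predicted_yield:.2f} t/ha")
```

Because the answers (yields) are known for every training example, this is also a small instance of supervised learning.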

Machine learning is accelerating discoveries in countless areas of research, and Mar and his team are among the University of Alberta's many pioneers in the field.

They're not 'terminators'

Pop culture offers many ideas about what "machine learning" means, but to Mar it's just a set of tools.

"Our kind of machine learning is not terminators," he said with a laugh.

Machine learning sorts and categorizes complex sets of data to tease out useful information.

Mar explains: "If you needed help getting a heavy box off the top shelf at a store, you could analyze the people around you to see who would help. You could target people wearing the store uniform, and then you could rank them based on a relevant attribute like height. Machine learning will do similar clustering and ranking, but can handle a lot more information than any of us could process. It can also identify more relevant attributes: it could tell you that an employee's height is less important than their access to a ladder, and rank accordingly."
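
Mar's analogy can be sketched in a few lines of code. This is only a toy illustration: the people, their attributes and the field names are all invented, and the ranking rules are written by hand here, whereas a machine learning method would infer from data which attributes matter.

```python
# Hypothetical shoppers and staff; every name and value is made up.
people = [
    {"name": "Ana", "uniform": True, "height_cm": 160, "has_ladder": True},
    {"name": "Ben", "uniform": True, "height_cm": 190, "has_ladder": False},
    {"name": "Cy", "uniform": False, "height_cm": 200, "has_ladder": False},
    {"name": "Dee", "uniform": True, "height_cm": 175, "has_ladder": False},
]

# Step 1: target the group wearing the store uniform.
staff = [p for p in people if p["uniform"]]

# Step 2: rank that group by one attribute, tallest first.
height_ranking = [
    p["name"] for p in sorted(staff, key=lambda p: p["height_cm"], reverse=True)
]

# Step 3: rank again once a more relevant attribute is known.
# Ladder access now counts first and height only breaks ties.
ladder_ranking = [
    p["name"]
    for p in sorted(
        staff, key=lambda p: (p["has_ladder"], p["height_cm"]), reverse=True
    )
]

print(height_ranking)
print(ladder_ranking)
```

With height alone, the tallest employee ranks first. Once ladder access is taken into account, the shortest employee moves to the top.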

For Buriak's solar cells, the machine was given years of experimental lab data and programmed to look for different design variables that could affect the efficiency of an organic solar cell.

"Using the traditional method of changing one variable at a time, we'd have needed thousands of experiments to screen all those possible combinations," Buriak said. "The machine learning algorithm helped us understand which variables mattered most, and just 16 experiments later, we were on our way to systematically increasing the efficiency of solar cells in a dramatically accelerated fashion."
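
To give a rough sense of the arithmetic Buriak describes, the sketch below counts how many experiments an exhaustive grid would need and then ranks variables by how strongly each one tracks efficiency. Everything in it is an assumption made for illustration: the variable names, the number of levels, the toy efficiency formula and the use of simple correlation as the importance score. The article does not say which variables or which algorithm the team used.

```python
from itertools import product
from math import sqrt

# Hypothetical search space: 5 design variables, 4 settings each.
n_variables = 5
levels_per_variable = 4
full_grid_size = levels_per_variable ** n_variables  # every combination

# Hypothetical variable names; -1.0 and +1.0 stand for low and high settings.
names = ["anneal_temp", "donor_ratio", "spin_speed"]
runs = list(product([-1.0, 1.0], repeat=len(names)))


def toy_efficiency(a, b, c):
    """Made-up response standing in for measured lab data."""
    return 5.0 + 2.0 * a + 0.5 * b + 0.1 * c


efficiency = [toy_efficiency(*run) for run in runs]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Score each variable by the strength of its correlation with efficiency.
scores = {
    name: abs(pearson([run[i] for run in runs], efficiency))
    for i, name in enumerate(names)
}
ranking = sorted(scores, key=scores.get, reverse=True)

print(f"Exhaustive grid: {full_grid_size} experiments")
print("Variables ranked by influence:", ranking)
```

Knowing the ranking lets a team spend its next experiments on the variables at the top of the list instead of stepping through the whole grid.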

You only need a laptop

Engineering professors Arvind Rajendran, Vinay Prasad and Zukui Li lead a team using machine learning to optimize processes for capturing CO2 before it can be emitted from power plants.

"Our carbon capture process could have 9,000 different configurations per material used," Prasad said. "We need to know which potential adsorbent is most effective in which configuration."

Machine learning allows the team to quickly eliminate thousands of possible configurations that could never meet the U.S. Department of Energy's requirement for carbon capture technology to remove 95 per cent of CO2 from emissions.

"Individually modelling each of those configurations would require immense computing power over months," Prasad pointed out. "With machine learning and a limited amount of training data from detailed simulations, we only need a laptop and a few hours."
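
The screening idea Prasad describes can be sketched as follows: run the slow, detailed simulation on a small sample of configurations, train a cheap stand-in model on those results, and use the stand-in to rule out configurations unlikely to reach 95 per cent capture. This is a sketch under stated assumptions, not the team's method. The two process settings, the toy simulation formula and the nearest-neighbour stand-in model are all invented here; only the 9,000 configurations and the 95 per cent target come from the article.

```python
# Hypothetical process settings: 90 pressure steps x 100 cycle-time steps.
configs = [(p, t) for p in range(90) for t in range(100)]  # 9,000 in total

TARGET_CAPTURE = 95.0  # per cent of CO2 removed


def detailed_simulation(p, t):
    """Made-up formula standing in for a slow, detailed process model."""
    return 100.0 - 0.1 * p - 0.05 * t


# Run the "expensive" simulation on a coarse sample only (90 of 9,000).
training = [
    ((p, t), detailed_simulation(p, t))
    for p in range(0, 90, 10)
    for t in range(0, 100, 10)
]


def surrogate(p, t):
    """Cheap stand-in: reuse the result of the nearest simulated config."""
    _, capture = min(
        training,
        key=lambda item: (item[0][0] - p) ** 2 + (item[0][1] - t) ** 2,
    )
    return capture


# Screen every configuration with the cheap model.
kept = [c for c in configs if surrogate(*c) >= TARGET_CAPTURE]
eliminated = [c for c in configs if surrogate(*c) < TARGET_CAPTURE]

print(f"Detailed simulations run: {len(training)}")
print(f"Kept {len(kept)} configurations, eliminated {len(eliminated)}")
```

Only the configurations that survive the screen would then be worth the cost of a full detailed simulation.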

The benefits of machine learning have been noticed by experts in many disciplines. In August, Mar's group partnered with Prasad's team to offer two do-it-yourself machine learning workshops to researchers affiliated with the U of A's Future Energy Systems research initiative. Both were sold out before they were advertised, with participants including physicists, microbiologists, economists, and even administrators. More workshops are now being considered, and Prasad is offering a special graduate course on the subject.

"We've used these techniques to analyze everything from the monitoring of oilsands tailings ponds to the qualities of grain that will make popular beer," he said. "If you have data, machine learning is a tool that can help you focus your efforts."

Not replacing people

From Buriak's perspective, the rise of machine learning is a necessary shake-up for research in many fields, and her team is taking full advantage.

"Using these techniques, we're in the process of developing some truly new solar power systems," she said. "We're on track to share those technologies in the near term."

She doesn't assign any dates to the near term, but the discoveries will certainly happen sooner than if her team had stuck with traditional methods.

To Mar, that's the point.

"We're saving time and money by reducing the number of experiments needed to get to a discovery," he said. "We're not replacing the people doing the experiments just yet."