Machine learning clearly interprets gene regulation

In the era of big data, artificial intelligence (AI) has become a valuable ally for scientists. For example, machine learning algorithms help biologists make sense of the dizzying number of molecular signals that control gene function.

But as new algorithms are developed to analyze ever more data, they also become more complex and harder to interpret. Quantitative biologists Justin B. Keeney and Omar Tarin have a strategy for designing sophisticated machine learning algorithms that biologists can understand more easily.

The algorithms are a type of artificial neural network (ANN). Inspired by the way neurons connect and branch in the brain, ANNs form the basis of sophisticated machine learning. And despite their name, ANNs are not used exclusively for brain research.

Cells don’t need every protein all the time. Instead, they rely on complex molecular mechanisms to switch the genes that produce proteins on or off as needed. When this regulation fails, disorder and disease usually follow.

Unfortunately, the way a standard ANN is shaped by data from massively parallel reporter assays (MPRAs) is very different from the way scientists ask questions in the life sciences. This mismatch makes it difficult for biologists to interpret what the network has learned about how gene regulation works.

Now the researchers have developed a new approach that bridges the gap between computational tools and the way biologists think.

They have created custom ANNs that reflect mathematical concepts commonly used in biology to describe genes and the molecules that control them. In this way, the pair essentially forces their machine learning algorithms to process data in a way that biologists can understand.
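As a rough illustration of what a biologically interpretable network might look like, the sketch below is a minimal toy model and not the researchers' actual architecture. It assumes a single-layer network whose weights form an additive binding-energy matrix and whose output is a two-state thermodynamic occupancy. The names `energy_matrix`, `binding_energy`, and `occupancy`, the four-base example site, and the energy values are all hypothetical.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a (length, 4) one-hot array."""
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, BASES.index(base)] = 1.0
    return x

def binding_energy(seq, energy_matrix):
    """Additive model: sum the matrix entry for the base at each position."""
    return float(np.sum(one_hot(seq) * energy_matrix))

def occupancy(seq, energy_matrix, chemical_potential=0.0):
    """Two-state thermodynamic occupancy, with energies in units of kT.

    Mathematically this is a sigmoid applied to a linear layer, so it is
    a (very small) neural network whose weights read directly as energies.
    """
    energy = binding_energy(seq, energy_matrix)
    return 1.0 / (1.0 + np.exp(energy - chemical_potential))

# Hypothetical 4-position site: the preferred base costs 0 kT, any other costs 2 kT.
energy_matrix = np.full((4, 4), 2.0)
for pos, base in enumerate("TATA"):
    energy_matrix[pos, BASES.index(base)] = 0.0

print(occupancy("TATA", energy_matrix))  # perfect match: occupancy 0.5
print(occupancy("GATA", energy_matrix))  # one mismatch lowers the occupancy
```

Because each weight in this toy model is an energy contribution from one base at one position, a biologist could read the fitted parameters directly, which is the kind of interpretability the passage describes.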

This effort, Keeney explained, highlights how modern industrial AI technology can be optimized for use in life sciences.

Keeney has tested this new strategy for creating custom ANNs, using it to study a variety of biological systems, including key gene circuits involved in human disease.