Q: What motivated you to study microbes in extreme environments, and what are the challenges in studying them?
A: Extreme environments are great places to look for interesting biology. Growing up, I wanted to be an astronaut, and the closest thing to astrobiology is the study of extreme environments on Earth. And the only things that live in these extreme environments are microbes. During a sampling expedition I took part in off the coast of Mexico, we discovered a colourful mat of microbes about two kilometers underwater that thrived because the bacteria breathed sulfur instead of oxygen, but none of the microbes I hoped to study would grow in the lab.
The biggest challenge in studying microbes is that the majority of them can’t be cultured, meaning the only way to study their biology is through a technique called metagenomics. My most recent work is on genomic language modeling. We hope to develop a computational system that will allow us to study the organism “in silico” as much as possible, using sequence data alone. A genomic language model is technically a large language model, except that instead of human language, the language is DNA. It is trained in a similar way, only on biological language rather than English or French. If our goal is to learn the language of biology, we should take advantage of the diversity of microbial genomes. Although we now have a lot of data and more samples are becoming available, we have only just scratched the surface of microbial diversity.
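To make the "language is DNA" idea concrete, here is a minimal Python sketch of how a DNA sequence could be turned into tokens and next-token training examples, the same kind of input a text language model sees. It is an illustration only: the fixed-length k-mer tokenization, the function names (`build_vocab`, `tokenize`, `next_token_pairs`) and the example sequence are all assumptions, not the method used in the work described above, and real genomic language models may tokenize at the nucleotide, gene or protein level instead.

```python
from itertools import product


def build_vocab(k):
    """Map every possible DNA k-mer over A, C, G, T to an integer id."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def tokenize(seq, k, vocab):
    """Split a sequence into non-overlapping k-mers and return their ids.

    K-mers containing letters outside A/C/G/T (for example N) are skipped,
    and a trailing fragment shorter than k is ignored.
    """
    seq = seq.upper()
    tokens = []
    for i in range(0, len(seq) - k + 1, k):
        kmer = seq[i:i + k]
        if kmer in vocab:
            tokens.append(vocab[kmer])
    return tokens


def next_token_pairs(tokens, context):
    """Build (context, next token) examples, as in next-token prediction."""
    return [
        (tokens[i:i + context], tokens[i + context])
        for i in range(len(tokens) - context)
    ]


vocab = build_vocab(3)                   # 4**3 = 64 possible 3-mers
tokens = tokenize("ATGAAATAG", 3, vocab)  # made-up example sequence
pairs = next_token_pairs(tokens, context=2)
```

A model trained on such pairs learns which "words" tend to follow which, just as a text model does; the difference is only in the alphabet and the corpus.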
Q: How can studying microbes in silico using genomic language modeling improve our understanding of the microbial genome, given how diverse microbes are and how little we know about them?
A: A genome consists of many tens of millions of letters. It's impossible for a human to look at this and make sense of it. However, we can program a machine to segment the data into useful pieces. This is how bioinformatics works with a single genome. But when you look at a gram of soil, which may contain hundreds of unique genomes, that's just too much data to work with; it takes a human and a computer together to deal with that data.
During my doctoral and master's studies, we were just discovering new genomes and new lineages that were so different from anything that had been characterised or grown in the laboratory. These were things we simply called “microbial dark matter.” When there are a lot of uncharacterized things, machine learning can be really useful because we're just looking for patterns, but that's not the end goal. We hope to map these patterns to evolutionary relationships between every genome, every microbe, and every instance of life.
So far we have thought of proteins as standalone entities. This gets us to a decent level of knowledge, since proteins are related to one another by homology, and therefore evolutionarily related proteins can have similar functions.
What is known in microbiology is that proteins are encoded in genomes, and the context in which a protein is found (which regions lie before and after it) is evolutionarily conserved, especially when functional coupling exists. This makes perfect sense because if you have three proteins that have to be expressed together because they form a unit, you want to put them right next to one another.
What I would like to do is incorporate more of this genomic context into the way we search for and annotate proteins and understand their function, so that we can go beyond sequence or structural similarity and add contextual information to how we understand proteins and hypothesize about their functions.
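As a rough illustration of what "genomic context" means computationally, the Python sketch below takes genomes represented as ordered lists of gene-family labels and counts how often each family appears next to a target family. Everything here is hypothetical: the labels (`famA`, `famB`, ...), the function names and the toy genomes are invented for the example, and the sketch ignores strand, contig boundaries and sequence similarity. It is not the annotation approach described above, only a simple counting baseline for the same intuition.

```python
from collections import Counter


def context_window(genes, index, flank=2):
    """Return the genes lying before and after position `index`."""
    start = max(0, index - flank)
    upstream = genes[start:index]
    downstream = genes[index + 1:index + 1 + flank]
    return upstream, downstream


def conserved_neighbors(genomes, target, flank=2):
    """Count in how many genomes each gene family occurs near `target`."""
    counts = Counter()
    for genes in genomes:
        seen = set()  # count each neighbor at most once per genome
        for i, gene in enumerate(genes):
            if gene == target:
                upstream, downstream = context_window(genes, i, flank)
                seen.update(upstream)
                seen.update(downstream)
        seen.discard(target)
        counts.update(seen)
    return counts


# Toy genomes: each list is the gene order along one (hypothetical) genome.
genomes = [
    ["famA", "famB", "famC", "famX"],
    ["famY", "famA", "famB", "famC"],
    ["famB", "famZ", "famC"],
]
neighbors = conserved_neighbors(genomes, "famB", flank=1)
```

Families that keep showing up beside the target across distantly related genomes are candidates for functional coupling, which is the signal a context-aware model would aim to learn directly from sequence.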
Q: How can your research be used to take advantage of the functional potential of microbes?
A: Microbes are possibly the best chemists on the planet. Harnessing microbial metabolism and biochemistry will lead to more sustainable and efficient methods for producing new materials, new therapeutics and new types of polymers.
But it's not just about efficiency: microbes do chemistry that we can't even think of. As we think about how our world and climate are changing, it will also be very important to understand how microbes work and to understand their genomic structure and functionality. Much of the carbon sequestration and nutrient cycling is carried out by microbes; if we don't understand how a given microbe can fix nitrogen or carbon, we will have difficulty modeling Earth's nutrient flows.
On the more therapeutic side, infectious diseases represent a real and growing threat. As we think about the future and how to combat microbial pathogens, it is really important to understand how microbes behave in different environments compared to the rest of our microbiome.

