A large part of our current work focuses on discerning the roles played by various cellular biomolecules in regulating gene expression, especially at the transcriptional level. Specifically, we develop methods to gain biological insights from large-scale genome-level data.
Advances in high-throughput sequencing technologies have led to the development of novel sequencing-based assays that measure various biochemical activities along the genome. Some examples are ChIP-seq (which profiles protein-bound regions), ATAC-seq (which identifies open chromatin), STARR-seq (which assesses the enhancer activity of DNA), and GRO-seq (which measures nascent transcripts), and the list continues to grow. These experiments typically identify thousands of regions displaying a certain kind of activity. Our group designs algorithms to learn the underlying sequence components within those regions that may be responsible for that activity. See Publications and related Software for more details.
We are also interested in developing statistical algorithms to learn from large, heterogeneous datasets in other domains, such as healthcare.
We are grateful for funding from IISER Pune, the Department of Biotechnology (DBT) India, and the Bill and Melinda Gates Foundation (BMGF). In the past we have been funded by CSIR-NCL, SERB India, and Wellcome Trust-DBT IA.
We have open positions under a DBT-funded project on using machine learning for regulatory genomics. (01 September 2021)
Anushua's paper on deciphering protein-DNA footprints from ChIP-exo experiments is published in Bioinformatics. (01 June 2021)