IGVF (Impact of Genomic Variation on Function) is an NIH-funded consortium that brings together laboratories generating complex data types with novel experimental assays, often focused on gene expression at the single-cell level. This work is extended and regularized by laboratories that integrate these data through computational analyses to discover the associations and networks linking human variation, chromosomal elements, and molecular phenotypes, with the goal of elucidating their complex relationships in human cells and tissues. The DACC enhances the data created by the consortium by establishing structured procedures for verifying and validating all submitted data and by providing processes for documenting the metadata that describe each biological sample and assay method. To facilitate access to all of these data, the DACC will construct a state-of-the-art data warehouse, design and develop robust software for data submission, and harden unified data processing pipelines. All experimental and computational results will be made available via the IGVF Portal, developed by the DACC. The DACC is composed of labs from Stanford, Washington University in St. Louis, Northwestern, and Yale. The power of these human genomic analyses cannot be realized until they have been integrated into standardized and reproducible collections of observations, made available with defined metadata and robust tools.

The Cherry lab applies meticulous curation to the validation of very large, high-quality human genome datasets from the IGVF project. Our research defines and applies rigorous methods to create these resources and to distribute this information for the discovery of new knowledge. All of these datasets are integrated via their metadata into a system distributed through a public web resource, the IGVF Data Portal, developed collaboratively by the Cherry lab and the Washington University teams. Making this resource useful requires software methods that serve all biomedical researchers, from computational biologists to clinical researchers. Our challenge is to develop tools that give all users access to these data with complete metadata and at the resolution they need. The result of our work will be a truly useful online research resource for human genomics that meets user needs for single-gene studies and summarizes many types of information, such as regulatory elements and sequence variation. Our partners at Washington University and Northwestern focus on data visualization and the administrative needs of the consortium, while Yale works with Stanford on our interactions with computational working groups and on network models.

Funding is provided by the US National Institutes of Health, National Human Genome Research Institute, via grant HG012012.