Accelerating Translational Research across the RCMI Translational Research Network (RTRN) by providing integrated tools and an environment for sharing and mining large datasets

The RCMI Translational Research Network (RTRN) was recently awarded an ARRA Supplement Award from the National Institutes of Health (NIH).

The purpose of this supplement award is to enable RTRN to further support Data Technology Coordinating Center (DTCC) infrastructure improvements that will enhance its investigators’ capacity to conduct translational research using large datasets, an important complement to accelerating health advances from bedside to communities. Currently, RTRN investigators have access to select large datasets within their own institutions; however, the array of datasets they can access is limited, the rules governing access vary, and interfaces are not standardized. In addition, the varied sources and sheer quantity of data make it challenging to locate and analyze the data needed to address a specific research problem.

The enhanced infrastructure will provide RTRN with access, tools, and services for mining, manipulating, and analyzing large datasets, as well as feasibility testing and customized reporting through the DTCC. DTCC will be able to provide not only the new collaboratory technology, but also customized data services such as preliminary statistical feasibility testing, frequency reporting, and automated contributors’ reports. RTRN investigators will have access to these services as well as web-based access to powerful tools for efficiently querying, integrating, and analyzing large datasets. Existing open source tools will be adopted and/or adapted as much as possible, with DTCC programming and bioinformatics teams developing additional tools to meet the needs of the RTRN community. DTCC will deploy a common user interface that will enable RTRN investigators to retrieve, sort, and manage data regardless of where the data is housed.

Benefits to RTRN Investigators:

  • The time and cost from idea to research outcome will be reduced, expediting translational research and increasing capacity to work in a collaborative virtual environment.

  • Non-regulated datasets will be available for general use within 12 months of project initiation.

  • A web-based tool with “single-point querying” will be available for searching databases, collaborator profiles, and publications, allowing researchers to identify potential support and further expediting the translation of ideas into discovery.

  • RTRN investigators will have access to training programs through the DTCC to ensure optimal use of the dataset retrieval system.

  • Trained research assistants will be available to provide full support in areas of bioinformatics, statistical programming and system administration.

Impact on Research

These efforts will enable RTRN to continue fostering research synergies that support interdisciplinary interaction and multi-site translational research. In addition, the Network will be better positioned to support the development of a new generation of scientists interested in further enhancing their skills, understanding, and initiatives grounded in multidisciplinary and translational research by using the large datasets that can be shared across the RTRN community.

T1T2 – Large Data Sets Project Status

  • Pre-NOGA (July 17th - recommendation to fund)

    • Began preliminary assessment of RTRN needs, stakeholder interest, and existing solutions (no decisions, just gaining information)

    • Discussions with

      • JSU BICB leadership

      • RTRN informatics membership

      • Meharry

      • UMich

      • Harvard

  • NOGA (Sept 24th - notification to CDU)

    • Follow-up discussions with BICBWG leadership

      • Request that the RTRN community submit projects and indicate needs (not a survey, just a simple request).

      • DTCC to develop tools and/or package and present existing useful tools for the RTRN community

  • Proactively engaged in the early planning/needs assessment phase

    • Planning meeting with the University of Michigan, members of NCIBI and the RTRN Informatics Groups at Morehouse School of Medicine (November 2009)

      • Identified PROTEOMICS as the first possible dataset, with plans to expand to other dataset types

    • Regularly scheduled planning and operations meetings (11 teleconferences to date)

      • Webinar with RTRN Informatics Groups (November 2009)