Collaborative Data Services

Shared Resources

The Collaborative Data Services Core (CDSC) is expressly designed to facilitate research use of patient data, leveraging Moffitt Cancer Center’s (MCC) Health and Research Informatics (HRI) platform, an enterprise-wide data warehouse containing discrete data on more than 550,000 patients. The CDSC is a unique shared resource that helps drive innovation across all Cancer Center Support Grant (CCSG) research programs by serving as a gateway to MCC's robust data assets, spanning clinical, administrative, patient-reported, biospecimen, and molecular domains. The CDSC supports members with four primary services: study design consultation, provisioning of patient-level data, study-specific medical record abstraction, and training to enable self-service queries. These services are designed to advance three specific aims:

1) Assist investigators in the early stages of developing research projects with consultations on study design and project feasibility, including identification of patient cohorts based on detailed inclusion and exclusion criteria.

Basic, clinical, population and quantitative scientists are often unfamiliar with the breadth and depth of patient data available for answering a wide range of research questions. Furthermore, employing novel study designs to optimize the use of discrete data in combination with manually abstracted data is a skill set that not all investigators have. Faculty may request CDSC consultation to identify existing sources of data available to answer a specific research question and define a study design (retrospective cohort, nested case-control, etc.) that best leverages the existing data with or without manual chart abstraction. In addition to study design consultation, investigators may obtain sample size estimates for use in statistical power calculations. Given that much of this work is conducted during the preparation of grant applications (i.e., prior to funding), the CDSC provides the first two hours of consultation free of charge.

2) Promote and facilitate cutting-edge translational research by providing members with access to high-quality discretely captured patient-level data from a variety of source systems, sometimes combined with clinical data that are manually and cost-effectively abstracted from patient medical records.

Discrete data are available for Moffitt patients from a variety of sources, including the Cancer Registry, the electronic medical record, patient-reported medical histories and risk factor profiles, billing and procedure codes, and the laboratory information and clinical trials management systems. Collectively, these sources provide information on patient demographics, cancer screening procedures, cancer diagnosis and staging, chemotherapy, radiation therapy, surgery, immunotherapy, comorbidities, medication use, lifestyle and other cancer risk factors, quality of life, vital status, consent to research protocols, and availability of biospecimens and associated molecular data. These data are all discretely available in HRI and are often adequate for identifying cohorts of patients with biospecimens for basic science research and for generating hypotheses regarding novel predictors of survival. For other types of research requiring more detailed information on treatment response or disease progression, manual chart abstraction is often necessary. The CDSC provides cost-effective abstraction, billed as an hourly chargeback. The abstractors are trained to abstract only those data that are not already available in HRI; they follow standardized definitions when available and apply quality control procedures. The investigator is provided with a final analytic data set that combines the data discretely available through HRI with the manually abstracted data.
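As a rough illustration of how a final analytic data set might combine discrete and abstracted data, the sketch below merges the two on a patient identifier. The field names, the `ABSTRACTED_FIELDS` list, and the `build_analytic_dataset` helper are hypothetical and invented for this example; they do not describe HRI's actual schema or the CDSC's tooling.

```python
# Hypothetical example data; identifiers and field names are invented.
discrete_rows = [
    {"patient_id": "P1", "stage": "II", "vital_status": "alive"},
    {"patient_id": "P2", "stage": "III", "vital_status": "deceased"},
]

# Manually abstracted values, keyed by patient ID (P2 has only one field abstracted).
abstracted_by_id = {
    "P1": {"treatment_response": "partial", "progression_date": "2021-03-14"},
    "P2": {"treatment_response": "complete"},
}

# Fields that are assumed to come only from chart abstraction.
ABSTRACTED_FIELDS = ("treatment_response", "progression_date")


def build_analytic_dataset(discrete_rows, abstracted_by_id):
    """Return one combined record per patient in the discrete data."""
    dataset = []
    for row in discrete_rows:
        abstracted = abstracted_by_id.get(row["patient_id"], {})
        merged = dict(row)
        for field in ABSTRACTED_FIELDS:
            # Discrete values take precedence: abstraction only fills
            # fields that are not already present in the discrete record.
            if field not in merged:
                merged[field] = abstracted.get(field)  # None if not abstracted
        dataset.append(merged)
    return dataset


analytic = build_analytic_dataset(discrete_rows, abstracted_by_id)
```

In this sketch, a field that was never abstracted for a patient is kept as an explicit missing value rather than dropped, so every record in the combined data set has the same columns.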

3) Democratize access to de-identified data by providing individual or small group training on the use of self-service querying tools and querying best practices.

To empower researchers to query HRI directly, Moffitt has invested in the development and maintenance of a self-service querying tool that provides a de-identified view of Moffitt patient data. Obtaining accurate counts of patients who meet specific eligibility criteria requires knowledge of the querying tool itself, as well as knowledge of the underlying data sources and best practices for querying data across systems. For example, the order of applied filters can impact the sample size estimates, as data from different systems are not always available for all patients. Knowledge of these nuances improves the quality of the queries. As such, the CDSC offers individual and group-level training on querying tool use, along with subject matter expertise in specific source systems and associated best practices in querying.
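The effect of filter order can be sketched with a toy example. It assumes a hypothetical tool in which the first filter defines the starting population and a later exclusion removes only patients who have a matching record. The patient identifiers, source names, and filter semantics below are illustrative assumptions, not the behavior of Moffitt's querying tool.

```python
# Hypothetical de-identified patient IDs; all values are invented.
# Patients with a lung cancer diagnosis in the registry.
registry_lung = {"P1", "P2", "P3", "P4", "P5"}

# Smoking status from a patient questionnaire. P4 and P5 never completed it,
# and P6 completed it but is not in the registry cohort.
questionnaire = {"P1": "never", "P2": "current", "P3": "never", "P6": "never"}


def registry_first(registry, survey):
    """Start from the registry cohort, then exclude known current smokers.

    Patients with no questionnaire on file are retained, because there is
    no record that would trigger the exclusion.
    """
    return {p for p in registry if survey.get(p) != "current"}


def questionnaire_first(registry, survey):
    """Start from patients who reported never smoking, then restrict to
    the registry cohort. Patients with no questionnaire are never included.
    """
    never_smokers = {p for p, status in survey.items() if status == "never"}
    return never_smokers & registry


count_a = len(registry_first(registry_lung, questionnaire))
count_b = len(questionnaire_first(registry_lung, questionnaire))
print(count_a, count_b)  # prints "4 2"
```

Both queries aim at lung cancer patients who are not current smokers, yet they return different counts because two patients have no questionnaire data. Deciding how patients with missing data should be treated is the kind of choice that training on the source systems is meant to inform.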


Scientific Director:
Issam El Naqa
Core Manager:
Rodrigo Carvajal

This work has been supported in part by the Collaborative Data Services Core at the H. Lee Moffitt Cancer Center & Research Institute, a comprehensive cancer center designated by the National Cancer Institute and funded in part by Moffitt’s Cancer Center Support Grant (P30-CA076292).