SCHARE Platform Components


The SCHARE platform is built on a set of established components (Google Cloud Platform, Terra and GitHub) used in flagship scientific projects at NIH.

SCHARE’s cloud-based platform contains:

Datasets relevant to health disparities, health care delivery, and health outcomes research, including social determinants of health and other social science behavioral data.
A project data repository for NIH-funded projects centered on Core Common Data Elements for enhanced data interoperability and compliance with NIH Data Management and Sharing policy.
Secure, collaborative workspaces for researchers and relevant collaborators.
Computational capabilities for collaboratively evaluating, designing, and assessing fit-for-purpose utilization of datasets and algorithms to generate AI models that are effective and efficient.

Access the SCHARE platform

Registration is required to access the SCHARE platform (learn more and register).

Datasets

STATUS: The SCHARE Datasets collection is accessible to all SCHARE-registered researchers. New datasets are being actively added.

SCHARE Datasets list

On SCHARE, researchers can access, link, analyze, and export a wealth of datasets relevant to research in health disparities and health care outcomes, including:

Public Datasets: publicly accessible, federated, de-identified datasets hosted by SCHARE or hosted by Google through the Google Cloud Public Dataset Program
- Examples: American Community Survey (ACS), Behavioral Risk Factor Surveillance System (BRFSS)
Project Datasets: publicly accessible and controlled-access, funded program/project datasets using Common Data Elements and shared by NIH grantees and intramural investigators to comply with the NIH Data Management and Sharing policy
- Examples: Forthcoming datasets such as the Jackson Heart Study (JHS)
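
As a rough illustration of what linking two datasets can look like in practice, the sketch below joins individual-level survey rows to area-level rows on a shared geographic key. It is a minimal sketch, not SCHARE's own tooling: the field names (`county_fips`, `has_checkup`, `median_income_band`), the helper `link_on_key`, and every value are hypothetical placeholders, not real ACS or BRFSS data.

```python
from typing import Dict, List


def link_on_key(left: List[Dict], right: List[Dict], key: str) -> List[Dict]:
    """Inner-join two lists of row dictionaries on a shared key."""
    # Index the right-hand table once so each lookup is constant time.
    index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            # Merge the two rows; the shared key appears once.
            linked.append({**row, **match})
    return linked


# Toy rows only: placeholder identifiers and values (assumptions).
survey_rows = [
    {"county_fips": "00001", "respondent_id": "r1", "has_checkup": True},
    {"county_fips": "00002", "respondent_id": "r2", "has_checkup": False},
    {"county_fips": "00009", "respondent_id": "r3", "has_checkup": True},
]
area_rows = [
    {"county_fips": "00001", "median_income_band": "low"},
    {"county_fips": "00002", "median_income_band": "high"},
]

linked = link_on_key(survey_rows, area_rows, "county_fips")
for row in linked:
    print(row)
```

Rows without a match in the area-level table (here, respondent `r3`) are dropped, which is the usual inner-join behavior. A real analysis would need to decide explicitly whether to keep them.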

Datasets are grouped by these categories:

Social Determinants of Health (SDOH; CDC categories: Economic Stability, Education Access and Quality, Health Care Access and Quality, Neighborhood and Built Environment, Social and Community Context) and Health Behaviors
Diseases and Conditions
Clinical and EHR Data (coming)

Learn more about SCHARE Datasets and access the SCHARE platform (registration required).

SCHARE/PhenX Core Common Data Elements

STATUS: The SCHARE/PhenX Core Common Data Elements are available to all researchers through the National Library of Medicine.

Endorsed by the National Institutes of Health, the SCHARE/PhenX Core Common Data Elements (CCDEs) are standardized questions and responses that can be used across different studies to ensure consistent data collection and facilitate interoperability. CCDEs enable researchers to efficiently design data collection, management, and analysis plans; link data from different sources; and enable data harmonization to generate large datasets for AI use.
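
To make the interoperability idea concrete, here is a minimal sketch of how a shared data element lets records from two studies be pooled. Everything in it is an assumption made for illustration: the element name `education_level`, its response codes, the two study mappings, and the toy records are hypothetical and are not taken from the actual SCHARE/PhenX CCDEs.

```python
from typing import Dict, List, Optional

# Hypothetical common data element: one shared variable name with a
# fixed set of permissible response codes (not an actual CCDE).
COMMON_ELEMENT = "education_level"
PERMISSIBLE_VALUES = {
    1: "Less than high school",
    2: "High school or equivalent",
    3: "College degree or higher",
}

# Hypothetical per-study mappings from each study's local field name
# and local response labels to the common element's codes.
STUDY_MAPPINGS = {
    "study_a": {"field": "educ", "values": {"<HS": 1, "HS": 2, "College+": 3}},
    "study_b": {"field": "schooling", "values": {"none": 1, "secondary": 2, "tertiary": 3}},
}


def harmonize(study: str, records: List[Dict]) -> List[Dict]:
    """Recode one study's records into the common element's codes."""
    mapping = STUDY_MAPPINGS[study]
    harmonized = []
    for rec in records:
        local_value = rec.get(mapping["field"])
        # Unmapped or missing responses become None instead of a guess.
        code: Optional[int] = mapping["values"].get(local_value)
        harmonized.append(
            {"study": study, "participant_id": rec["id"], COMMON_ELEMENT: code}
        )
    return harmonized


# Toy records with placeholder values (assumptions).
study_a = [{"id": "a1", "educ": "HS"}, {"id": "a2", "educ": "College+"}]
study_b = [{"id": "b1", "schooling": "tertiary"}, {"id": "b2", "schooling": "unknown"}]

pooled = harmonize("study_a", study_a) + harmonize("study_b", study_b)
for row in pooled:
    print(row)
```

When studies collect a common element directly, the per-study mapping step disappears and the records can be pooled as they are. That is the efficiency gain standardized questions and responses are meant to provide.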

Learn more about the SCHARE Core Common Data Elements.

Data Repository

STATUS: The SCHARE Data Repository is available to SCHARE-registered researchers.

The SCHARE Repository enables researchers to meet the requirements of the NIH Data Management and Sharing policy, which requires the hosting, management, and sharing of data generated by NIH-funded research programs. SCHARE provides a repository for projects focused on population science topics, such as health disparities and public health outcomes. All SCHARE-registered users (including NIH-based researchers, external researchers, and the public) can access data within the repository at varying privacy and security levels through the controlled-access process. The SCHARE Repository uses core common data elements to facilitate data aggregation for AI development that optimizes public health scientific knowledge discoveries and generates tools to monitor and improve health outcomes.

Access the SCHARE Data Repository.

Collaborative Workspaces

STATUS: The SCHARE Collaborative Workspaces are available to all SCHARE-registered researchers.

SCHARE is powered by Terra, an open-source data analysis platform based on Google Cloud Platform. Terra was developed by the Broad Institute of MIT and Harvard in collaboration with Microsoft and Verily.

Using SCHARE’s Terra resources, researchers and their collaborators can access and cross-link the same publicly available or controlled-access data. They can also create secure online spaces for collaboratively running large-scale analyses and sharing reproducible results and resources.

SCHARE supports interactive analysis tools such as Jupyter notebooks. Jupyter notebooks are human-readable executable documents that can be run to perform advanced data analyses, including artificial intelligence and machine learning tasks, using coding languages such as Python and R. The platform also supports Dockstore as a repository for Docker-based analysis workflows that allow users to automate basic steps in their analyses.

Register for SCHARE
Explore the SCHARE Terra Workspace, or create your own workspace using the tutorials accessible from our Tutorials and Resources page.

SCHARE-HEAN NAIRR Pilot Project

STATUS: The SCHARE-HEAN NAIRR Pilot Project is active.

The National Artificial Intelligence Research Resource (NAIRR) is a vision for a shared national research infrastructure for responsible discovery and innovation in AI. The NAIRR pilot will run for two years, beginning January 24, 2024. The pilot broadly supports fundamental, translational and use-inspired AI-related research with particular emphasis on societal challenges and use adoption.

To support these efforts, the SCHARE-HEAN (a.k.a. Multiple Chronic Diseases Disparities Research Consortium) Pilot Project forms a unique collaborative relationship between community partners, academia, and SCHARE to use big data and cloud computing data science analytics to increase the prevention, treatment, and management of multiple chronic diseases, such as diabetes, obesity, hypertension, coronary heart disease, congestive heart failure, chronic kidney disease, stroke, and certain cancers. The data warehouse includes chronic disease, ascribed and acquired attributes, and relevant environmental and living conditions data, which is mapped to the SCHARE common data elements for increased data interoperability, and highlighted in Think-a-Thons to democratize data use adoption.

Learn more about the SCHARE-HEAN NAIRR Pilot Project.

Page updated March 12, 2025 | created Jan. 18, 2023
