Translational Data Tools

We build data infrastructure, applications and data products to support translational use of patient data in a secure, scalable, collaborative way to bridge the gap between the clinic and research.

To do this, we partner with departments across Fred Hutch to coordinate efforts and build solutions that scale. This work includes aligning our technology resources with the needs of the data community and connecting subject matter experts in areas such as technology, security and compliance to enable ethical use of data resources at Fred Hutch.

All documentation about Data Science Lab Tools and best practices can also be found on the Fred Hutch Biomedical Data Science Wiki, or you can join our Translational Data Noon Community Calls to learn about upcoming resources.

Applications and Tools

PROOF

PROOF (PRoduction On-ramp for Optimization and Feasibility) is a user-friendly tool designed for managing and executing WDL (Workflow Description Language) workflows using the Cromwell workflow manager, configured to run on the Fred Hutch cluster. PROOF allows users to:

Automate all the backend Cromwell configuration necessary to run their workflows right away.
Validate, troubleshoot, assess performance, and run their workflows all under one roof.
Refine their workflows before potential transitions to cloud-based infrastructures, providing a “proofing” resource of sorts.
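
For readers unfamiliar with what sits underneath a tool like PROOF, the following is a minimal sketch of how a WDL workflow and its inputs could be packaged for a Cromwell server's REST API. It does not describe how PROOF itself is implemented. The server address, the `hello` workflow, and the helper names `build_submission` and `status_url` are all hypothetical and exist only for illustration; the endpoint paths and form field names are those of Cromwell's public REST API. No network request is made.

```python
import json

# Hypothetical address; a real Cromwell server URL depends on your setup.
CROMWELL_URL = "http://localhost:8000"

# A tiny illustrative WDL 1.0 workflow (not taken from PROOF documentation).
HELLO_WDL = """\
version 1.0

workflow hello {
  input {
    String name
  }
  call say_hello { input: name = name }
}

task say_hello {
  input {
    String name
  }
  command <<<
    echo "Hello, ~{name}"
  >>>
  output {
    String greeting = read_string(stdout())
  }
}
"""


def build_submission(wdl_source: str, inputs: dict) -> dict:
    """Assemble the URL and form fields for Cromwell's submission endpoint.

    Only the request description is built here; nothing is sent.
    """
    return {
        "url": f"{CROMWELL_URL}/api/workflows/v1",
        "fields": {
            "workflowSource": wdl_source,
            # Cromwell expects workflow inputs as a JSON document.
            "workflowInputs": json.dumps(inputs),
        },
    }


def status_url(workflow_id: str) -> str:
    """Return the URL to poll for a submitted workflow's status."""
    return f"{CROMWELL_URL}/api/workflows/v1/{workflow_id}/status"


request = build_submission(HELLO_WDL, {"hello.name": "Hutch"})
print(request["url"])
print(request["fields"]["workflowInputs"])
```

Sending the assembled fields as a multipart form POST to the printed URL would submit the workflow to a running Cromwell server. Server setup and bookkeeping of this kind are what the list above describes PROOF handling on the user's behalf.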

Find out more on the PROOF page on the Fred Hutch SciWiki.

cBioPortal

cBioPortal is an open-source platform, developed by Memorial Sloan Kettering (MSK), for visualization, exploration, and analysis of cancer genomics data sets. The Data Science Lab (DaSL), in conjunction with the Scientific Computing Group (SciComp) at Fred Hutch, has launched a Fred Hutch instance of cBioPortal for research use only.

The Fred Hutch deployment of cBioPortal provides users with several advantages to enhance their research:

Because this instance is private, researchers can use the application's powerful visualization tools on internal data.
It also provides research groups with the ability to establish controlled access to their study data, ensuring data protection.
Users can host and share their data through the Fred Hutch instance of cBioPortal to facilitate collaborations.
Additionally, this instance and our administration plans have been reviewed and approved by InfoSec to allow users to include individually identifiable research data in their data uploads when investigators have clear IRB approvals in place.

For more details on what the Fred Hutch instance of cBioPortal can do for you, read our Fred Hutch cBioPortal product page on the Fred Hutch SciWiki and visit cbioportal.fredhutch.org to give it a spin.

Translational Data Platform

We are building a secure, scalable, and supportable technical foundation and set of tools that together provide a centrally governed, democratized translational data platform: the Clinical and Research Data System (CARDS). This will allow the Fred Hutch community to build data products and tooling that leverage multimodal Fred Hutch patient clinical data to support clinical trials, precision oncology, data science, research and AI innovation.

Databricks

Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, data science and AI solutions at scale. At Fred Hutch, it is hosted in AWS. Databricks provides tools that help our CARDS engineering staff connect relevant Fred Hutch patient data sources to one access point for a variety of data users. Databricks then provides those data users with tools to process, share, analyze, and model datasets using common tools and languages like Notebooks, SQL, Python and R.

Note: wider access to Databricks is planned for 2026 as we continue to mature our data platform and ingest data sources that are useful for research. If you’d like to discuss in more detail what resources you are looking for, or to inquire about the status of Databricks access, please schedule a General Data House Call.