Open Source: MSK Researchers Share Valuable Cancer Data


Computational biologist Nikolaus Schultz is Head of Knowledge Systems in MSK’s Marie-Josée & Henry R. Kravis Center for Molecular Oncology.

Every month, more than 30,000 scientists from institutions around the world tap into the cBioPortal for Cancer Genomics, which contains data on tens of thousands of tumors. Sharing information to expedite discoveries and improving patient care everywhere have been the key goals since the cBioPortal was launched in 2008.

“Much of this data is complex, and we anticipated that it would be helpful to make it accessible and interpretable for biologists and clinicians who don’t have a background in computational biology,” says Nikolaus Schultz, who helped create the cBioPortal when he was a postdoctoral fellow in the Sloan Kettering Institute. “Our main goal was to find a way to make the data understandable for users,” adds Dr. Schultz, who now leads a lab in Memorial Sloan Kettering’s Human Oncology and Pathogenesis Program (HOPP) and is Head of Knowledge Systems in MSK’s Marie-Josée & Henry R. Kravis Center for Molecular Oncology.

The cBioPortal is an open-source database, meaning no licenses or fees are required to use the software or the data. Additionally, scientists at any institution can install the software on their own computers and feed it their own data. Users are also encouraged to contribute to the code and make the software better for everyone.

Growing a Multi-Institutional Partnership

The information in the portal really started to take off in 2014. That’s when MSK began analyzing patient tumors with MSK-IMPACT™, a diagnostic test that looks for mutations in hundreds of genes linked to cancer. “This meant a shift to a lot more clinical data,” Dr. Schultz says. “In addition to making the data useful for cancer researchers working in the lab, we needed to make it useful for patient care.” Now, when a patient’s tumor tests positive for a particular mutation, their doctor can use the portal to learn more about it and what drugs may be effective.

To maintain privacy, patient information linked to the genetic information can’t be traced back to the patient. The data includes only basic details useful to scientists, such as the kind of cancer as well as the patient’s age, gender, racial and ethnic background, and when in the course of their treatment the sample was collected.

Although it’s still largely overseen by MSK, the cBioPortal has grown into a collaborative effort. Several other institutions have partnered with MSK and regularly contribute to the software and database. In 2015, the cBioPortal became a major component of the American Association for Cancer Research (AACR) Project GENIE (Genomics Evidence Neoplasia Information Exchange), a joint effort among several institutions to find better ways to interpret genetic data. Charles Sawyers, Chair of HOPP, leads the project’s steering committee.


An Important Resource for Scientists Everywhere

At this year’s AACR meeting in April, several members of MSK’s biostatistics and bioinformatics team presented the latest research related to the cBioPortal and Project GENIE. One of the key developers and managers, computational oncologist JianJiong “JJ” Gao, presented many enhancements made over the past year.


MSK’s cBioPortal team has created a series of educational webinars to teach people how to use the portal’s data.

Early in the COVID-19 pandemic, when many scientists around the world were unable to go to their labs, the cBioPortal team created a series of educational webinars to teach people how to use the data. The sessions were well attended and have had thousands of additional views on YouTube. “These webinars have become a really great resource for people who want to learn all the ways they can use the data in the portal,” Dr. Schultz says.

He credits physician-scientist David Solit, Director of the Marie-Josée & Henry R. Kravis Center for Molecular Oncology, for maximizing the power and impact of MSK’s genetic testing through the cBioPortal. “He’s the one who had the vision for using cBioPortal as a central hub for viewing and analyzing all sequencing data from MSK patients,” Dr. Schultz says.

Dr. Schultz and his team are working on new features, including adding data related to imaging, single-cell analysis, and tumor immune profiles. They are also attaching more clinical data to the genetic sequencing information, which will make the data an even more valuable resource. “Our overarching goal is to make the cBioPortal the kind of resource that can enable others to make discoveries,” he says. “That will help advance the field for everyone.”