“We thought, let’s go for something big,” said Jay Scott, co-executive director of Alex’s Lemonade Stand Foundation, “and let’s try to build a center that can use big data to help fight childhood cancer.”
The foundation, which raises funds for pediatric cancer research, established the Childhood Cancer Data Lab (CCDL) in 2017. At the Center City-based lab, data scientists, coders and designers create data tools and trainings “to empower childhood cancer researchers to utilize vast amounts of data to make more robust discoveries and cures, faster and cheaper,” per its website.
Its tagline: “Turning researchers who fight childhood cancer into data wizards.”
The CCDL hosted a fully booked data science training workshop from Oct. 14 to 16 covering machine learning, the processing of bulk and single-cell RNA sequencing data, and R, a programming language for statistical computing.
Jaclyn Taroni, principal data scientist at the CCDL, said there was still progress to be made in training pediatric cancer experts.
“Science works best when we all understand as much of the process as possible,” she said.
The training provides an understanding of skills needed for gene expression analysis. Gene expression data is a kind of measurement, Taroni said: “It’s a molecular snapshot of what’s happening in that tissue. So that could be cells in a dish, that could be a tumor. It could be a liver biopsy.”
The training materials were updated after a pilot training in Philadelphia in July 2018. This year, the CCDL has held trainings in Houston, Chicago and California’s Bay Area.
Robert Allaway, a research scientist at Sage Bionetworks, a nonprofit promoting open science for biomedical research, attended the Houston training in March.
“It was great to see how other people, like other labs, what their best practices are for a lot of this type of work,” Allaway said. “Whether it’s their approaches to doing the analysis or their approaches to tracking experiments.”
The people at the training had different levels of coding experience, he added. Although Allaway has some computational biology experience, the CCDL staff helped him analyze a dataset that had unique constraints.
At the training, Allaway used the CCDL’s data tool, refine.bio, which is a repository of harmonized transcriptome data from publicly available sources. Transcriptome data can be used to identify new potential therapies or try to predict tumor progression, Allaway said.
“Because I’m working in rare disease, there aren’t many datasets out there,” Allaway added. “I was easily able to find everything that I could possibly find, in terms of publicly available data.”
Lindsay Williams, a postdoctoral fellow in the Department of Pediatrics at the University of Minnesota, attended the Houston training as well. Williams, an epidemiologist who specializes in population-based analysis, wasn’t familiar with molecular analysis. The CCDL staff taught her how to find gene expression differences by sex in the data she brought to the training.
“The training was wonderful,” Williams said. “They walk you through absolutely everything that you would need to know, to analyze RNA sequencing data from start to finish.”
Williams’s research interests include pediatric brain tumors, an area that’s received more molecular work than epidemiological. “For me, I kind of approach it as trying to bridge the two fields together,” Williams said.
The CCDL hopes these trainings will give researchers the skills they need to analyze more of their own data themselves and communicate better with analysts.
“If more people have this more basic set of skills that they get from training, they will be less reliant on people who are more specialized,” Taroni said. “The idea being that they will only [seek] help from someone more specialized, as far as computational skills, when they need that.”
The next CCDL training isn’t scheduled yet, but you can sign up to register your interest or view the training modules on GitHub.