Alex’s Lemonade Stand Foundation is training childhood cancer researchers in data science

Here's why the Childhood Cancer Data Lab, located in Center City, hosts workshops to teach research scientists about machine learning and the programming language R.

A version of this story originally appeared on Generocity.

“We thought, let’s go for something big,” said Jay Scott, co-executive director of Alex’s Lemonade Stand Foundation, “and let’s try to build a center that can use big data to help fight childhood cancer.”

The foundation, which raises funds for pediatric cancer research, established the Childhood Cancer Data Lab (CCDL) in 2017. Based in Center City, the lab’s data scientists, coders and designers create data tools and trainings “to empower childhood cancer researchers to utilize vast amounts of data to make more robust discoveries and cures, faster and cheaper,” per its website.

Its tagline: “Turning researchers who fight childhood cancer into data wizards.”

The CCDL hosted a fully booked data science training workshop from Oct. 14 to 16 covering machine learning, how to process bulk and single-cell RNA sequencing data, and R, a programming language for statistical computing.

Jaclyn Taroni. (Photo by Zari Tarazona)

Jaclyn Taroni, principal data scientist at the CCDL, said there was some progress to be made around training pediatric cancer experts.

“Science works best when we all understand as much of the process as possible,” she said.

The training provides an understanding of skills needed for gene expression analysis. Gene expression data is a kind of measurement, Taroni said: “It’s a molecular snapshot of what’s happening in that tissue. So that could be cells in a dish, that could be a tumor. It could be a liver biopsy.”

The training materials were updated after a pilot training in Philadelphia in July 2018. This year, the CCDL has held trainings in Houston, Chicago and California’s Bay Area.

Robert Allaway, a research scientist at Sage Bionetworks, a nonprofit promoting open science for biomedical research, attended the Houston training in March.

“It was great to see how other people, like other labs, what their best practices are for a lot of this type of work,” Allaway said. “Whether it’s their approaches to doing the analysis or their approaches to tracking experiments.”

CCDL training in Houston. (Courtesy photo)

The people at the training had different levels of coding experience, he added. Although Allaway has some computational biology experience, the CCDL staff helped him analyze a dataset that had unique constraints.

At the training, Allaway used the CCDL’s data tool, a repository of harmonized transcriptome data from publicly available sources. Transcriptome data can be used to identify potential new therapies or to try to predict tumor progression, Allaway said.

“Because I’m working in rare disease, there aren’t many datasets out there,” Allaway added. “I was easily able to find everything that I could possibly find, in terms of publicly available data.”

Lindsay Williams, a postdoctoral fellow in the Department of Pediatrics at the University of Minnesota, attended the Houston training as well. Williams, an epidemiologist who specializes in population-based analysis, wasn’t familiar with molecular analysis. The CCDL staff taught her how to find gene expression differences by sex in the data she brought to the training.

“The training was wonderful,” Williams said. “They walk you through absolutely everything that you would need to know, to analyze RNA sequencing data from start to finish.”

Williams’s research interests include pediatric brain tumors, an area that’s received more molecular work than epidemiological. “For me, I kind of approach it as trying to bridge the two fields together,” Williams said.

The CCDL hopes these trainings will give researchers the skills they need to analyze more of their own data themselves and communicate better with analysts.

“If more people have this more basic set of skills that they get from training, they will be less reliant on people who are more specialized,” Taroni said. “The idea being that they will only [seek] help from someone more specialized, as far as computational skills, when they need that.”

The next CCDL training isn’t scheduled yet, but you can sign up to register your interest or view the training modules on GitHub.
