Scientists Learn to Share Data — A Giant Step Forward in Advancing SCI Research
Scientists in the spinal cord injury (SCI) space are clear on one thing: Every injury and every individual is unique. The same holds true for datasets and scientists. But with data sharing, the possibilities for combining efforts to effect change grow exponentially.
“We know that no single intervention is going to solve spinal cord injury,” says Karim Fouad, Ph.D., co-director and editor of the Open Data Commons (ODC)-SCI and professor and Canada research chair at the University of Alberta. “So, communication between basic researchers, clinicians, and people living with SCI through open data sharing is the only way to really move the field forward.”
It makes sense then that funding agencies and publishers are increasingly demanding that researchers make their datasets accessible through some sort of open data-sharing platform. A transparent data culture not only enables scientists to evaluate and replicate other investigators’ findings, but it also acts as a springboard to develop novel research questions.
Sharing is Caring
In the early days of science in India and the U.K., professionals came together to discuss their work daily or weekly, often over whiskey. Then scientists began publishing their findings in journal articles. “They shared exciting results. That’s what we still do in published papers. But the rest of the data gets lost,” says Fouad. “That produces a research bias.”
With open data sharing, the goal is to recover all of that lost data — to publish and share everything across the board. So, when researchers have a hypothesis about a particular treatment, they can go into the ODC-SCI and review experiments that have already been completed — and that helps conserve resources.
“By making all of the data available, investigators can improve how they design trials by taking into account previous research findings,” Fouad says. “Once we have the whole picture, with every detail, we’ll be able to use machine learning algorithms to extract information that our brains can’t even conceptualize yet.”
Capturing that information saves time and money. It also creates an opportunity for independent replication, a central tenet of science. Uploading data to the ODC-SCI makes it a citable dataset with a digital object identifier (DOI), which helps researchers comply with funding mandates. That infrastructure already exists within the SCI space. Unfortunately, the idea of entering new and old data is overwhelming for most scientists.
The Challenges of Data Sharing
Data sharing is a daunting concept for many researchers. They’re trained in science, not data entry, and it’s not uncommon for them to feel creatively stifled by the process of entering and uploading data. To complicate matters, many investigators don’t feel like they have the time, knowledge, or bandwidth to take the steps necessary to share their data, particularly “old data.”
To date, the ODC-SCI has effectively checked submitted datasets for compliance with ODC standards. Unfortunately, there’s no comprehensive systematic assistance for recovering data from already published studies, and equally important, studies that were never published.
“Researchers are reluctant to dig out data from the past and reformat it to ODC-SCI, create data dictionaries and complete the required metadata to make the information shareable,” says Marco Baptista, Ph.D., chief scientific officer of the Christopher & Dana Reeve Foundation. Plus, learning to enter data correctly and in a standardized format requires time and training.
There are different levels of understanding about how to format and upload research data into the ODC-SCI — and many scientists don’t feel equipped to share data in a FAIR manner, meaning that it’s Findable, Accessible, Interoperable, and Reusable. And some data might be trickier than others to put into an open data sharing platform.
“Conceptually, researchers are on board, but there’s no personal incentive to enter data, especially old data,” says Baptista. “Our goal is to add those incentives while also reducing the risks involved with data sharing.”
Working Toward a Sharing Culture
The complexity inherent in SCI requires myriad treatment approaches used alone and in combination. And while experts in the field agree that no single therapy will cure SCI, sharing data openly and in a standardized way poses big challenges.
To help investigators get up to speed with data sharing, the Christopher & Dana Reeve Foundation and the University of Alberta are partnering with investigators, not just to fund these efforts but also to provide specialized training and guidance so open data sharing protocols become standard operating procedure.
With funds from the Open Data Sharing Grant, the University of Alberta will be hiring a “data retrieval specialist,” who will not only help investigators uncover and enter old data for open data sharing but also train them to capture new data and upload it according to ODC-SCI standards.
“It’s really a two-pronged approach,” explains Baptista. “The first prong relies on the ODC Grant to hire a data retrieval specialist and the second prong is to incentivize researchers to analyze shared data so new hypotheses can emerge.” While the process of integrated data collection, management and sharing can be cumbersome, particularly in the early stages, the effectiveness of this approach is easily measured by how many datasets get moved into the ODC-SCI with a DOI.
“For open data sharing to be effective, we have to change the culture,” Fouad says. “But once scientists begin using old data to come up with new conclusions that move the field forward, the culture will shift. It’s not going to happen overnight, but we’re coming at it from all angles, including incentives, training, and ongoing assistance.”