New tool to help analyse 'big data' faster
Washington: Scientists have developed a new tool that speeds the analysis of data sets that are so large and complex they would take days or weeks to analyse on a single workstation.
In an age of "big data," a single computer cannot always find the solution a user wants. Computational tasks must instead be distributed across a cluster of computers that analyse a massive data set together, researchers said.
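The pattern described above can be sketched in miniature. The example below is illustrative only (it does not use Thunder's actual API): it splits an imaging "movie" across local worker processes and combines their partial results, the same map-reduce pattern that cluster frameworks apply across many machines.

```python
# Illustrative sketch: distributing a simple analysis (the per-pixel
# mean over time) across worker processes, combining partial sums at
# the end. A cluster framework applies this same pattern at scale.
from multiprocessing import Pool

import numpy as np

def chunk_sum(chunk):
    # Map step: each worker reduces its own block of time points.
    return chunk.sum(axis=0)

def distributed_mean(movie, workers=4):
    # Split the time axis into one block per worker.
    chunks = np.array_split(movie, workers, axis=0)
    with Pool(workers) as pool:
        partial_sums = pool.map(chunk_sum, chunks)
    # Reduce step: combine partial results into the final answer.
    return sum(partial_sums) / movie.shape[0]

if __name__ == "__main__":
    movie = np.random.rand(1000, 64, 64)  # time x height x width
    mean_image = distributed_mean(movie)
```

The mean here is trivial, but the structure is the point: any analysis that decomposes into independent per-block work plus a cheap combine step scales out this way.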
New technologies for monitoring brain activity are generating unprecedented quantities of information. That data may hold new insights into how the brain works - but only if researchers can interpret it.
To help make sense of the data, neuroscientists can now harness the power of distributed computing with Thunder, a library of tools developed at the Howard Hughes Medical Institute's Janelia Research Campus.
Thunder speeds the analysis of data sets that are so large and complex they would take days or weeks to analyse on a single workstation, if a single workstation could handle them at all, researchers said.
Group leaders Jeremy Freeman, Misha Ahrens, and colleagues at Janelia and the University of California, Berkeley, used Thunder to quickly find patterns in high-resolution images collected from the brains of active zebrafish and mice with multiple imaging techniques.
They have used Thunder to analyse imaging data from a new microscope that Ahrens and colleagues developed to monitor the activity of nearly every individual cell in the brain of a zebrafish as it behaves in response to visual stimuli.
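One simple way to find patterns in recordings like these is to correlate each cell's activity trace with the stimulus time course, flagging stimulus-driven cells. The sketch below is a hypothetical illustration using simulated data, not the authors' actual analysis pipeline.

```python
# Hypothetical illustration: score each cell by the Pearson
# correlation between its activity trace and a stimulus time course.
import numpy as np

def stimulus_correlations(activity, stimulus):
    """Correlate each cell's trace (rows of `activity`) with `stimulus`."""
    a = activity - activity.mean(axis=1, keepdims=True)
    s = stimulus - stimulus.mean()
    numerator = a @ s
    denominator = np.sqrt((a ** 2).sum(axis=1) * (s ** 2).sum())
    return numerator / denominator

rng = np.random.default_rng(0)
stimulus = rng.standard_normal(500)
# Simulate three cells: one driven by the stimulus, two unrelated.
activity = np.vstack([
    2.0 * stimulus + 0.1 * rng.standard_normal(500),
    rng.standard_normal(500),
    rng.standard_normal(500),
])
corr = stimulus_correlations(activity, stimulus)
```

Across nearly every cell in a whole brain, this per-cell computation is exactly the kind of embarrassingly parallel work a distributed library can spread over a cluster.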
Researchers can find everything they need to begin using the open-source library of tools.
New microscopes are capturing images of the brain faster, with better spatial resolution, and across wider regions of the brain than ever before.
Yet all that detail arrives as gigabytes or even terabytes of data. On a single workstation, simple calculations can take hours.
"For a lot of these data sets, a single machine is just not going to cut it," Freeman said.
It's not just the sheer volume of data that exceeds the limits of a single computer, Freeman and Ahrens say, but also its complexity.
"When you record information from the brain, you don't know the best way to get the information that you need out of it. Every data set is different. You have ideas, but whether or not they generate insights is an open question until you actually apply them," said Ahrens.
The research was published in the journal Nature Methods.