Too much data, too few drugs

By David Ewing Duncan, contributor

(Fortune) -- Like sages of old, they came to San Francisco last weekend, a group of biologists and computer scientists setting out to one-up every library ever conceived, from the great one in ancient Alexandria to Wikipedia today.

This library, however, will not consist of vellum scrolls or e-page entries. It aims to compile and make sense of genetic sequences and other raw biological data that are proliferating so fast that biology is about to move from petabytes to exabytes of data -- from quadrillions to quintillions of bytes. Just ten years ago, in 2000, all of digitized biology equaled only about 10 gigabytes (giga=billion).

While this is a stunning technological achievement, it also may be contributing to the dearth of new drugs coming out of the pharmaceutical industry in recent years. The problem is that too much raw scientific data is scattered across too many databases with too little thought given to organizing it all so that it can be properly mined and used to develop treatments.

Trying to make sense of all this data is what brought two hundred scientists here to the first-ever Sage Congress. The meeting was organized by Sage Bionetworks, a new nonprofit based in Seattle, and its attendees have proposed a novel solution: to create a new, open-source model to standardize and link together thousands of databases around the world -- in universities, institutes, governments, and businesses.

"It's time to admit the truth, we're not doing drug development the right way," says Stephen Friend, a co-founder of Sage who until recently headed up Merck's research program in oncology. "75% of cancer drugs don't work."

One day, Sage might allow scientists studying cancer -- or Alzheimer's Disease or diabetes -- to easily access the raw genetic data of thousands of people collected, say, in Ohio, Iceland, and Japan, and connect them to databases detailing cellular mechanisms that may explain how these diseases work.

Sage also wants to build systems that can organize and analyze complex interactions among networks of genes in humans and other organisms. Understanding how these networks react to environmental stimuli -- for instance, an individual's diet and exposure to chemical toxins such as mercury -- is the key to unlocking the secrets of common diseases such as heart disease and diabetes, say scientists.

"We need systems that can mimic the complexity of human biology before we'll really understand how everything works for a disease like diabetes," says Sage co-founder Eric Schadt. A biocomputer scientist, Schadt also recently left Merck (MRK, Fortune 500), where he headed up a team that used supercomputers and sophisticated tests to study how complex genetic networks and pathways and other molecular entities affect disease.

Creating an über-database is a formidable engineering challenge, but it's not the only barrier. Attitudes also need to change among scientists and institutions used to keeping their data to themselves whenever possible.

"It will require a fundamental change in thinking to realize that sharing data is important," says Friend.

Friend was also a co-founder of Rosetta, a bioinformatics company acquired by Merck in 2001 for $620 million. As part of Merck, Rosetta built one of the fastest supercomputers in the drug industry, running 16 trillion calculations a second. The company also developed specialized chips and computer programs to sequence and analyze tissues throughout the body.

Last year, Merck disbanded Rosetta as part of its downsizing, deciding that building ever more complex models of human biological systems was beyond the resources of a single company. Merck developed several drugs out of the Rosetta project and has agreed to hand over key components of the technology to Sage.

The sheer scale of the effort led Friend and Schadt to turn to open source technology, which can be run by a small staff while drawing on hundreds, or even thousands, of contributors. Open source has been used with great success in developing software systems like Linux. In science, projects like Science Commons, based at the Massachusetts Institute of Technology, are also working to break down legal, financial and infrastructural barriers to sharing studies and data.

So far, Sage has raised several million dollars from private foundations, companies such as Merck and Pfizer (PFE, Fortune 500), and the National Institutes of Health.

Meanwhile, the petabytes, and soon exabytes, of data keep piling up, adding to the urgency of sorting it all out. Sage will need significantly more funding and a staff large enough to wrestle with and organize a Great Library of this size so that we can start maximizing the potential for understanding biology and developing drugs sooner rather than later.

We may even want to stop producing so much data for a period of time and concentrate on organizing what we've got.
