Google and Broad Institute Team Up to Bring Genomic Analysis to the Cloud

This article is more than 7 years old.
A single strand of DNA. (Stuart Caie / Flickr)

Google has teamed up with one of the world's top genomics centers, the Broad Institute of MIT and Harvard, to work on a series of projects it claims will propel biomedical research.

For the first joint project, engineers from both organizations will bring "GATK," the Broad Institute's widely used genome analysis toolkit, onto Google's cloud service and into the hands of researchers.

"The limiting factor is no longer getting the DNA sequenced," said Dr. Barry Starr, a Stanford geneticist and a contributor to KQED. "It is now interpreting all of that information in a meaningful way."

The Broad Institute alone analyzed a massive 200 terabytes of raw data in a single month. In the past decade, the institute has genotyped more than 1.4 million biological samples.

Google isn't the only tech company vying to use cloud-based technology to store and analyze this massive volume of genetic information. IBM, Amazon, and Microsoft are competing in the same space.

But Google is now the only public cloud provider to offer GATK as a service. With the software available in the cloud, researchers can run it on large data sets without access to local computing, which frees up both time and resources.

"GATK was already available to researchers and tens of thousands have used the software to analyze their data," said Starr. "Google adds the power of being able to handle much more data at a time."

Google Genomics' product manager Jonathan Bingham told KQED that two groups will benefit most from this partnership: small research groups that lack sophisticated computing, and individuals who want to analyze large genomic data sets without needing to download them.

"Broad Institute has got a tremendous amount of expertise working with large numbers of biological samples and huge volumes of genomic data," Bingham explained. "Meanwhile, Google has built the infrastructure and tools to process and analyze the data and keep it secure."

The toolkit will be available for free to nonprofits and academics. Businesses will need to pay to license it from the Broad Institute.

Some genetics experts say this announcement is evidence that the health industry is increasingly willing to embrace cloud computing. In the past, health organizations have been hesitant due to concerns about compliance and security.

"This suggests that the genomics industry has moved beyond the cloud debate," said Jonathan Hirsch, president and co-founder of Syapse, a Silicon Valley-based company that wants to bring more genomics data into routine clinical use.

"It is OK for researchers and clinicians to do genomics work in the cloud, and trust that cloud provider's hardware and software."

Bingham said that in the future there may be opportunities to work on projects to further our genetic understanding of cancer and diabetes.

But for now, he said, the organizations are focused on "general purpose" tools that aren't specific to a disease and can be used by researchers everywhere.