 Around the World in 800 Billion Bases
January 18th, 2006, 02:17 AM   #1
Gorilla
Around the World in 800 Billion Bases
Sanger Institute Genetic Records are World's Biggest


On Tuesday 17 January 2006 the Wellcome Trust Sanger Institute's World Trace Archive database of DNA sequences hit one billion entries. The Trace Archive is a store of all the sequence data produced and published by the world scientific community, including the Sanger Institute's own prodigious output as a world-leading genomics institution.

To give a sense of how much data the Archive holds: printed out as a single line of text, it would stretch around the world more than 250 times. Printed on pages of A4, it would produce a stack of paper two-and-a-half times as high as Mount Everest.

Each entry is a piece of genetic information averaging 864 characters in length. Scientists can search these sequences and piece them together to build up the complete genetic information of organisms - mice, fish, flies, bacteria and, of course, humans.
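
The headline figure follows from the two numbers quoted above. As a rough check, here is a minimal Python sketch; it assumes each character stands for one base and that the 864-character average applies across all one billion entries, and the function name is made up for illustration.

```python
# Figures quoted in the press release.
ENTRIES = 1_000_000_000   # one billion entries
AVG_LENGTH = 864          # average characters per entry

def total_characters(entries: int, avg_length: int) -> int:
    """Estimate the total character count as entries times average length."""
    return entries * avg_length

total = total_characters(ENTRIES, AVG_LENGTH)
# Assumption: one character corresponds to one base.
print(f"{total:,} characters, about {total / 1e9:.0f} billion bases")
```

That works out to roughly 864 billion characters, which is in line with the "800 Billion Bases" of the title.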

The Archive is 22 terabytes in size and doubling every ten months - perhaps the largest single scientific database in Europe, if not the world.
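
To see what "doubling every ten months" implies, here is a small Python sketch. It is only an extrapolation: it assumes the growth rate stays constant, which the press release does not claim, and the function and parameter names are invented for this example.

```python
START_TB = 22.0          # archive size quoted in the article, in terabytes
DOUBLING_MONTHS = 10.0   # doubling period quoted in the article

def projected_size_tb(months_ahead, start_tb=START_TB, doubling_months=DOUBLING_MONTHS):
    """Size after months_ahead months, assuming steady exponential growth."""
    return start_tb * 2 ** (months_ahead / doubling_months)

# Hypothetical projection; real growth may speed up or slow down.
for months in (0, 10, 20, 30):
    print(f"after {months:2d} months: {projected_size_tb(months):.0f} TB")
```

Under that assumption each ten-month step doubles the figure: 22, 44, 88 and then 176 TB after thirty months.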

Martin Widlake, Database Services Manager at the Wellcome Trust Sanger Institute, said: "At 22 000 GB the Trace Archive is in the Top Ten UNIX databases in the world. That's not bad for a research organisation of 850 employees in the countryside just outside Cambridge."

"It is possibly the biggest single (acknowledged) scientific RDBMS database in Europe, if not the world."

All the data are freely available to the world scientific community (http://trace.ensembl.org/), as a resource to geneticists all over the globe. When a researcher is studying a disease or gene, they can download the genetic information known about the area they are studying.

The data are being actively used by biomedical researchers in academic and commercial organizations. The three internet domains that make most use of the Trace Archive are .com, .edu and .uk. Dotcoms are responsible for about 80% of downloads each week, mostly as big 'customers' taking vast chunks on each visit. Next are US university researchers, followed by UK scientists.

Trace data are the raw results of genetic research. They allow researchers to identify and study genes, to reveal variations (mutations) in genes and to study similarity to genes in other organisms. These are vital starting points for studying and better understanding the biology of health and disease.

By any comparison, the billion records stand above many other familiar repositories. The British Library holds 13 million items; the US Library of Congress holds 115 million items. The Trace Archive holds one billion chunks of unique information.

"Accessing the data becomes a larger and larger problem as the dataset grows," continued Martin Widlake. "At present it is simple and very quick to access a record if you know its unique identifier as issued by the Sanger Institute, the US National Center for Biotechnology Information (NCBI) database, or the 'name' of the trace as given by the organization that originally sequenced that piece of genetic information."

"Scanning the whole dataset for a single genetic sequence, which is a lot like searching for a single sentence in the contents of the British Library, is a massive task. However, the team at the Sanger Institute are working on new methods to make the data easier to search and access".

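The contrast Widlake describes, between fetching a record by its identifier and scanning everything for a sequence, can be illustrated with a toy Python sketch. This is not how the Trace Archive is implemented; the identifiers, sequences and function names below are all invented, and a real archive would hold a billion records rather than three.

```python
# Toy records keyed by identifier. All identifiers and sequences are made up.
traces = {
    "trace-0001": "ACGTTGCAAGCT",
    "trace-0002": "TTGACCGTAGGA",
    "trace-0003": "GGCATCGTACCA",
}

def get_by_id(trace_id):
    """Keyed lookup: goes straight to one record, however many are stored."""
    return traces.get(trace_id)

def scan_for_sequence(query):
    """Full scan: must examine every record to find those containing query."""
    return [tid for tid, seq in traces.items() if query in seq]

print(get_by_id("trace-0002"))
print(scan_for_sequence("CGTA"))
```

The lookup touches a single entry, while the scan has to read every stored sequence, which is why the second kind of search gets slower as the dataset grows.
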
The data are held in duplicate, with the NCBI also maintaining a copy: with two sites holding it, a single disaster cannot wipe out the only copy of this vital and heavily used database.

http://www.sanger.ac.uk/Info/Press/2006/060117.shtml
 