Papers

Peer-reviewed
2006

Characterization of transcriptional regulatory signals with novel bioinformatics tools

DNA STRUCTURE, CHROMATIN AND GENE EXPRESSION, 2006
  • Takashi Abe
  • Hideaki Sugawara
  • Shigehiko Kanaya
  • Yoko Kosaka
  • Toshimichi Ikemura

First page : 1
Last page : 16
Language : English
Publishing type : Research paper (international conference proceedings)
Publisher : TRANSWORLD RESEARCH NETWORK

Novel tools are needed for comprehensive comparisons of the inter- and intraspecies characteristics of massive amounts of available genomic sequences. An unsupervised neural network algorithm, Kohonen's Self-Organizing Map (SOM), is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We modified the conventional SOM for genome informatics, making the learning process and resulting map independent of the order of data input. We generated SOMs for tri-, tetra-, and pentanucleotide frequencies in 300,000 10-kb sequences derived from 13 eukaryote genomes for which almost complete sequences are available (a total of 3 Gb), using a high-performance supercomputer, the Earth Simulator. SOM recognized species-specific characteristics (key combinations of oligonucleotide frequencies) in most 10-kb sequences, permitting species-specific classification (self-organization) of sequences without any information regarding the species. Because the classification power is very high, SOM is thought to be an efficient and powerful tool for extracting a wide range of genomic information. SOM was then constructed with oligonucleotide frequencies in 10-kb sequences from 2.8 Gb of human genome sequences, and the SOM identified oligonucleotides with frequencies characteristically biased from the random occurrence level. Furthermore, 1-kb sequences rich in the biased oligonucleotides were self-organized on the map. Because these oligonucleotides often corresponded to functional signal sequences (e.g., binding sites for transcription factors) or their constituent elements, we could categorize occurrence patterns and frequencies of such pentanucleotides in the human genome that are thought to regulate transcription.
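
The workflow in the abstract (oligonucleotide frequencies per fixed-length window, then an order-independent SOM) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`kmer_frequencies`, `windows`, `batch_som_epoch`), the Gaussian neighborhood, and the caller-supplied initial weights are assumptions made for illustration. A batch update, in which all windows are assigned to map units before any weight changes, is one common way to make a SOM independent of input order. The abstract does not specify how the authors achieved this.

```python
from itertools import product
import math

ALPHABET = "ACGT"

def kmer_index(k):
    """Map every k-mer over ACGT to a column index (4**k columns)."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}

def kmer_frequencies(seq, k, index):
    """Relative frequency of each k-mer in seq.

    k-mers containing characters outside ACGT (e.g. N) are skipped.
    """
    counts = [0] * len(index)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]

def windows(seq, size):
    """Non-overlapping windows of length `size`; a trailing partial window is dropped."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]

def best_unit(vec, weights):
    """Index of the weight vector closest to vec (squared Euclidean distance)."""
    return min(
        range(len(weights)),
        key=lambda u: sum((a - b) ** 2 for a, b in zip(vec, weights[u])),
    )

def batch_som_epoch(data, weights, grid, radius):
    """One batch SOM update (illustrative, not the paper's exact algorithm).

    Every input is assigned to its best unit using the *old* weights, and
    only then are all weights replaced by neighborhood-weighted means.
    Because the update is a sum over inputs, shuffling `data` changes the
    result only by floating-point rounding.
    `grid` holds the (x, y) map coordinates of each unit.
    """
    dim = len(data[0])
    num = [[0.0] * dim for _ in weights]
    den = [0.0] * len(weights)
    for vec in data:
        bx, by = grid[best_unit(vec, weights)]
        for u, (x, y) in enumerate(grid):
            # Gaussian neighborhood around the best-matching unit (assumed form)
            h = math.exp(-((x - bx) ** 2 + (y - by) ** 2) / (2 * radius ** 2))
            den[u] += h
            for d in range(dim):
                num[u][d] += h * vec[d]
    return [[n / den[u] for n in num[u]] for u in range(len(weights))]
```

In use, each 10-kb window from `windows(genome, 10_000)` would be turned into a vector with `kmer_frequencies(window, 4, kmer_index(4))`, and `batch_som_epoch` would be called repeatedly with a shrinking `radius`. The window size and k follow the abstract; the radius schedule and weight initialization are left to the caller.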

Link information
Web of Science
https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=JSTA_CEL&SrcApp=J_Gate_JST&DestLinkType=FullRecord&KeyUT=WOS:000246358200001&DestApp=WOS_CPL
ID information
  • Web of Science ID : WOS:000246358200001
