When we started the Mouse Encyclopedia Project in 1995, the United States was already well ahead with the large-scale Human Genome Project, and Japan lagged behind both the United States and the European countries in genomic science research. It was therefore important for Japan to carry out an original project and to develop new technologies in this field.
Since sequencing the whole genome alone does not tell us which parts of the genome are expressed and function as genes, we started a project to sequence all RNAs that are expressed as genes. To sample as many RNA species as possible, RNA had to be collected from all bodily organs at all stages, from a fertilized egg to a mature cell, since RNA content, unlike DNA content, depends on the cell. However, it is ethically impossible to do that in humans, which is why we chose mice, which are often used as a model organism for humans, as our target of research.
By combining the results and techniques of the Mouse Encyclopedia Project with those of the Genome Network Project, we believe that we will gain a deeper understanding of the phenomena of life.
RNA in each mouse organ is isolated and converted into complementary DNA (cDNA) with a reverse transcriptase. The cDNA is inserted into a circular DNA molecule called a vector, and this vector is introduced into E. coli. As the E. coli multiply in culture, the DNA of mouse origin multiplies with them. Next, the DNA is isolated and purified. Subsequently, the mouse DNA is sequenced with automated sequencers. The resulting sequences are assembled into gene sequences by computer. This process must be done quickly, accurately and on a large scale. In addition, with the information obtained we build a large-scale mouse gene clone bank.
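Two of the steps above, conversion of RNA into cDNA and computer assembly of sequence reads, can be illustrated with a toy sketch. This is only an illustration under simplifying assumptions, not the project's actual software: the function names `first_strand_cdna` and `merge_overlapping_reads` are hypothetical, sequences are treated as plain strings, reads are assumed to be error-free, and real assembly programs are far more elaborate.

```python
# Toy illustration only: real reverse transcription is an enzymatic reaction,
# and real assemblers handle sequencing errors, many reads and both strands.

# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an RNA written 5'->3'.

    Hypothetical helper: it models the product of reverse transcription
    as the reverse complement of the RNA sequence.
    """
    rna = rna.upper()
    unknown = set(rna) - set(RNA_TO_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # Complement each base, reading the RNA backwards so the result is 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def merge_overlapping_reads(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two reads when a suffix of `left` exactly matches a prefix of `right`.

    Hypothetical helper: it tries the longest possible overlap first and
    assumes the reads contain no errors.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("reads do not overlap by at least min_overlap bases")


if __name__ == "__main__":
    print(first_strand_cdna("AUGGCC"))
    print(merge_overlapping_reads("ATGGCC", "GCCTTA"))
```

The sketch keeps the two steps as separate functions because in the laboratory they are separate stages: the cDNA is produced before cloning, while assembly happens only after the cloned DNA has been sequenced.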
We have so far determined more than 100,000 full-length sequences of mouse genes and added annotations (e.g. names and similarity to known genes). Through this research, we have found that about 70% of the mouse genome is transcribed into RNA and that more than half of the RNAs are not ultimately translated into proteins. We also discovered that genes previously thought to be separate can be fused to become one gene. The results are gathered in a so-called clone bank, from which samples are being distributed to researchers all over the world.
Using the same large-scale system for the analysis of genes, sequencing of several further organisms has been conducted. Human and rice sequencing has been carried out in collaboration with the former Ministry of Agriculture, Forestry and Fisheries of Japan. Arabidopsis sequencing was done together with the RIKEN Plant Science Center, and sequencing of a hyperthermophilic archaeon with the RIKEN HARIMA Institute.
Within this project, rapid sequencers, transcriptional sequencing, and a series of full-length cDNA technologies were developed. Further, the DNA book, a novel distribution method for clones, was invented.