BMI Students

Tuesday, May 02, 2006

Mammalian promoters

Nature Genetics just published a milestone paper from the FANTOM/RIKEN consortium compiling an enormous genome-wide collection of transcription start sites (TSSs) in humans and mice. The paper could be a treasure trove for bioinformaticians. They collected TSS tags from many different tissues and mapped them onto the genome. There are several different classes of promoters: some with very well-defined TSSs, some with very broad distributions (transcription can start anywhere in a comparatively broad region), some with multiple well-defined sites, and some with combinations of the above. The paper claims four classes. I don't know what kind of clustering they used, but it would be interesting to know more about how distinct their classes are and whether four is really the best estimate.
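To make the "well-defined vs. broad" distinction concrete, here is a minimal Python sketch that labels a single promoter by the fraction of its tags falling near the most-used start position. This is not the classification used in the paper (I don't know what they did); the function name `tss_shape` and the thresholds `window` and `sharp_fraction` are made up for illustration, and so are the tag positions.

```python
from collections import Counter

def tss_shape(positions, window=2, sharp_fraction=0.8):
    """Label one promoter's tag cluster as 'sharp' or 'broad'.

    positions: genomic coordinates of the individual TSS tags.
    window, sharp_fraction: arbitrary illustrative thresholds,
    not values taken from the paper.
    """
    if not positions:
        raise ValueError("need at least one tag position")
    counts = Counter(positions)
    # Dominant start site: the most-used position (lowest coordinate on ties).
    peak = min(counts, key=lambda p: (-counts[p], p))
    # Number of tags within `window` bases of the dominant site.
    near_peak = sum(n for p, n in counts.items() if abs(p - peak) <= window)
    fraction = near_peak / len(positions)
    label = "sharp" if fraction >= sharp_fraction else "broad"
    return label, peak, fraction

# Invented tag positions, for illustration only.
sharp_like = [100] * 18 + [101, 99]       # almost all tags at one base
broad_like = list(range(100, 160, 3))     # tags spread over ~60 bases

print(tss_shape(sharp_like))
print(tss_shape(broad_like))
```

A single number like this cannot separate promoters with several well-defined sites from truly broad ones, which is part of why the question of how many classes there really are seems worth asking.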

I think that in addition to the analyses they did in the paper, one could quickly try a bunch of correlations, which suggests several possible projects, small and large. For example, do promoter classes correlate with alternatively spliced genes? Are TSSs correlated with transcription units from tiling array experiments (Affymetrix and others)? One could also do some gene ontology correlations, or expression analysis using these data. We know that transcription initiation, splicing and expression (and other things) are all intimately connected, so this could be exploited in many different directions...
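As a rough idea of how quick such a check could be, the sketch below computes a Pearson chi-square statistic for a table of promoter class against alternative-splicing status. The counts are invented purely for illustration, the helper name `chi_square` is my own, and a real analysis would need actual class assignments and splicing annotations plus a proper p-value (for instance from a statistics package).

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Invented counts: rows = two promoter classes,
# columns = (alternatively spliced, not alternatively spliced).
counts = [[30, 70],
          [50, 50]]

stat, dof = chi_square(counts)
print(f"chi-square = {stat:.2f} with {dof} degree(s) of freedom")
```

The same function works for a table with four rows, one per promoter class, and the other correlations (tiling array transcription units, gene ontology terms) reduce to similar count tables once the annotations are joined on gene or genomic position.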

