Mammalian evolution has seen a rapid proliferation of various types of non-coding RNAs (ncRNAs), including members of both the spliceosomal RNAs and the small nucleolar RNAs. The significance of this remarkable expansion, however, remains obscure. While ncRNA copy number expansions have in some instances been linked to functionally tractable effects, such as increased regulatory complexity driven by microRNAs, other examples suggest that such events are equally likely to be neutral, the result of random retrotransposition. Progress in understanding these observations is hindered by the difficulty of establishing function for the diverse features that have been identified in our own genome. High-throughput data generation projects such as ENCODE have revealed a hidden world of genomic expression patterns, as well as a host of other potential indicators of function, such as DNA methylation patterns, chromatin state and long-range genomic interactions. However, ENCODE and similar projects have been criticised, particularly by practitioners in the field of molecular evolution, many of whom suspect that such data provide limited insight into biological function. The molecular evolution community has summarised how past work informs a sceptical view, but it is important to establish tests of function. We use a range of data, including data drawn from ENCODE and FANTOM, to examine the case for function for the recent copy number expansion in mammals of six evolutionarily ancient RNA families involved in splicing (snRNAs U1, U2, U4, U5, U6) and rRNA maturation (snoRNA U3). We use several criteria to assess evidence for function: conservation of sequence and structure, genomic synteny, evidence for transposition, and evidence for species-specific expression. Using these criteria, we find that only a minority of loci show strong evidence for function, and that for the majority of loci we cannot reject the null hypothesis of no function.
Find the full open access article here.