the lack of pigment in the pericarp of cv Jefferson and cv Nipponbare seeds.
To confirm that the bHLH protein underlying rg7.1 is also Rc, we analyzed a second mutant stock for sequence variation in the bHLH gene. The stock, Surjamkuhi, is an indica line that carries a third allele, Rc-s, conditioning light red seed pigmentation. This genetic stock offered independent confirmation of the identity of the Rc gene because different sequence polymorphisms in the same gene would be expected to distinguish the Rc-s, Rc, and rc alleles. The sequence of the bHLH gene in Surjamkuhi differed from the sequence of the japonica cultivars at many sites (as expected for varieties from different subspecies) but differed from the O. rufipogon allele at only four sites (positions 96, 660, 1353, and 1833 to 1844) (Figure 4). The first two changes proved to be synonymous substitutions. The change at position 1353 consisted of a C-to-A change in exon 6. This single-nucleotide polymorphism was independent of any change seen in previous comparisons and introduced a premature stop codon before the bHLH domain, truncating the protein and rendering the effect of the remaining indel immaterial. The fact that the different alleles of Rc show sequence polymorphisms that clearly account for the observed phenotypic differences is consistent with the conclusion that the gene encoding the bHLH protein is Rc.
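The reasoning above rests on sorting each coding change into a synonymous, nonsynonymous, or nonsense substitution, or into an in-frame or frame-shift indel. The following is a minimal sketch of that classification under the standard genetic code. The function names and the example codons are hypothetical illustrations only; the source does not report the codon context at position 1353, so the codons below are not taken from the Rc sequence.

```python
from itertools import product

# Standard genetic code, listed in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, offset, new_base):
    """Classify a single-base change at `offset` (0, 1, or 2) within `codon`."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "synonymous"
    if after == "*":
        return "nonsense"  # premature stop codon
    return "nonsynonymous"


def classify_indel(length):
    """An indel preserves the reading frame only if its length is a multiple of 3."""
    return "in-frame" if length % 3 == 0 else "frame-shift"


# Hypothetical codons chosen for illustration (not from the Rc sequence).
print(classify_substitution("TAC", 2, "A"))  # C-to-A turning Tyr (TAC) into a stop (TAA)
print(classify_substitution("GGC", 2, "A"))  # C-to-A leaving Gly unchanged
print(classify_indel(14))                    # a 14-bp deletion shifts the frame
```

A C-to-A change can fall into any of the three substitution classes depending on the codon it hits, which is why the position within the reading frame, and not only the base change itself, determines the phenotypic effect.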
Expression Profiles of Rc and Biosynthetic Genes in White and Red Rice
To examine the timing and localization of the Rc transcript, we used RT-PCR to amplify mRNA from leaf, young panicle (before fertilization), pericarp of young seeds (at the milk or dough stage of grain filling), and pericarp from mature seeds. The mRNA was collected from both cv Jefferson (white seeds) and O. rufipogon (red seeds) plants. RT-PCR showed no expression of Rc in leaf tissue, as expected for a gene associated with a seed phenotype; however, expression was seen in several stages of panicle development (Figure 3A). Because the promoter of the bHLH gene had been eliminated as the source of polymorphism based on the recombination data, we anticipated that similar expression levels of Rc would be detected in red and white seeds. Our results confirmed this expectation and further demonstrated that the RNA transcript from cv Jefferson contained the 14-bp deletion predicted from the sequence information (Figure 3B).
Phylogenetic Comparison
To explore the evolutionary origin of Rc in rice and to identify putative orthologs in other species, we compared the sequence of Rc with other, previously identified bHLH transcription factors involved in anthocyanin and proanthocyanidin regulation as well as proteins with unknown effects that were recovered from BLAST using Rc as the query. The alignable portion of these sequences extended well beyond the bHLH domain (see Supplemental Figure 1 online), indicating that homology was not restricted to a single conserved functional domain. Analyzed using maximum parsimony or Bayesian analyses, these sequences fell into several clades (Figure 5).
The divergence between sequences of different clades is substantial, making outgroup selection and the position of the root uncertain. Among clades 1 and 2, clades 4 and 5, and within clade 3, further alignment was possible, strengthening our findings that these groups of sequences are more closely related to each other. Therefore, it is likely that the root lies on one of the branches separating the three main groupings (clades 1 and 2, clade 3, and clades 4 and 5) from one another.
Several copies of this type of transcription factor appear to have been present in the ancestor of the monocots and eudicots, as clades 1 and 2 contain both monocot and eudicot sequences (Figure 5). A third copy, present in the common ancestor of maize and rice, gave rise to clade 3, which shows gene duplication within each species. The paralogs within maize are known to confer tissue specificity of the anthocyanin pigmentation (Goff et al., 1992). Clades 4 and 5 contain only eudicot sequences (Figure 5). It is clear from this analysis that Rc is not closely related to the rice bHLH proteins regulating anthocyanin, because they fall in different clades (Hu et al., 1996; Sakamoto et al., 2001). These rice anthocyanin regulators are sister to the maize anthocyanin regulators, and they map to homologous locations on rice chromosome 4 and maize chromosomes 2 and 10, respectively. The phylogenetic analysis clusters Rc with In1 from maize. Rc and In1 are located in homologous chromosomal regions on chromosome 7 of both genomes. In1 is within maize bin 7.02, and of the 29 markers within 7.02 that map to the rice genome, 14 of them hit
Figure 4. Coding Sequence Differences between Rc Alleles.
(A) Graphic representation of coding sequence differences in LOC_Os07g11020.1 between several pairs of genotypes. The mRNA is represented by rectangles, and the beginning and end of the exons are indicated by vertical lines. Sequence changes are annotated as follows: closed circles, nonsynonymous substitution; lines, synonymous substitution; closed triangles, in-frame indel; open triangles, frame-shift indel; point-up triangles, deletion from O. rufipogon or H75; point-down triangle, insertion into O. rufipogon.
(B) Table showing polymorphic sites within the coding region of LOC_Os07g11020.1 for several different alleles of Rc. Functional nucleotide polymorphisms are highlighted in gray.
288 The Plant Cell