Department of Plant Breeding, Wageningen University, 6708 PB Wageningen, The Netherlands.
Seed is the basic and most critical input for seed-propagated agricultural crops: seed quality and seedling vigour determine plant establishment, growth and development in both natural and agricultural ecosystems. Seed quality and seedling vigour are mainly determined by the interaction of three components: genetic background, physiological quality, and the environmental conditions during seed set, seed ripening, storage, seed germination and early seedling development. In the past, many efforts have been made to improve seed germination and seedling vigour by optimizing physiological and environmental (non-genetic) factors; more recently, the focus has shifted to investigating genetic factors and using these to improve crop performance through plant breeding. The aim of this thesis is to unravel the genetics of seed germination and seedling vigour under different conditions in Brassica rapa, using a systems genetics approach. Studies in many crop species have reported that seed germination and seedling vigour traits are governed by many genes and are strongly affected by environmental conditions. As salinity stress is becoming one of the most important abiotic stresses affecting crop growth and yield, we studied the genetics of seed germination and seedling vigour under non-stress and salt stress conditions. For a number of crops, it has been established that larger seed size and higher seed weight indicate larger food reserves and contribute positively to seedling establishment. Therefore, our hypothesis for this thesis is that transcriptional regulation of genes during seed development determines the composition and content of seed reserves, and that these seed reserves play a major role in seed germination and seedling growth, especially at the heterotrophic stage, under optimal and sub-optimal conditions.
B. rapa is an extremely diverse Brassica species which includes, besides many diverse leafy vegetable types and turnips, also oilseed crops. Brassica seeds are of high economic importance for several reasons. They are the starting point of the life cycle of the crop, and they are also used directly as sources of vegetable oil or condiments. At present, B. napus is the most important source of vegetable oil worldwide, but B. rapa is often used in introgression breeding to broaden the narrow genetic base of B. napus, resulting in genetic improvements. Therefore, the acquired knowledge is also useful for the scientific community and plant breeders working on B. napus and other Brassica species.
In Chapter 2 we evaluated the genetic diversity of a B. rapa core collection of 168 accessions representing different crop types and geographic origins. Using the Bayesian cluster analysis software STRUCTURE, we identified four subpopulations: subpopulation 1 with accessions of Indian origin, spring oil, yellow sarson and rapid cycling; subpopulation 2 consisting of several types of Asian origin: pak choi, winter oil, mizuna, mibuna, komatsuna, turnip green, oil rape and Asian turnip; subpopulation 3, which included mainly accessions of Chinese cabbage; and subpopulation 4 with mostly vegetable turnip, fodder turnip and broccoletto accessions of European origin. The geographical distribution of the accessions was largely congruent with genetic, metabolic and morphological diversity. This initial study was followed by association studies for secondary metabolites from the tocopherol and carotenoid pathways, using the population structure of these four subpopulations as a correction term to control for spurious marker-trait associations. Additionally, we used a machine learning approach, Random Forest (RF) regression, to find marker-trait associations. We chose the RF approach because it can handle large numbers of variables (markers, metabolites, transcript abundance) in combination with relatively small sample sets of accessions, and to show its potential for application to the increasing amounts of data available through the different -omics technologies. In our analysis, the markers showing significant association with metabolites identified by the RF approach overlapped with markers obtained from association mapping. Those markers could potentially be used for marker-assisted selection (MAS) in breeding for these secondary metabolites in different morphotypes or subpopulations.
Knowledge of genetic distance as evaluated in this chapter allowed the choice of parents to create a segregating population for QTL analyses by maximizing genetic variation between the parents.
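The idea of using population structure as a correction term can be illustrated with a minimal sketch. It assumes a simplification that is not the model used in the thesis: each accession is assigned to exactly one subpopulation, the subpopulation mean is removed from both trait and marker, and the remaining within-subpopulation correlation is reported. The function name and the toy values are hypothetical.

```python
from statistics import mean


def structure_adjusted_association(trait, marker, subpop):
    """Correlate marker and trait after removing subpopulation means.

    A marker that only differs between subpopulations carries no
    within-subpopulation signal and therefore scores 0.0 here.
    """
    def residuals(values):
        # Group the values by subpopulation label and subtract each group mean.
        groups = {}
        for value, label in zip(values, subpop):
            groups.setdefault(label, []).append(value)
        group_means = {label: mean(vals) for label, vals in groups.items()}
        return [value - group_means[label] for value, label in zip(values, subpop)]

    t = residuals(trait)
    m = residuals(marker)
    sxy = sum(a * b for a, b in zip(m, t))
    sxx = sum(a * a for a in m)
    syy = sum(b * b for b in t)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation left after correction
    return sxy / (sxx * syy) ** 0.5


# Made-up toy data: four accessions in two subpopulations.
subpop = [1, 1, 2, 2]
confounded_marker = [0, 0, 1, 1]   # allele differs only between subpopulations
informative_marker = [0, 1, 0, 1]  # allele varies within each subpopulation
trait_a = [1.0, 1.2, 5.0, 5.2]
trait_b = [1.0, 2.0, 5.0, 6.0]
```

Without the correction, the confounded marker would appear strongly associated with the trait simply because both differ between subpopulations; this is the kind of spurious association the correction term is meant to control.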
In Chapter 4, a doubled haploid (DH) population from a cross of genetically diverse morphotypes of B. rapa, an oil-type yellow sarson (YS143) and a vegetable pak choi (PC175) (Chapter 2), was used to evaluate the genetic basis of seed germination and seedling vigour traits under both non-stress and salt stress conditions. The yellow sarson parent had larger seed size and higher thousand-seed weight than the pak choi parent, and displayed earlier onset of germination, higher uniformity of germination, faster germination and higher maximum germination, as well as greater root and shoot lengths and biomass under both non-stress and salt stress conditions. Positive correlations of thousand-seed weight with earliness, speed and uniformity of germination and with maximum germination percentage support the view that larger seeds germinate earlier, faster, more uniformly and to a higher maximum germination percentage than smaller seeds. Thus, we conclude that yellow sarson had higher seed quality and seedling vigour than pak choi. However, yellow sarson also contributed negative alleles to seed germination, as illustrated by its allele of the QTL at A05, which decreases the uniformity of seed germination. In addition, we observed that yellow sarson seedling growth was more affected by salt stress than that of pak choi. All traits were scored over the DH population, and this clearly showed transgressive variation for most traits. Eight QTL hotspots were identified for seed weight, seed germination, and root and shoot lengths. A QTL hotspot for seed germination on A02 co-located with BrFLC2, a homologue of the FLOWERING LOCUS C gene, and its cis-acting expression QTL (cis-eQTL). FLC2 (BrFLC2 in B. rapa) is an important repressor of flowering time in both A. thaliana and B. rapa, and FLC2 was recently reported to have a pleiotropic effect on seed germination in A. thaliana. A QTL hotspot on A05 with salt stress specific QTL co-located with the FATTY ACID DESATURASE 2 (BrFAD2) gene and its cis-eQTL.
Besides the role of FAD2 in fatty acid desaturation, the up-regulation of this gene was associated with enhanced seed germination and hypocotyl elongation under salinity in B. napus (BnFAD2) and A. thaliana (FAD2). We observed epistatic interactions between the QTL hotspots at the BrFLC2 and BrFAD2 loci, and between other QTL hotspots.
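To illustrate how a QTL is detected in a DH population, where every line is homozygous for one of the two parental alleles at each marker, the following minimal sketch computes a single-marker LOD score. It is a simplified stand-in for the mapping procedure used in the thesis, which is not specified here; the function name and the toy phenotypes are hypothetical.

```python
import math
from statistics import mean


def single_marker_lod(phenotypes, genotypes):
    """LOD score comparing a one-mean model with a per-genotype-mean model.

    LOD = (n / 2) * log10(RSS0 / RSS1), where RSS0 is the residual sum of
    squares around the grand mean and RSS1 around the genotype class means.
    """
    n = len(phenotypes)
    grand_mean = mean(phenotypes)
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Collect phenotypes per parental allele (e.g. "A" or "B" in a DH line).
    classes = {}
    for y, g in zip(phenotypes, genotypes):
        classes.setdefault(g, []).append(y)
    rss1 = sum((y - mean(ys)) ** 2 for ys in classes.values() for y in ys)

    if rss0 == 0:
        return 0.0       # no phenotypic variation at all
    if rss1 == 0:
        return math.inf  # marker explains all variation
    return (n / 2) * math.log10(rss0 / rss1)


# Made-up toy data: six DH lines, one trait, two markers.
trait = [10, 11, 12, 20, 21, 22]
linked_marker = ["A", "A", "A", "B", "B", "B"]
unlinked_marker = ["A", "B", "A", "B", "A", "B"]
```

Repeating this test at every marker along a linkage group gives a LOD profile; positions where the profiles of several traits peak together correspond to what is called a QTL hotspot above.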
Seed development is regulated by many dynamic metabolic processes controlled by complex networks of spatially and temporally expressed genes. Therefore, morphological characteristics and the transcriptional signatures of developing seeds from yellow- and brown/black-seeded genotypes were studied to determine the timing of key metabolic processes, to explore the major transcriptional differences and to identify the optimum stage for a genetical genomics study of B. rapa seed traits (Chapter 3). This is the first study of genome-wide profiling of transcript abundance during seed development in B. rapa. Most transcriptional changes occurred between 25 and 35 days after pollination (between the bent-cotyledon stage and the stage when the embryo fully fills the seed), which is later than in the related species B. napus. A weighted gene co-expression network analysis (WGCNA) identified 47 gene modules with different co-expression patterns, of which 17 showed a genotype effect, 4 a time effect during seed development, and 6 both genotype and time effects. Based on the number of genes in the gene modules, the predominant variation in gene expression was associated with developmental stage rather than with morphotype differences. We identified 17 putative cis-regulatory elements (motifs) for four co-regulated clusters of genes related to lipid metabolism. The identification of key physiological events, major expression patterns, and putative cis-regulatory elements provides useful information to construct gene regulatory networks in B. rapa developing seeds and provides a starting point for a genetical genomics study of fatty acid composition and additional seed traits in Chapter 5.
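The first step of a weighted co-expression analysis can be sketched as follows. The sketch covers only the soft-thresholded adjacency of an unsigned network, a_ij = |cor(x_i, x_j)|^beta, and the resulting connectivity per gene; the topological overlap and clustering steps that WGCNA uses to define modules are omitted. The value beta = 6 is an assumption for illustration, not a setting reported in the thesis, and the gene names and expression values are made up.

```python
def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = weight
            adj[h][g] = weight
    return adj


def connectivity(adj):
    """Sum of edge weights per gene; highly connected genes are hub candidates."""
    return {g: sum(neighbours.values()) for g, neighbours in adj.items()}


# Made-up expression profiles over four developmental time points.
expr = {
    "gene1": [1, 2, 3, 4],
    "gene2": [2, 4, 6, 8],    # perfectly co-expressed with gene1
    "gene3": [1, -1, 1, -1],  # weakly related to both
}
adj = soft_threshold_adjacency(expr)
```

Raising the correlation to a power keeps strong co-expression relationships close to 1 while pushing weak ones towards 0, which is why weakly correlated gene pairs contribute little to module detection.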
Since Brassica seeds are sources of vegetable oil, genetic studies of the gene regulatory mechanisms underlying lipid metabolism are of high importance, not only in relation to seed and seedling vigour, but also for Brassica oilseed breeding. In Chapter 5, an integrative approach combining QTL mapping for fatty acid composition and for transcript abundance (eQTL) of genes related to lipid metabolism with gene co-expression networks was used to unravel the genetic regulation of seed fatty acid composition in the DH population of B. rapa. In this study, a confounding effect of flowering time variation was observed on fatty acid QTLs (metabolite level) at linkage group A02, and of seed colour variation on eQTLs (transcript level) at linkage group A09. At A02, fatty acid QTLs from 2009 seeds co-located with the genetic position of a gene-targeted marker for BrFLC2, its cis-eQTL, and a major flowering time QTL. Flowering time variation is very pronounced in this DH population, and the BrFLC2 gene at A02 (16.7 cM) is the major regulator of flowering time, with a non-functional allele in the yellow sarson parent. When QTL analysis was performed on seeds from 2011, from DH lines that flowered synchronously due to staggered sowing, this fatty acid QTL hotspot disappeared. The 2011 seed lot was used for further analysis combining fatty acid QTLs with eQTLs in this study. On A09, a large trans-eQTL hotspot co-localized with a major seed colour QTL, in the region where the causal gene, the bHLH transcription factor BrTT8, was cloned. The role of this gene in seed colour development was functionally proven in B. rapa. As the yellow sarson and pak choi parents of this population have contrasting seed coat colour (Chapter 3), the DH lines segregated for seed colour. When seed colour variation was used as a covariate in our statistical model, we could exclude its confounding effect on eQTL mapping.
We compared the fatty acid QTL and eQTL results from the analyses before and after seed colour correction, and discuss the results from the analysis after correction. The distribution of major QTLs for fatty acids showed a relationship with the types of fatty acids: linkage group A03 contained major QTLs for monounsaturated fatty acids (MUFAs), A04 for saturated fatty acids (SFAs) and A05 for polyunsaturated fatty acids (PUFAs). Using a genetical genomics approach, eQTL hotspots were found at the major fatty acid QTLs on A03, A04 and A05, and on A09. Finally, an eQTL-guided gene co-expression network of lipid metabolism related genes showed major hubs at the genes BrPLA2-ALPHA, BrWD-40, a number of seed storage protein genes and a transcription factor BrMD-2, suggesting essential roles of these genes in lipid metabolism. Several genes, such as BrFAE1, BrTAG1, BrFAD2, BrFAD5 and BrFAD7, which were reported as important genes for seed fatty acid composition in other studies of related species, had relatively low degrees of connection in the networks. However, their cis-eQTLs co-localized with specific fatty acid QTLs, making them candidate genes for the observed variation. We hypothesize that these genes play a role in modifying fatty acid content or composition across genotypes, rather than playing essential roles in the pathway itself. These results suggest the need for a global study of lipid metabolism rather than a strict focus on the fatty acid biosynthesis pathway per se. This study gives a starting point for understanding the genetic regulation of lipid metabolism by identifying a number of key regulatory genes, identified as major hub genes, and candidate genes for fatty acid QTLs.
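The notion of hub genes and degree of connection used above can be made concrete with a small sketch. It assumes an unweighted network given as a list of gene pairs, which is a simplification of the eQTL-guided co-expression network in the thesis. The gene labels and edges are placeholders and do not represent results.

```python
from collections import Counter


def node_degrees(edges):
    """Count the number of distinct partners (degree) of every gene."""
    # Store each undirected edge once and ignore self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for edge in unique_edges:
        for gene in edge:
            degrees[gene] += 1
    return degrees


def top_hubs(edges, top=1):
    """Return the `top` genes with the highest degree."""
    return [gene for gene, _ in node_degrees(edges).most_common(top)]


# Placeholder network: one highly connected gene and three peripheral genes.
edges = [
    ("hub_gene", "gene_a"),
    ("hub_gene", "gene_b"),
    ("hub_gene", "gene_c"),
    ("gene_a", "gene_b"),
    ("gene_b", "gene_a"),  # duplicate of the previous edge, counted once
]
degrees = node_degrees(edges)
```

A gene with a low degree in such a network can still be a strong candidate for a specific QTL, for example when its cis-eQTL co-localizes with that QTL, which is the distinction drawn above between hub genes and candidate genes.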
In the final chapter (Chapter 6) we summarize and critically discuss the relationships among phenotypic traits, metabolites and expression variation, as well as the co-localization of QTLs from these different levels. In this thesis, we developed methodology to integrate transcriptomics and metabolomics data sets and to construct gene regulatory networks related to major fatty acids, and found a set of possible candidate genes involved in lipid metabolism. For future work, we recommend integrating the genome-wide transcriptome data set with all major seed metabolites and with phenotypic data on seed and seedling vigour to directly link all three components (transcriptome, metabolome and phenotypic traits), and ultimately expanding the knowledge on the genetic regulation of seed metabolites, seed quality and seedling vigour in B. rapa to other Brassica species.