Domestication Syndrome in Cassava (Manihot esculenta Crantz): Assessing Morphological Traits and Differentially Expressed Genes Associated with Genetic Diversity of Storage Root

Cassava (Manihot esculenta Crantz) provides a staple food source for millions of people in tropical and subtropical regions of the world. Brazil is the major center of diversification for species of the genus Manihot and a center of domestication for the cultivated species, which originated from the wild ancestor M. esculenta subsp. flabellifolia. Genetic breeding of cassava depends on landraces. Molecular phylogenetic technologies, used to study the genetic traits selected by humans in crops, can help characterize the proposed “domestication syndrome.” Phylogenetic trees use DNA sequence alignments to infer the history of genes. Studying the regulatory and structural complexity that dictates gene/protein function adds non-sequence information and yields a more complete understanding of functional evolution. A transcriptional profile contains critical information on when and where a gene is expressed. These regulatory properties could explain the functional diversity achieved within gene families across closely related species such as cassava and its ancestor. Microarray technologies measure the transcriptional response of genes to a given environmental or genetic factor. Integrating genomic and transcriptomic data provides a more detailed picture of molecular evolution. This chapter describes a comprehensive study using the wild relative and ancestor of cassava, the recognition of natural morphological trait changes during domestication, and gene expression in the cassava storage root.


Introduction
Although obtaining space for the intentional cultivation of edible plants often starts with the clearing of forests and modification of landscapes, it was the ability to domesticate plants that made agriculture possible in the first place. Domestication consists of a set of consecutive stages that begins with the original set of plant traits and proceeds through an increase in the frequency of desirable traits under selection (the domestication traits). In the genus Manihot, geographical occurrence and species relationships provided the first source of candidate species for domestication, which culminated in the emergence of cultivated plants adapted both to human needs and to a cultivated environment that may be directly associated with the type of crop (Table 1). The subset of traits that collectively form the morphological and physiological differences between a cultivated plant and its wild progenitors (the domestication syndrome) is specific to a plant adapted to human needs. For cassava (Manihot esculenta subsp. esculenta), we selected a set of observable traits that underwent intense human-driven selection relative to its ancestor (M. esculenta subsp. flabellifolia). These traits include responses to the early environmental change caused by the transition from forest (shade) to open-field cultivation. We aim to determine which regulatory genes differ as these traits changed: the shift in growth habit from vine type (ancestor) to shrub type (cultivated), the thickening of fibrous roots (ancestor) into storage roots (domesticated), and the reduction in flowering set as domestication progressed. The present chapter reviews current knowledge of cassava domestication and reports our current and forthcoming studies on the evolutionary suite [1,2] of genetic diversity in cassava landraces, using genomic, transcriptomic, and proteomic technologies applied to the storage root, the major domestication trait in cassava.
Based on gene expression analysis, we identified a set of exploratory regulatory gene networks associated with diversity in the major domestication trait (storage root formation) among cassava landraces subjected to minimal artificial human interference, relative to their ancestor, which originated in the Brazilian Amazon (a major center of cassava domestication). Finally, the pattern of exploratory regulatory gene networks linked to genotype diversity was used to predict the early steps in the domestication process of

Cassava wild relatives and the ancestral species
Mexico and Brazil are considered two relevant centers of diversification for Manihot species [3][4][5]. The geographical distribution of the genus Manihot in Brazil is presented in Figure 1. Current systematic working models for the botanical description and classification of species in the genus Manihot [6][7][8][9] contrast in two major areas with an earlier classical monograph [9]. The classical monograph recognizes a total of 98 species in the genus Manihot, adopts a section classification system that groups species by morphological closeness [9,10], and proposes a compiled species concept to explain the origin of the single cultivated species (M. esculenta subsp. esculenta). The current model avoids the section classification system, reduces the number of species by using synonymies, uses an evolutionary approach, and permits a single ancestor species (M. esculenta subsp. flabellifolia) to explain the origin of the cultivated species (M. esculenta subsp. esculenta) of the genus Manihot [5][6][7][8]. In this chapter, we first contrast these two approaches by using a phylogenetic analysis of the ribosomal RNA internal transcribed spacer (ITS) for 17 Manihot species from Brazil and Mexico to identify those most closely related to M. esculenta subsp. esculenta (cassava), which is recognized as the only cultivated species of the genus [3,4]. These analyses indicated that cassava probably did not originate in Mexico. It is therefore unlikely to be the result of a compiled species and instead possibly has a single Brazilian ancestor, M. esculenta subsp. flabellifolia [5][6][7]. A gene pool analysis [8] of the cultivated species (M. esculenta subsp. esculenta) and its wild relatives identified two gene pools: gene pool 2 (GP2), with 13 Manihot species, and gene pool 1 (GP1), with 4 Manihot species, including M. esculenta subsp. flabellifolia. The overall similarity of several morphological characteristics indicates M. esculenta subsp. flabellifolia as the closest relative of cassava [8].

Domestication as evolutionary processes
Phylogenetic techniques for determining the molecular evolution of a crop have relied predominantly on sequence information to model the evolutionary history that determines plant speciation and domestication. Phylogenetic trees are based on alignments of DNA or protein sequences, from which evolutionary distances between genes can be inferred. However, the transcriptional behavior of a gene is poorly represented by DNA sequence data alone. A gene's transcriptional profile may contain critical information, including when and where a gene is expressed and the conditions under which its expression is manifested. This chapter addresses how transcriptional profiles vary with changes in the environment (light intensity) and with the genetic diversity of landraces and commercially bred varieties.
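To make the idea of inferring evolutionary distances from an alignment concrete, the following minimal sketch computes the uncorrected p-distance (the proportion of compared sites that differ) for each pair of aligned sequences. The taxon names and sequence fragments are hypothetical toy data, not real ITS sequences, and the p-distance is chosen here only because it is the simplest distance measure; the chapter does not state which distance model or tree-building method was used.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ.

    Alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


# Hypothetical aligned fragments (assumption: illustrative only).
alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTAT",
    "taxon_C": "ACGAAC-TTT",
}

# Pairwise distances for every unordered pair of taxa.
names = sorted(alignment)
distances = {
    (x, y): p_distance(alignment[x], alignment[y])
    for i, x in enumerate(names)
    for y in names[i + 1:]
}

# The pair with the smallest distance is the most similar pair.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

A distance matrix of this kind is the usual input to distance-based tree construction; it captures sequence similarity only and says nothing about when or where the genes are expressed.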

Molecular evolution of a crop species
Factors involved in regulation of expressed genes or gene sets could be crucial in explaining the key functional differences between related genes whose function, during selection (natural and artificial), cannot be distinguished from DNA sequence alone [11][12][13][14][15][16][17][18][19]. Attempts to predict expression patterns of genes using sequence information [20] have typically been limited by the complexity and diversity of factors influencing genes. Thus, sequence-based prediction of a gene's regulation remains a premature goal. However, transcriptomic approaches, for example, using microarray chip or RNA-Seq technology, allow for a direct, quantitative measurement of global transcriptional responses to a given environmental or genetic factor and are useful experimental sources for obtaining large-scale gene expression data [21,22]. Genomic data sets spanning a wide selection of the cassava ancestor (M. esculenta subsp. flabellifolia), landraces from the Amazon, and breeding cultivars (cv.) are publicly available [23][24][25][26][27][28][29][30][31][32][33], providing a ready source of data for studying several aspects of gene transcription behavior. The integration of genome and transcriptome data [34][35][36][37][38][39][40][41][42][43][44] provides an increasingly detailed picture of molecular evolution by incorporating regulatory behavior into models of the evolution of gene expression and function. Here, we report steps toward understanding changes in gene expression to model gene evolutionary function using our cassava domestication syndrome hypothesis. Specifically, the cassava domestication syndrome hypothesis considers changes in traits such as plant growth habit, storage root formation and flowering from the cassava ancestor (M. esculenta subsp. flabellifolia) to becoming the cultivated species (M. esculenta subsp. esculenta). Figure 2 illustrates these variables as observed from field trips to the Amazon and recorded images.

Differentially expressed genes
A cDNA microarray chip designed for Euphorbiaceae [24] was probed with total RNA extracted from storage roots (31 samples in total) of cassava with diverse storage root traits. This chapter documents a total of 569 genes identified as differentially expressed (p < 0.005) between storage roots of M. esculenta subsp. flabellifolia (ancestor) and various M. esculenta subsp. esculenta landraces and cultivars. Hybridization intensity values were statistically analyzed to further identify groups among the differentially expressed genes (DEG). The complexity of the experimental design and of the analyses for screening groups of DEG was handled using two statistical strategies [34,35]. First, principal component analysis (PCA) was performed to examine how the DEG clustered; it identified four groups among the DEG. Considering the cassava domestication syndrome hypothesis described above, the question to address with the available data is whether these results arise (i) from differences in gene expression per se or (ii) in part from the genetic backgrounds selected in the experimental design. Therefore, the second approach used recursive partitioning to obtain tentative conclusions about the grouping patterns [34,35], as shown in Figure 3.
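As a rough illustration of screening genes at the 0.005 threshold, the sketch below applies an exact two-sided permutation test on the difference in group means to toy log2 intensity values. The chapter does not state which statistical test produced its p-values, so the permutation test is an assumption made for illustration; the gene names, sample counts (6 per group), and intensity values are likewise hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n_a of the pooled samples as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical log2 hybridization intensities (assumption: toy data).
expression = {
    "gene_1": {
        "ancestor": [8.1, 8.3, 7.9, 8.0, 8.2, 8.4],
        "cultivated": [10.1, 10.4, 9.8, 10.0, 10.2, 10.3],
    },
    "gene_2": {
        "ancestor": [9.0, 9.4, 8.8, 9.1, 9.3, 8.9],
        "cultivated": [9.2, 9.0, 9.5, 8.7, 9.1, 9.3],
    },
}

ALPHA = 0.005  # threshold quoted in the text

p_values = {
    gene: permutation_p_value(groups["ancestor"], groups["cultivated"])
    for gene, groups in expression.items()
}
deg = sorted(gene for gene, p in p_values.items() if p < ALPHA)
print(p_values, deg)
```

With 6 samples per group there are 924 possible relabellings, so the smallest attainable two-sided p-value is 2/924, which is just below the threshold. With larger or unbalanced designs, exhaustive enumeration becomes impractical and a random sample of relabellings or a parametric test would normally be used instead.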

Ontology and functional classification of differentially expressed genes (DEG)
Analysis of the DEG identified 22 distinct groups among gene ontologies and functional classifications. The groups highlighted in yellow in Figure 4 (i.e., "Protein with Binding Function or Cofactor Requirement," "Regulation of Metabolism and Protein Function," and "Cellular Transport, Transport Facilities and Transport Routes") were targeted to elucidate candidate genes involved in the regulation of these key pathway networks.

Exploratory pathway networks and candidate regulatory genes
The program Pathway Studio [43] was used to conduct Sub-Network Enrichment Analysis (SNEA) to identify potential regulatory networks from the transcriptome data obtained in this study and from available databases, as previously described [36][37][38][39][40][41][42]. The statistical results (Tables 2 and 3) and the visualization of gene networks (Figures 5 and 6) took into consideration three types of molecular interaction mechanisms (expression target, protein binding, and protein modification).
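The general idea of asking whether a hub's sub-network is enriched for differentially expressed genes can be sketched with a hypergeometric over-representation test. This is not the statistic that Pathway Studio computes for SNEA; it is a simpler stand-in used here only to illustrate the enrichment logic. The hub names, gene identifiers, and gene sets are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_deg, n_targets, n_overlap):
    """P(X >= n_overlap) when n_targets genes are drawn without replacement
    from a universe of n_universe genes that contains n_deg DEG."""
    upper = min(n_deg, n_targets)
    tail = sum(
        comb(n_deg, k) * comb(n_universe - n_deg, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_targets)


# Hypothetical toy data (assumption: not taken from the chapter).
universe = {f"g{i}" for i in range(1, 21)}   # 20 genes on the array
deg = {"g1", "g2", "g3", "g4", "g5"}         # differentially expressed genes
hub_targets = {
    "HUB_X": {"g1", "g2", "g3", "g6"},       # 3 of 4 targets are DEG
    "HUB_Y": {"g5", "g10", "g11", "g12"},    # 1 of 4 targets is DEG
}

enrichment = {
    hub: hypergeom_enrichment_p(
        len(universe), len(deg), len(targets), len(targets & deg)
    )
    for hub, targets in hub_targets.items()
}

# Hubs ordered from most to least enriched for DEG among their targets.
ranked_hubs = sorted(enrichment, key=enrichment.get)
print(ranked_hubs, enrichment)
```

In a real analysis the target sets would come from a curated interaction database, each interaction type (expression target, protein binding, protein modification) could be tested separately, and the resulting p-values would need correction for multiple testing.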
These results identify the node/hub gene operating each network, the edge genes that are regulated (activated or silenced), and their expression level, shown as increased abundance (blue) or decreased abundance (pink) among the genes visualized in the pathways. Regulatory genes such as transcription factors, and other gene products that modulate functionality (protein binding and modification), were observed. The node/hub gene regulates the network, while the genes on the edges are the regulated members of that particular network. Table 4 summarizes the list of nodes/hubs in the networks unique to each class of landraces based on comparisons to the cassava ancestor and the cv. IAC 12-829. Gene groups were identified by functional gene category according to the Sub-Network Enrichment Analysis (SNEA) for biological processes, cellular components, and molecular functions, based on annotation of cassava genes against the Arabidopsis database. The gene sets were grouped by function and selected for retrieving and visualizing the regulatory networks or pathways they form. Together, these results add new knowledge about the potential functionality of gene products previously unknown in cassava storage root and their potential roles in the domestication trait, as well as in the flowering set trait. As an example of these analyses, we propose a hypothetical hormonal gene regulatory model (Figure 7) to represent the effect of the environmental light changes likely caused by the removal of ancestral cassava from the forest, as illustrated in Figure 2. However, it is important to clarify that Gene Set Enrichment Analysis (GSEA) and Sub-Network Enrichment Analysis (SNEA) are based on annotation of cassava genes to Arabidopsis. Thus, for our proposed models, we assume that the cassava gene products perform functions similar to those of their Arabidopsis homologues (all Arabidopsis annotations used in this chapter, and their known functions, can be obtained from [44]).
Table 4. Abbreviations (in parentheses) indicate the molecular interaction mechanism of the regulatory gene function: expression target (ET), protein interaction (PI), and protein binding (PB).
The removal of ancestral cassava from a shaded forest environment would be expected to alter regulatory networks and pathways involved in light perception and signaling, as highlighted in Figure 5 (Panel C) and Figure 6 (Panel A) (the Brazilian collection). Further, altering light quantity and quality, or selection for storage root traits during the domestication process, also appears to have differentially impacted gibberellic acid (GA) signaling regulation of DELLAs, as indicated by Figure 5 (Panel C) (GAI) and Figure 6 (Panel B) (SLY1, RGA1, RGL2, GAI). These alterations are likely to have impacted known regulatory networks involving interactions between DELLAs and PIF3/PIF4 [39,40]. Because light perception (lack of shade) would be expected to reduce the positive impact that GA signaling has on inhibiting DELLA, the function of DELLAs in reducing GA-induced elongation likely resulted in dwarfed and bushy phenotypes [41,42]. In aboveground photosynthesizing tissues, shifts in the expression of genes linked to the GA/DELLA regulatory pathway would also be expected to result in shifts between skotomorphogenesis and photomorphogenesis [39] and, potentially, reduced flowering, as illustrated in Figures 2 and 7. However, in the underground storage root of cassava, altered regulation of DELLAs may have played some role in the observed shift from a fibrous root type to a storage root type, as also illustrated in Figures 2 and 7. DELLAs have been reported to impact auxin signaling pathways, and DELLA's impact on auxin via jasmonic acid (JA) regulation involving JAZ1 and MYC2, as reviewed by [42], could possibly be involved in this process (Figure 7).
Further, DELLA and SCARECROW (SCR; see Figure 5, Panel A, World Collection) have known interactions that could also be involved in the domestication trait of cassava storage root formation by impacting GA/auxin/cytokinin cross-talk and signaling, as well as root meristematic development and differentiation [44].
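The light, GA, and DELLA reasoning above can be traced with a toy signed-path calculation, in which each edge either promotes (+1) or inhibits (-1) its target and the net effect along a path is the product of the edge signs. The three edges below are a simplified paraphrase of the text (light weakens GA signaling, GA signaling inhibits DELLA, DELLA restrains elongation); they are an assumption made for illustration and do not reproduce the model in Figure 7.

```python
# Signed edges: +1 = promotes, -1 = inhibits (simplified assumption).
edges = {
    ("light", "GA_signaling"): -1,
    ("GA_signaling", "DELLA"): -1,
    ("DELLA", "elongation"): -1,
}


def net_effect(path, signed_edges):
    """Product of edge signs along a path of node names."""
    sign = 1
    for src, dst in zip(path, path[1:]):
        sign *= signed_edges[(src, dst)]
    return sign


# GA signaling promotes elongation by relieving DELLA repression.
ga_on_elongation = net_effect(["GA_signaling", "DELLA", "elongation"], edges)

# Light exposure, acting through the same chain, reduces elongation.
light_on_elongation = net_effect(
    ["light", "GA_signaling", "DELLA", "elongation"], edges
)
print(ga_on_elongation, light_on_elongation)
```

Under these assumed signs, the net negative effect of light on elongation matches the dwarfed, bushy phenotype described in the text; a sign product ignores interaction strength, timing, and tissue specificity, so it is only a qualitative check.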

Gene abbreviation | Gene name | ATG number | Function (from www.arabidopsis.org)
14-3-3 | 14-3-3 family protein | | 14-3-3 proteins are a family of conserved regulatory molecules that are expressed in all eukaryotic cells. 14-3-3 proteins have the ability to bind a multitude of functionally diverse signaling proteins, including kinases, phosphatases, and transmembrane receptors. More than 200 signaling proteins have been reported as 14-3-3 ligands

(with node/hubs SLY1, SCF, TIR1, and ubiquitin) also appear to function through proteasome degradation pathways. Collectively, these examples suggest that the domestication syndrome may have evolved through a combination of changes in environmental signaling factors and selection pressures [39,40] that impacted phytohormone cross-talk, signaling, and protein regulation pathways during the removal of the cassava ancestor from the forest, which, in turn, led to the modern-day domesticated landraces and cultivars of cassava.

Synthesis and conclusions
This study highlighted some key factors influencing the fate of gene function in relation to cassava domestication syndrome traits, including (1) landscape alterations resulting in sunlight exposure (alteration of light quality and quantity) in the early stages of the domestication of the cassava crop due to removal of forest; (2) (4) artificial selective pressure (humans unintentionally selecting plant traits, such as a low flowering set relative to the ancestor). Additionally, the data presented in this study allowed us, for the first time, to propose a hormonal regulatory model (Figure 7) for the involvement of GA/DELLA and auxin/jasmonate in cassava domestication traits. It appears that the domestication of cassava originated with the removal of ancestral M. esculenta subsp. flabellifolia from the Amazon forest, which, through human trait preference and selection, evolved into current-day cultivated landraces and cultivars of M. esculenta. As a result, ancestral cassava evolved from a vine-type, prolifically flowering, fibrous-rooted phenotype into a domesticated bushy, reduced-flowering phenotype with tuberous storage roots (Figures 2 and 7). Selection for specific storage root traits also resulted in domesticated cassava storage roots with color diversity [46], diverse storage root phenotypes, and diversity in carbon sequestration, with starchy vs. sugary storage root phenotypes [47][48][49].
As our understanding of the evolution of gene expression improves, it should become possible to incorporate the inference of protein function into approaches based on proteomics technologies [50]. Ancestral protein functions can be estimated using this approach, and efforts to annotate current genes/proteins will benefit from knowledge of the behavior of gene expression profiles and the factors influencing them. Ultimately, gene expression profiles should be integrated on an equal footing with structure and sequence to predict and assist in annotating protein function and evolution directly on the genome sequences of the ancestor and the cultivated species.
Technological advances have aided our ability to rapidly and affordably obtain and compare transcriptomes within and across plant species. In this chapter, we compared the global transcriptomes from storage roots of various landraces and cultivars of domesticated cassava and of ancestral M. esculenta subsp. flabellifolia to identify differentially abundant transcripts, in this case at a very stringent level (p < 0.005). Modern technology also affords advances in bioinformatics approaches for analyzing these large transcriptomic data sets using ever-evolving algorithms. We used GSEA and SNEA to identify nodes/hubs and regulatory genes/proteins that provide a snapshot of potential regulatory networks/pathways that differ between ancestral and domesticated cassava storage roots and are likely key pathways involved in the domestication syndrome. Based on our results, it appears that as ancestral cassava evolved into domesticated cassava, several important regulatory pathways involved in light signaling and regulation, floral signaling and regulation, and hormone signaling and regulation (particularly GA and auxin) were altered. Many of these processes also appear to involve complexes that regulate protein degradation through ubiquitination and proteasomal trafficking.