Title:
Genome of the Rio Pearlfish (Nematolebias whitei), a bi-annual killifish model for Eco-Evo-Devo in extreme environments

Authors:
Andrew W. Thompson1,2*, Harrison Wojtas1, Myles Davoll1,3, and Ingo Braasch1,2*

Affiliations:
1 Department of Integrative Biology, Michigan State University, East Lansing, MI, 48824, USA.
2 Ecology, Evolution & Behavior (EEB) Program, Michigan State University, East Lansing, MI, 48824, USA.
3 Department of Biology, University of Virginia, Charlottesville, VA, 22903, USA.

*Authors for Correspondence:
Andrew W. Thompson, Department of Integrative Biology, Michigan State University, East Lansing, MI, 48824, USA. thom1524@msu.edu
Ingo Braasch, Department of Integrative Biology, Michigan State University, East Lansing, MI, 48824, USA. braasch@msu.edu

Keywords:
Nematolebias whitei, Rio Pearlfish, diapause, aging, hatching, extreme environments, Eco-Evo-Devo, teleost

Running Title: Genome of the Rio Pearlfish

© The Author(s) (2022). Published by Oxford University Press on behalf of the Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022

Abstract:
The Rio Pearlfish, Nematolebias whitei, is a bi-annual killifish species inhabiting seasonal pools in the Rio de Janeiro region of Brazil that dry twice per year. Embryos enter dormant diapause stages in the soil, waiting for the inundation of the habitat which triggers hatching and commencement of a new life cycle. Rio Pearlfish represents a convergent, independent origin of annualism from other emerging killifish model species.
While some transcriptomic datasets are available for Rio Pearlfish, thus far, a sequenced genome has been unavailable. Here we present a high-quality, 1.2 Gb chromosome-level genome assembly, genome annotations, and a comparative genomic investigation of the Rio Pearlfish as representative of a vertebrate clade that evolved environmentally-cued hatching. We show conservation of 3-D genome structure across teleost fish evolution, developmental stages, tissues and cell types. Our analysis of mobile DNA shows that Rio Pearlfish, like other annual killifishes, possesses an expanded transposable element profile with implications for rapid aging and adaptation to harsh conditions. We use the Rio Pearlfish genome to identify its hatching enzyme gene repertoire and the location of the hatching gland, a key first step in understanding the developmental genetic control of hatching. The Rio Pearlfish genome expands the comparative genomic toolkit available to study convergent origins of seasonal life histories, diapause, and rapid aging phenotypes. We present the first set of genomic resources for this emerging model organism, critical for future functional genetic and multi-omic explorations of “Eco-Evo-Devo” phenotypes of resilience and adaptation to extreme environments.

Introduction:
Aplocheiloid killifishes inhabit tropical freshwater habitats around the world. Some African and Neotropical species live in ephemeral waters that are subject to seasonal desiccation (Myers 1952; Simpson 1979). Desiccation kills the adults, but embryos survive inside specialized eggs (Thompson et al. 2017a) buried in the soil via three diapause stages (DI, DII, DIII; Wourms 1972a, 1972b, 1972c).
DI occurs as a migratory dispersion of blastomeres, DII occurs during somitogenesis when organs are rudimentary, and DIII occurs after organogenesis when the embryo is fully formed and poised to hatch upon habitat inundation. This seasonal life history is a remarkable example of convergent evolution with seven gains across killifish evolution (Thompson et al. 2021). Additionally, annual killifishes show rapid aging due to relaxed selection on lifespan (Cui et al. 2019) and are an important emerging model system for the study of senescence (Valenzano et al. 2011, 2015; Reichwald et al. 2015; Harel et al. 2015).

The Rio Pearlfish, Nematolebias whitei, is a seasonal killifish endemic to the coastal plains of the Rio de Janeiro region in Brazil, inhabiting pools that dry twice annually, from July-August and February-March (Figure 1A; Myers 1942; Costa 2002). Pearlfish represents a separate origin of seasonality from other killifish model species, i.e., Nothobranchius furzeri and Austrofundulus limnaeus (Thompson et al. 2021). In N. whitei, DI and DII are facultative, and DIII is a “prolonged”, “deep” stasis compared to hatching delay and DIII in other killifishes, occurring just before environmentally-cued hatching upon submersion in water (Wourms 1972c; Thompson & Ortí 2016; Thompson et al. 2017). Pearlfish was suggested as a top candidate species for killifish models in the seminal work of developmental biologist John P. Wourms in 1967. The fish are small, prolific, and hardy, and spawn in sand (Wourms 1967), making them easily reared laboratory animals that are furthermore amenable to genetic manipulation like other killifishes (Aluru et al. 2015; Harel et al. 2015). Pearlfish has also been an emergent system to study aging (Ruijter 1987), environmental influences on development (Ruijter et al.
1984), the role of prolactin in hatching control (Schoots et al. 1983; Ruijter and Creuwels 1988), resilience to perturbations in development with the ability to develop normally from diblastomeric eggs (Carter and Wourms 1993), and the transcriptional control of diapause and hatching (Thompson and Ortí 2016).

Here, we construct a chromosome-level genome assembly for the Rio Pearlfish, utilizing Hi-C contact maps, genome annotations, and gene expression analyses to characterize genomic evolution and hatching biology in this extremophile vertebrate.

Materials & Methods:

Genome sequencing and assembly

All animal work was approved by the Michigan State University Institutional Animal Care and Use Committee (PROTO202000108).

A total of 1.25 ng of template genomic DNA extracted from the liver of a single adult female N. whitei was loaded on a Chromium Genome Chip. Whole genome sequencing libraries were prepared using 10X Genomics Chromium Genome Library & Gel Bead Kit v.2, Chromium Genome Chip Kit v.2, Chromium i7 Multiplex Kit, and Chromium controller according to manufacturer’s instructions with one modification. Briefly, gDNA was combined with Master Mix, a library of Genome Gel Beads, and partitioning oil to create Gel Bead-in-Emulsions (GEMs) on a Chromium Genome Chip. The GEMs were isothermally amplified. Prior to Illumina library construction, the GEM amplification product was sheared on a Covaris E220 Focused Ultrasonicator to ~350 bp, then converted to a sequencing library following the 10X standard operating procedure. A total of 679.43 M read pairs were sequenced on an Illumina HiSeqX sequencer, and a de novo assembly was constructed with Supernova 2.1.1 (Weisenfeld et al. 2018).
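
As a rough sanity check on sequencing depth, raw fold-coverage can be estimated as total sequenced bases divided by genome size. The sketch below is only illustrative: the function name is hypothetical, the 150 bp read length is an assumption (the text states 2x150 bp only for the Chicago and Hi-C libraries), and the 1.2 Gb figure is the reported assembly size. Barcode and adapter trimming would lower the effective value.

```python
def raw_coverage(read_pairs: float, read_length_bp: int, genome_size_bp: float) -> float:
    """Return raw fold-coverage: total sequenced bases divided by genome size."""
    total_bases = read_pairs * 2 * read_length_bp  # two reads per pair
    return total_bases / genome_size_bp


# 679.43 M read pairs comes from the text; the 150 bp read length is an ASSUMPTION.
coverage_10x = raw_coverage(679.43e6, 150, 1.2e9)
print(f"Approximate raw 10x coverage: {coverage_10x:.0f}x")
```

The same function can be applied to the Chicago and Hi-C libraries, although for proximity-ligation data the physical coverage spanned by read pairs is usually the more informative quantity.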
A Chicago library was prepared as described previously (Putnam et al. 2016). Briefly, ~500 ng of HMW gDNA was reconstituted into chromatin in vitro and fixed with formaldehyde. Fixed chromatin was digested with DpnII, the 5’ overhangs filled in with biotinylated nucleotides, and then free blunt ends were ligated. After ligation, crosslinks were reversed, and the DNA was purified from protein. Purified DNA was treated to remove biotin that was not internal to ligated fragments. The DNA was then sheared to ~350 bp mean fragment size and sequencing libraries were generated using NEBNext Ultra enzymes and Illumina-compatible adapters. Biotin-containing fragments were isolated using streptavidin beads before PCR enrichment of each library. The libraries were sequenced on an Illumina HiSeqX to produce 242 million 2x150 bp paired-end reads.

A Dovetail Hi-C library was prepared in a similar manner as described previously (Lieberman-Aiden et al. 2009). For each library, chromatin was fixed in place with formaldehyde in the nucleus and then extracted. Fixed chromatin was digested with DpnII, the 5’ overhangs filled in with biotinylated nucleotides, and then free blunt ends were ligated. After ligation, crosslinks were reversed, and the DNA was purified from protein. Purified DNA was treated to remove biotin that was not internal to ligated fragments. The DNA was then sheared to ~350 bp mean fragment size and sequencing libraries were generated using NEBNext Ultra enzymes and Illumina-compatible adapters. Biotin-containing fragments were isolated using streptavidin beads before PCR enrichment of each library. The libraries were sequenced on an Illumina HiSeqX to produce 179 million 2x150 bp paired-end reads.
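
Both proximity-ligation libraries rely on DpnII, which recognizes GATC and cuts immediately 5' of the G. A minimal in silico digestion can help build intuition for the fragments that are later ligated; the sketch below is a toy illustration (the function name and example sequence are hypothetical) and is not part of the Dovetail or HiRise workflow.

```python
from typing import List


def dpnii_fragments(sequence: str) -> List[str]:
    """Split a DNA sequence at every DpnII site (cut immediately 5' of GATC)."""
    seq = sequence.upper()
    site = "GATC"
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        if pos > start:
            # Everything from the previous cut up to this site is one fragment.
            fragments.append(seq[start:pos])
        start = pos
        pos = seq.find(site, pos + 1)
    # The remainder after the last cut site is the final fragment.
    fragments.append(seq[start:])
    return fragments


example = dpnii_fragments("AAGATCTTGATCCC")
print(example)
```

Each fragment after the first begins with GATC, reflecting the 5' overhang that is filled in with biotinylated nucleotides before ligation.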
The Supernova de novo assembly built from 10x Chromium data, Chicago library reads, and Dovetail Hi-C library reads were used as input data for assembly scaffolding with HiRise v1 (Putnam et al. 2016). An iterative analysis was conducted. First, Chicago library sequences were aligned to the draft input assembly using a modified SNAP read mapper (http://snap.cs.berkeley.edu). The separations of Chicago read pairs mapped within draft scaffolds were analyzed by HiRise v1 to produce a likelihood model for genomic distance between read pairs, and the model was used to identify and break putative misjoins, to score prospective joins, and to make joins above a threshold. After aligning and scaffolding Chicago data, Dovetail Hi-C library sequences were aligned, and scaffolds were generated following the same approach.

Genome annotation

The Rio Pearlfish genome was annotated with the NCBI Eukaryotic genome annotation pipeline v9.0 (Thibaud-Nissen et al. 2016) and with MAKER 2.31 (Cantarel et al. 2008; Campbell et al. 2014; Bowman et al. 2017) using protein evidence from 15 fish species (Supplementary Table S1) and transcriptome evidence from Rio Pearlfish DIII embryos and hatched larvae (Thompson and Ortí 2016). Genome assembly and annotation completeness (Supplementary Table S2) were analyzed with BUSCO v5 (Simão et al. 2015) and CEGMA 2.4 (Parra et al. 2007) via the gVolante server (Nishimura et al. 2017, https://gvolante.riken.jp).
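
Completeness scores are usually read alongside contiguity statistics such as the N50 and L50 values reported for this assembly in the Results. As a reminder of how these are defined, the following sketch computes both from a list of scaffold lengths; the function name and the toy lengths are hypothetical and do not come from the Rio Pearlfish assembly.

```python
from typing import List, Tuple


def n50_l50(lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken from
    longest to shortest, first reaches half of the assembly size. L50 is the
    number of scaffolds needed to reach that point.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise RuntimeError("unreachable: the running total always reaches half")


# Toy example with made-up scaffold lengths.
toy_n50, toy_l50 = n50_l50([10, 8, 6, 4, 2])
print(toy_n50, toy_l50)
```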
Phylogenetics and orthology

To confirm species identification, we extracted and concatenated the barcoding marker genes cox1 and cytb from our genome assembly, aligned them with orthologous sequences from all three described Nematolebias species (Costa et al. 2014), and inferred a phylogeny partitioned by codon and gene with RAxML (Stamatakis 2006, 2014) with the following parameters: -T 4 -N autoMRE -m GTRCAT -c 25 -p 12345 -f a -x 12345 --asc-corr lewis. We used Orthofinder v2.4.1 (Emms and Kelly 2015) to identify orthologous protein sequences between N. whitei and 35 other vertebrate genomes (Supplementary Table S3) as well as protein sequences obtained from Cui et al. (2019), Hara et al. (2018), and the longest isoforms of other species available on NCBI RefSeq (last accessed September 22, 2021) downloaded with orthologr’s retrieve longest isoforms function (Drost et al. 2015). The output of Orthofinder (Supplementary Table S4) was examined to identify Pearlfish-specific orthogroups. Genes in these orthogroups were used as queries in BLAST searches (e-value cutoff of 1e-3) against Japanese medaka (HdrR strain, assembly ASM223467v1) protein sequences downloaded from Ensembl (last accessed January 17, 2022, Supplementary Table S5). We performed a statistical overrepresentation test with a Fisher’s exact test and a false discovery rate correction on the Gene Ontologies (GOs) of these medaka genes using Panther v.16.0 (Mi et al. 2021) with the GO biological processes complete database (Supplementary Table S6).

Synteny and genome 3-D structure

We examined conservation of synteny using genome assemblies and NCBI annotations for Rio Pearlfish, medaka (oryLat2, UCSC), and zebrafish (GCF_000002035.5_GRCz10, NCBI) as input for SynMap in the online CoGe database and toolkit (Lyons and Freeling 2008; last accessed October 14, 2021).
BWA v0.7.17 (Li and Durbin 2009) was used to independently map Rio Pearlfish Hi-C read pairs to the genome assembly with the following parameters: bwa mem -A1 -B4 -E50 -L0, and HiCExplorer 3.6 was used to construct a Hi-C matrix with the resulting bam files as follows: hicBuildMatrix --binSize 10000 --restrictionSequence GATC --danglingSequence GATC. The matrix was corrected via hicCorrectMatrix correct --filterThreshold -1.5 5. The matrix was binned depending on preferred resolution for viewing. Contact maps were visualized with hicPlotMatrix --log1p, and compared to contact maps of syntenic regions in medaka and zebrafish (Nakamura et al. 2021).

Repeat content and repeat landscape

We constructed a species-specific repeat database with RepeatModeler 2.0.1 (Smit and Hubley 2008). This library as well as vertebrate Repbase annotations (Jurka 2000) (downloaded 15 November 2017), and repeat libraries from platyfish (Schartl et al. 2013), coelacanth (Amemiya et al. 2013), bowfin (Thompson et al. 2021), and spotted gar (Braasch et al. 2016) were combined to annotate repeat elements with RepeatMasker v4.0.5 (Smit et al. 2013). The scripts calcDivergenceFromAlign.pl and createRepeatLandscape.pl in the RepeatMasker package were used to generate a repeat landscape. We graphically compared the repeat landscape of Rio Pearlfish to those described for other sequenced killifish species (Reichwald et al. 2015; Valenzano et al. 2015; Rhee et al. 2017; Cui et al. 2019) to identify similarities and differences in the magnitude and location of peaks at different Kimura distances in the histograms.
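
The x-axis of a repeat landscape is the Kimura two-parameter distance between each repeat copy and its consensus, K = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The sketch below illustrates that formula on a pair of pre-aligned sequences; it is not the RepeatMasker implementation (which, among other things, adjusts for CpG sites), and the function name and example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# One transition (A->G) in ten aligned positions.
distance = kimura_2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
print(round(distance, 4))
```

Young repeat copies sit near zero on this scale, so a peak at low Kimura distance in the landscape indicates a recent burst of transposition.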
Hatching enzyme gene expression

Aquatic vertebrates hatch by secreting choriolytic enzymes from hatching gland cells that break down the egg chorion (Yamagami 1988; Hong and Saint-Jeannet 2014). Teleost fishes underwent hatching enzyme gene duplications followed by functional divergence into the high choriolytic enzyme (hce) and low choriolytic enzyme (lce) genes (Yasumasu et al. 1992; Kawaguchi et al. 2006, 2010; Sano et al. 2014). We used BLAST to search the well-studied medaka hatching enzyme paralogs (lce and hce) against the annotated Pearlfish genome. We used the Pearlfish gene sequences from these BLAST hits as well as metalloprotease gene sequences from medaka, Austrofundulus, Kryptolebias, and Nothobranchius (accession numbers in Supplementary Table S7) to infer gene trees. The identified Pearlfish lce and hce genes are orthologous to those of other teleosts (data not shown). Pearlfish hatching enzyme orthologs were examined for transcript evidence from DIII embryos (Thompson & Ortí 2016) to identify active lce and hce gene expression in Pearlfish DIII.

We generated an antisense RNA probe for the Pearlfish lce.2 gene and performed whole-mount RNA in situ hybridization to identify hatching enzyme gene expression patterns as markers for the location of hatching gland cells in Pearlfish. Total RNA was extracted from DIII Pearlfish embryos with a Qiagen RNeasy mini plus kit and reverse transcribed with a SuperScript IV VILO kit (ThermoFisher) according to manufacturers’ instructions.
The lce cDNA was amplified from the reverse transcribed template via PCR (Primers: Nwh_lce.1_1F: 5’-ATGGACCATAAAGCAAAAGTTTCTCTC-3’; Nwh_lce.1_792R: 5’-CTATTGCTTGTATTTTGAACACTTGT-3’; Nwh_lce.2_1F: 5’-ATGGACCATAAAGCAAAAGTTACTCTT-3’; Nwh_lce.2_825R: 5’-CTATTGCTTGTATTTTGAACAGTTGT-3’) and lce.2 was inserted into a TOPO TA cloning kit vector (Invitrogen) according to manufacturer’s instructions. Whole-mount lce.2 mRNA in situ hybridization on manually dechorionated DIII Rio Pearlfish embryos was performed following Deyts et al. (2005) with a 25 µg/mL proteinase K digestion treatment for 45 min (n=3 embryos), 60 min (n=3 embryos), and 90 min (n=2 embryos).

Results and Discussion:

Genome sequencing and assembly

We report a high-quality, 1.2 Gb chromosome-level genome assembly of N. whitei. The Rio Pearlfish genome assembly consists of 24 chromosomal pseudomolecules represented by 24 superscaffolds and matches the described karyotype (n=24; Von Post, 1965). The assembly has an N50 over 49.98 Mb and an L50 of 11 scaffolds (Table 1). BUSCO and CEGMA scores for different core gene databases indicate a high-quality assembly with an average of 94% complete BUSCOs and CEGs across all relevant databases (Table 1, Supplementary Table S2).

Genome annotation

The NCBI Nematolebias whitei Annotation Release 100.20210725 contains 23,038 genes, with 21,341 protein coding genes, similar to other chromosome-level killifish genome assemblies from Nothobranchius furzeri and Kryptolebias marmoratus (Reichwald et al. 2015; Valenzano et al. 2015; Kelley et al. 2016; Rhee et al. 2017) (Supplementary Table S8).
Minor differences in gene numbers among killifish species are likely due to annotation methods and minor species-specific gene losses or expansions. The number and content of annotated genes can be influenced by evidence used for annotation, differences in gene model prediction likelihoods, and post-annotation filtering (Holt and Yandell 2011; Campbell et al. 2014). MAKER annotated 26,016 protein coding genes, on par with the NCBI annotation. See Table 1 and Supplementary Tables S2 and S8 for Rio Pearlfish genome annotation statistics. Although our BUSCO analyses show fewer genes missed by the NCBI annotation (Supplementary Table S2), our MAKER annotation provides additional, valid gene models missed by the NCBI pipeline. For example, MAKER annotates 28 vertebrate and 27 actinopterygian BUSCOs missed by the NCBI annotation pipeline (Supplementary Table S9).

Phylogenetics and orthology

Our Orthofinder analysis illustrates the phylogenetic position of Rio Pearlfish among vertebrates (Figure 1B) and identified 31,317 orthogroups across 36 vertebrate species with 99.2% of Rio Pearlfish genes within orthogroups (Table 1, Supplementary Table S4). We identified 7,287 orthogroups present across all species from sharks to human to Rio Pearlfish, highlighting the utility of the Rio Pearlfish genome to connect species with extreme developmental phenotypes to other vertebrates, including traditional vertebrate model species such as mouse, Xenopus, and zebrafish. We confirmed the identity of our genome specimen with barcoding and a molecular phylogeny of cox1 and cytb, which placed it within the N. whitei clade of Nematolebias killifishes (Figure 1C).
We found a total of 17 Pearlfish-specific orthogroups comprising a total of 42 protein sequences. For 39 of these, we established homology to a medaka gene class by BLAST (Supplementary Table S5) and found an overrepresentation for GO terms related to glycolysis (Supplementary Table S6). This may indicate an adaptive expansion of metabolic genes in this species as annual killifishes tolerate anoxia (Podrabsky et al. 2007, 2012; Wagner et al. 2018), severely depress metabolic rate during diapause (Podrabsky and Hand 1999), and drastically increase metabolic rate during fast maturation (Vrtílek et al. 2018) necessary for an annual life cycle. In a separate study, we have also found that genes involved in cell respiration, specifically oxidative phosphorylation, show higher ratios of non-synonymous/synonymous codon changes in annual killifishes compared to their non-annual counterparts (Thompson et al. 2021). Together, these observations point to potential positive selection on genes involved in cell respiration in annual killifishes.

Synteny and genome 3-D structure

Three-dimensional chromatin structure impacts gene regulation and can manifest as topologically associated domains (TADs) that could represent higher order gene regulatory regions conserved across evolution (Krefting et al. 2018). However, 3-D genome structure has thus far remained uncharacterized in annual killifishes. To confirm the quality of the genome assembly and assess the utility of the chromatin conformation data to interrogate 3-D genome structure and gene regulation, we constructed a Hi-C contact map showing higher contact frequency within the 24 Pearlfish chromosomes (Figure 1D) than between chromosomes.
Using the genome sequence and gene annotations for Rio Pearlfish in synteny comparisons to another atherinomorph teleost, the medaka (separated by ~85 million years), and the ostariophysian teleost zebrafish (separated by ~224 million years), we reveal largely conserved synteny for these species (Figure 2E,F) across millions of years of teleost evolution (Thompson et al. 2021; Hughes et al. 2018). We examined a TAD previously shown to be conserved from zebrafish to medaka (Nakamura et al. 2021) and found a high frequency of contacts in the syntenic region between rasa1a and mctp1a in Rio Pearlfish liver tissue that strikingly resembles contact maps both in a medaka fibroblast cell line and zebrafish whole embryos (Figure 1G). Hi-C analyses thus confirm the high quality of our genome assembly as well as the strikingly conserved nature of 3-D genome interactions across teleost evolution, developmental stages, and among cell and tissue types.

Repeat content and transposable element landscape

Transposable elements (TEs) are hypothesized to generate novel genetic substrate for adaptations (Casacuberta and González 2013; Feiner 2016; Esnault et al. 2019). Some annual killifish species have expanded TE content compared to non-annual relatives (Cui et al. 2019), and the link between TEs, aging, and human diseases (Bravo et al.
2020) coupled with the rapid senescence of annual killifishes highlights the importance of examining the Pearlfish “mobilome.” We found that the Rio Pearlfish genome is highly repetitive with a repeat content of ~57% (Figure 2A, Table 1, Supplementary Table S10), which is substantially elevated compared to a non-annual member of the same South American family, Kryptolebias marmoratus, with a repeat content of ~27% (Rhee et al. 2017; Choi et al. 2020). Similarly, African annual Nothobranchius killifishes have higher TE repeat content than their non-annual relatives (Cui et al. 2019). This pattern might be the result of adaptation to extreme environments, as animals, fungi, and plants have co-opted TEs for environmental adaptations in harsh conditions (Casacuberta and González 2013; Esnault et al. 2019) and TEs may play roles in vertebrate adaptive radiations (Feiner 2016). Our findings further highlight the expanded repeat content in annual killifish genomes, and the Pearlfish genome provides novel resources to study the role of mobile DNA in extremophiles.

Hatching enzyme gene expression and hatching gland location

While hatching from the egg is a critical time point during animal development, little is known about its genetic regulation and the integration of environmental cues. Additionally, development of hatching gland cells (HGCs) is dynamic among fishes (Inohaya et al. 1995; Inohaya et al. 1997) as they migrate and localize in different anatomical locations in different species (Korwin-Kossakowski 2012; Shimizu et al. 2014; Nagasawa et al. 2016). Pinpointing HGC location in seasonal killifishes is necessary for understanding the regulation of hatching in extreme environments.

Rio Pearlfish is a tractable model for studying hatching regulation since hatching is easily induced in this species by exposing DIII embryos to water (Thompson 2016).
Thus, we examined the hatching enzyme gene repertoire and HGC locations in Pearlfish. We identified five expressed hatching enzyme genes (three hce and two lce genes; Figure 2B) upon mapping DIII mRNA reads from Thompson & Ortí (2016) to our reference genome assembly. We annotated hce1 and hce2 on chromosome 2 (corresponding NCBI genes LOC119423801, LOC119423789), hce3 on chromosome 20 (LOC119426643), and the adjacent lce.1 and lce.2 genes (LOC119418488, LOC119418489) on chromosome 12; the latter are species-specific tandem duplicates (Figure 2B) supported by transcript evidence (Thompson and Ortí 2016). Using whole-mount RNA in situ hybridization for lce.2 in DIII embryos, we identified HGC locations in the buccal and pharyngeal cavity in Rio Pearlfish (Figure 2C,D), similar to HGC localization in the related mummichog or Atlantic killifish (Fundulus heteroclitus) (Kawaguchi et al. 2005) and in medaka (Inohaya et al. 1995).

A pattern of expanded hce genes is also found in other atherinomorph fishes like medaka. High choriolytic enzyme genes in this clade of teleosts have lost introns (Kawaguchi et al. 2010) and subfunctionalized after duplication, with some hce genes performing better at higher or lower salinities in the euryhaline medaka Oryzias javanicus (Takehana et al. 2020) and the Atlantic killifish Fundulus heteroclitus (Kawaguchi et al. 2013b). Furthermore, the duplication of the lce gene in Rio Pearlfish is an example of convergent evolution within teleosts with another lce duplication in stickleback fishes (Kawaguchi et al. 2013a).
These findings underscore the commonality of hatching enzyme gene duplications in teleost fishes, which provide a model system for studying convergent gene duplications and functional divergence by sub- and neofunctionalization.

Conclusions

Our chromosome-level, dually annotated genome assembly of the Rio Pearlfish provides a valuable comparative genomics resource strengthening the utility of killifishes for studying aging, suspended animation, and response to environmental stress. The Rio Pearlfish is an emerging “Extremo” Eco-Evo-Devo research organism, and this reference genome will be a substrate for future functional genetic and multi-omic approaches exploring how organisms integrate developmental and environmental cues to adapt to extreme environmental conditions in a changing world.

Data Availability Statement:
The genome sequence, annotation, and sequence read data are available on NCBI under accession GCA_014905685.2 and Bioproject PRJNA560526. The genome assembly and annotation have also been integrated into the University of California Santa Cruz Genome Browser (https://hgdownload.soe.ucsc.edu/hubs/fish/index.html). The MAKER genome annotation is available on GitHub (https://github.com/AndrewWT/NematolebiasGenomics). Supplemental material is available at G3 online.

Acknowledgements:
We thank Camilla Peabody for guidance with RNA in situ hybridization, Kevin Childs for computational resources, and Françoise Thibaud-Nissen for help integrating the genome into NCBI’s Eukaryotic Annotation Pipeline.
Author Contributions:
AWT and IB conceived the project, wrote the manuscript, and acquired funding; genome sequencing and assembly was performed with Dovetail Genomics; MD, HWP, AWT, and IB analyzed hatching enzyme genes; HWP and AWT performed RNA in situ hybridization; AWT performed comparative genomic analyses and genome structure analyses; AWT and IB analyzed the repeat content.

Conflict of Interest:
The authors declare that there is no conflict of interest.

Funder Information:
This work was supported by the NSF BEACON Center for the Study of Evolution in Action (Cooperative Agreement No. DBI-0939454), project #1233 (to AWT and IB) and NIH ORIP grant R01OD011116 (to IB).

Literature Cited:

Aluru, N., S. I. Karchner, D. G. Franks, D. Nacci, D. Champlin et al., 2015 Targeted mutagenesis of aryl hydrocarbon receptor 2a and 2b genes in Atlantic killifish (Fundulus heteroclitus). Aquat. Toxicol. 158: 192–201.
Amemiya, C., J. Alföldi, A. P. Lee, S. Fan, H. Philippe et al., 2013 The African coelacanth genome provides insights into tetrapod evolution. Nature 496: 311–316.
Bowman, M. J., J. A. Pulman, T. L. Liu, and K. L. Childs, 2017 A modified GC-specific MAKER gene annotation method reveals improved and novel gene predictions of high and low GC content in Oryza sativa. BMC Bioinformatics 18: 1–15.
Braasch, I., A. R. Gehrke, J. J. Smith, K. Kawasaki, T. Manousaki et al., 2016 The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nat. Genet. 48: 427–437.
Bravo, J. I., S. Nozownik, P. S. Danthi, and B. A.
Benayoun, 2020 Transposable elements, circular RNAs and mitochondrial transcription in age-related genomic regulation. Development 147: 1–18.
Campbell, M. S., C. Holt, B. Moore, and M. Yandell, 2014 Genome Annotation and Curation Using MAKER and MAKER-P. John Wiley and Sons Inc.
Cantarel, B. L., I. Korf, S. M. C. Robb, G. Parra, E. Ross et al., 2008 MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 18: 188–196.
Carter, C. A., and J. P. Wourms, 1993 Naturally occurring diblastodermic eggs in the annual fish Cynolebias: Implications for developmental regulation and determination. J. Morphol. 215: 301–312.
Casacuberta, E., and J. González, 2013 The impact of transposable elements in environmental adaptation. Mol. Ecol. 22: 1503–1517.
Choi, B. S., J. C. Park, M. S. Kim, J. Han, D. H. Kim et al., 2020 The reference genome of the selfing fish Kryptolebias hermaphroditus: Identification of phases I and II detoxification genes. Comp. Biochem. Physiol. - Part D Genomics Proteomics 35.
Costa, W. J. E. M., 2002 The neotropical seasonal fish genus Nematolebias (Cyprinodontiformes Rivulidae Cynolebiatinae) taxonomic revision with description of a new species. Ichthyol. Explor. Freshw. 1: 41–52.
Costa, W. J. E. M., P. F. Amorim, and G. N. Aranha, 2014 Species Limits and DNA Barcodes in Nematolebias, a Genus of seasonal Killifishes Threatened with Extinction from the Atlantic Forest of South-Eastern Brazil, with Description of a New Species (Teleostei Rivulidae). Ichthyol. Explor. Freshwaters 24: 225–236.
Cui, R., T. Medeiros, D. Willemsen, L. N. M. Iasi, G. E. Collier et al., 2019 Relaxed Selection Limits Lifespan by Increasing Mutation Load. Cell 178: 385–399.
Deyts, C., E. Candal, J. S. Joly, and F. Bourrat, 2005 An automated in situ hybridization screen in the medaka to identify unknown neural genes. Dev. Dyn. 234: 698–708.
438 439 440 Drost, H. G., A. Gabel, I. Grosse, and M. Quint, 2015 Evidence for active maintenance of phylotranscriptomic hourglass patterns in animal and plant embryogenesis. Mol. Biol. Evol. 32: 1221–1231. 441 Emms, D. M., and S. Kelly, 2015 OrthoFinder: solving fundamental biases in whole genome 17 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 399 comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16: 1– 14. 444 445 Esnault, C., M. Lee, C. Ham, and H. L. Levin, 2019 Transposable element insertions in fission yeast drive adaptation to environmental stress. Genome Res. 29: 85–95. 446 447 Feiner, N., 2016 Accumulation of transposable elements in hox gene clusters during adaptive radiation of anolis lizards. Proc. R. Soc. B Biol. Sci. 283. 448 449 450 Furness, A. I., D. N. Reznick, M. S. Springer, and R. W. Meredith, 2015 Convergent evolution of alternative developmental trajectories associated with diapause in African and South American killifish. Proc. R. Soc. B Biol. Sci. 282: 20142189. 451 452 453 Hara, Y., K. Yamaguchi, K. Onimaru, M. Kadota, M. Koyanagi et al., 2018 Shark genomes provide insights into elasmobranch evolution and the origin of vertebrates. Nat. Ecol. Evol. 2: 1761–1771. 454 455 456 Harel, I., B. A. Benayoun, B. E. Machado, P. P. Singh, C. K. Hu et al., 2015 A platform for rapid exploration of aging and diseases in a naturally short-lived vertebrate. Cell 160: 1013– 1026. 457 458 Holt, C., and M. Yandell, 2011 MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12: 491. 459 460 Hong, C. S., and J. P. Saint-Jeannet, 2014 Xhe2 is a member of the astacin family of metalloproteases that promotes Xenopus hatching. Genesis 52: 946–951. 461 462 463 Hughes, L. C., G. Ortí, Y. Huang, Y. Sun, C. C. 
Baldwin et al., 2018 Comprehensive phylogeny of ray-finned fishes (Actinopterygii) based on transcriptomic and genomic data. Procedings Natl. Acad. Sci. 115: 6249-6252. 464 465 466 Inohaya, K., S. Yasumasu, K. Araki, K. Naruse, K. Yamazaki et al., 1997 Species-dependent migration of fish hatcing gland cells that commonly express astacin-like proteases in common. Dev. Growth, Differ. 39: 191–197. 467 468 469 Inohaya, K., S. Yasumasu, M. Ishimaru, A. Ohyama, I. Iuchi et al., 1995 Temproral and spatial patterns of gene expression for the hatching enzyme in the teleost embryo, Oryzias latipes Dev. Biol. 171: 374–385. 470 471 Jurka, J., 2000 Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 16: 418–420. 472 473 Kawaguchi, M., J. Hiroi, M. Miya, M. Nishida, I. Iuchi et al., 2010 Intron-loss evolution of hatching enzyme genes in Teleostei. BMC Evol. Biol. 10: 1–10. 474 475 476 Kawaguchi, M., H. Takahashi, Y. Takehana, K. Naruse, M. Nishida et al., 2013a SubFunctionalization of Duplicated Genes in the Evolution of Nine-Spined Stickleback Hatching Enzyme. J. Exp. Zool. Part B Mol. Dev. Evol. 320: 140–150. 477 478 Kawaguchi, M., S. Yasumasu, and J. Hiroi, 2006 Evolution of teleostean hatching enzyme genes and their paralogous genes. Dev. Genes Evol. 216: 769–784. 479 480 481 482 483 Kawaguchi, M., S. Yasumasu, A. Shimizu, J. Hiroi, N. Yoshizaki et al., 2005 Purification and gene cloning of Fundulus heteroclitus hatching enzyme: A hatching enzyme system composed of high choriolytic enzyme and low choriolytic enzyme is conserved between two different teleosts, Fundulus heteroclitus and medaka Oryzias latipes. FEBS J. 272: 4315–4326. 484 Kawaguchi, M., S. Yasumasu, A. Shimizu, N. Kudo, K. 
Sano et al., 2013b Adaptive evolution of 18 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 442 443 fish hatching enzyme : one amino acid substitution results in differential salt dependency of the enzyme. J. Exp. Biol. 216: 1609–1615. 487 488 489 490 Kelley, J. L., M. C. Yee, A. P. Brown, R. R. Richardson, A. Tatarenkov et al., 2016 The genome of the self-fertilizing mangrove rivulus fish, Kryptolebias marmoratus : a model for studying phenotypic plasticity and adaptations to extreme environments. Genome Biol. Evol. evw145. 491 492 Korwin-Kossakowski, M., 2012 Fish hatching strategies: A review. Rev. Fish Biol. Fish. 22: 225– 240. 493 494 495 Krefting, J., M. A. Andrade-Navarro, and J. Ibn-Salem, 2018 Evolutionary stability of topologically associating domains is associated with conserved gene regulation. BMC Biol. 16: 1–12. 496 497 Li, H., and R. Durbin, 2009 Fast and Accurate Short Read Alignment with Burrows-Wheeler Transform. Bioinformatics 25: 1754–1760. 498 499 500 Lieberman-Aiden, E., N. L. van Berkum, L. Williams, M. Imakaev, T. Ragoczy et al., 2009 Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science 326: 285–289. 501 502 Lyons, E., and M. Freeling, 2008 How to usefully compare homologous plant genes and chromosomes as DNA sequences. Plant J. 53: 661–673. 503 504 505 Mi, H., D. Ebert, A. Muruganujan, C. Mills, L. P. Albou et al., 2021 PANTHER version 16: A revised family classification, tree-based classification tool, enhancer regions and extensive API. Nucleic Acids Res. 49: D394–D403. 506 Myers, G. S., 1952 Annual Fishes. Aquarium J. 23: 125–141. 507 508 Myers, G. S., 1942 Studies on South American freshwater fishes I. Stanford Ichthyol. Bull. 2: 89–114. 509 510 511 Nagasawa, T., M. Kawaguchi, T. Yano, K. Sano, M. 
Okabe et al., 2016 Evolutionary Changes in the Developmental Origin of Hatching Gland Cells in Basal Ray-Finned Fishes. Zoolog. Sci. 33: 272–281. 512 513 Nakamura, R., Y. Motai, M. Kumagai, C. L. Wike, H. Nishiyama et al., 2021 CTCF looping is established during gastrulation in medaka embryos. Genome Res. 31: 968–980. 514 515 Nishimura, O., Y. Hara, and S. Kuraku, 2017 GVolante for standardizing completeness assessment of genome and transcriptome assemblies. Bioinformatics 33: 3635–3637. 516 517 Parra, G., K. Bradnam, and I. Korf, 2007 CEGMA: A pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23: 1061–1067. 518 519 Podrabsky, J. E., and S. C. Hand, 1999 The Bioenergetics of Embryonic Diapause in an Annual Killifish, Austrofundulus limneaus. J. Exp. Biol. 202: 2567–2580. 520 521 522 Podrabsky, J. E., J. P. Lopez, T. W. M. M. Fan, R. Higashi, and G. N. Somero, 2007 Extreme anoxia tolerance in embryos of the annual killifish Austrofundulus limnaeus: insights from a metabolomics analysis. J Exp Biol 210: 2253–2266. 523 524 525 Podrabsky, J. E., M. A. Menze, and S. C. Hand, 2012 Long-Term survival of anoxia despite rapid ATP decline in embryos of the annual killifish Austrofundulus limnaeus. J Exp Zool A Ecol Genet Physiol 317: 524–532. 526 527 Von Post, A., 1965 Vergleichende Untersuchungen der Chromosomenzahlen bei Süßwasser Teleosteern. Zeitschrift für Zool. Syst. und Evol. 47–93. 19 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 485 486 Putnam, N. H., B. O. Connell, J. C. Stites, B. J. Rice, P. D. Hartley et al., 2016 Chromosomescale shotgun assembly using an in vitro method for long-range linkage. Genome Res. 26: 342–350. 531 532 533 Reichwald, K., A. Petzold, P. Koch, B. R. Downie, N. Hartmann et al., 2015 Insights into Sex Chromosome Evolution and Aging from the Genome of a Short-Lived Fish. Cell 163: 1527– 1538. 534 535 536 Rhee, J. S., B. S. 
Choi, J. Kim, B. M. Kim, Y. M. Lee et al., 2017 Diversity, distribution, and significance of transposable elements in the genome of the only selfing hermaphroditic vertebrate Kryptolebias marmoratus. Sci. Rep. 7:. 537 538 539 Ruijter, J. M., 1987 Development and aging of the teleost pituitary: qualitative and quantitative observations in the annual cyprinodont Cynolebias whitei. Anat. Embryol. (Berl). 175: 379– 386. 540 541 Ruijter, J. M., and L. A. J. M. Creuwels, 1988 The ultrastructure of prolactin cells in the annual cyprinodont Cynolebias whitei during its life cycle. Cell Tissue Res. 253: 477–483. 542 543 544 Ruijter, J. M., J. A. M. Van Kemenade, and S. E. W. Bonga, 1984 Environmental influences on prolactin cell development in the cyprinodont fish, Cynolebias whitei. Cell Tissue Res. 238: 595–600. 545 546 547 Sano, K., M. Kawaguchi, S. Watanabe, and S. Yasumasu, 2014 Neofunctionalization of a duplicate hatching enzyme gene during the evolution of teleost fishes. BMC Evol. Biol. 14: 221. 548 549 550 Schartl, M., R. B. Walter, Y. Shen, T. Garcia, J. Catchen et al., 2013 The genome of the platyfish, Xiphophorus maculatus, provides insights into evolutionary adaptation and several complex traits. Nat Genet 45: 567–572. 551 552 553 Schoots, A. F. M., J. M. Ruijter, J. A. M. van Kemenade, and J. M. Denucé, 1983 Immunoreactive prolactin in the pituitary gland of cyprinodont fish at the time of hatching. Cell Tissue Res. 233: 611–618. 554 555 556 Shimizu, D., M. Kawaguchi, S. Yasumasu, T. Noda, Y. Fujinami et al., 2014 Comparison of Hatching Mode in Pelagic and Demersal Eggs of Two Closely Related Species in the Order Pleuronectiformes. Zoolog. Sci. 31: 709-715. 557 558 559 Simão, F. A., R. M. Waterhouse, P. Ioannidis, E. V. Kriventseva, and E. M. Zdobnov, 2015 BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31: 3210–3212. 560 Simpson, B. R. C., 1979 The phenology of annual killifish. 
Symp Zool Soc Lon 44: 243–261. 561 Smit, A. F. A., and R. Hubley, 2008 RepeatModeler Open-1.0. http://www.repeatmasker.org. 562 563 Smit, A. F. A., R. Hubley, and P. Green, 2013 RepeatMasker Open-4.0. <http://www.repeatmasker.org>. 564 565 Stamatakis, A., 2006 RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690. 566 567 Stamatakis, A., 2014 RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30: 1312–1313. 568 569 570 Takehana, Y., M. Zahm, C. Cabau, C. Klopp, C. Roques et al., 2020 Genome sequence of the euryhaline Javafish Medaka, Oryzias javanicus: A small aquarium fish model for studies on adaptation to salinity. G3 Genes, Genomes, Genet. 10: 907–915. 20 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 528 529 530 Thibaud-Nissen, F., M. DiCuccio, W. Hlavina, A. Kimchi, P. A. Kitts et al., 2016 P8008 The NCBI Eukaryotic Genome Annotation Pipeline. J. Anim. Sci. 94: 184. 573 574 Thompson, A. W., A. C. Black, Y. Huang, Q. Shi, A. I. Furness et al. Deterministic shifts in molecular evolution correlate with convergence to annualism in killifishes. BioRxiv. 575 576 Thompson, A. W., A. I. Furness, C. Stone, C. Rade, and G. Ortí, 2017a Microanatomical diversification of the zona pellucida in aplochelioid killifishes. J. Fish Biol. 91: 126–143. 577 578 Thompson, A. W., M. B. Hawkins, E. Parey, D. J. Wcisel, T. Ota et al., 2021 The bowfin genome illuminates the developmental evolution of ray-finned fishes. Nat. Genet. 53: 1373-1384. 579 580 Thompson, A. W., A. Hayes, J. E. Podrabsky, and G. Ortí, 2017b Gene expression during delayed hatching in fish-out-of-water. Ecol. Genet. Genomics 3–5: 52–59. 581 582 Thompson, A. W., and G. Ortí, 2016 Annual killifish transcriptomics and candidate genes for metazoan diapause. Mol. Biol. Evol. 33: 2391–2395. 
583 584 585 Valenzano, D. R., B. A. Benayoun, P. P. Singh, E. Zhang, P. D. Etter et al., 2015 The African Turquoise Killifish Genome Provides Insights into Evolution and Genetic Architecture of Lifespan. Cell 163: 1539–1554. 586 587 588 Valenzano, D. R., S. C. Sharp, and A. Brunet, 2011 Transposon-Mediated Transgenesis in the Short-Lived African Killifish Nothobranchius furzeri, a Vertebrate Model for Aging. G3 (Bethesda). 1: 531–8. 589 590 Vrtílek, M., J. Žák, M. Pšenička, and M. Reichard, 2018 Extremely rapid maturation of a wild African annual fish. Curr. Biol. 28: R822–R824. 591 592 593 Wagner, J. T., P. P. Singh, A. L. Romney, C. L. Riggs, P. Minx et al., 2018 The genome of Austrofundulus limnaeus offers insights into extreme vertebrate stress tolerance and embryonic development. BMC Genomics 19: 1–21. 594 595 Weisenfeld, N. I., V. Kumar, P. Shah, D. M. Church, and D. B. Jaffe, 2018 Corrigendum: Direct determination of diploid genome sequences. Genome Res. 28: 757–767. 596 597 Wourms, J. P., 1967 Annual Fishes, pp. 123–137 in Methods in Developmental Biology, Thomas and Crowell Company, New York. 598 599 Wourms, J. P., 1972a The Developmental Biology of Annual Fishes I. Stages in the Normal Development of Austrofundulus myersi Dahl. J. Exp. Zool 182: 143–168. 600 601 602 Wourms, J. P., 1972b The Developmental Biology of Annual Fishes II. Naturally Occurring Dispersion and Reaggregation of Blastomeres During the Development of Annual Fish Eggs. J. Exp. Zool 182: 169–200. 603 604 605 Wourms, J. P., 1972c The Developmental Biology of Annual Fishes III. Pre-embryonic and Embryonic Diapause of Variable Duration in the Eggs of Annual Fishes. J. Exp. Zool 182: 389–414. 606 607 Yamagami, K., 1988 Mechanisms of Hatching in Fish, pp. 447–499 in Fish Physiology, Academic Press. 608 609 610 Yasumasu, S., K. Yamada, K. Akasaka, K. Mitsunaga, I. 
Iuchi et al., 1992 Isolation of cDNAs for LCE and HCE, two constituent proteases of the hatching enzyme of Oryzias latipes, and concurrent expression of their mRNAs during development. Dev. Biol. 153: 250–258. 611 612 21 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 571 572 613 614 Table 1. Rio Pearlfish genome assembly (NemWhi1) and annotation statistics Gene Annotation Statistics MAKER Annotation # protein coding genes BUSCO scores3: Vertebrata, Actinopterygii NCBI RefSeq Annotation4 # protein coding genes BUSCO scores3: Vertebrata, Actinopterygii # genes in orthogroups5 # species-specific orthogroups (genes) 18,999 1,218,332,341 49,984,095 11 32,525,398 22 24 24 41.8% 57.3% 96.9%, 95.5% 99.19%, 99.57% 26,016 91.4%, 86.2% 21,341 97.4%, 96.5% 21,176 (99.2%) 17 (42) 1 Von Post (1965) see Supplementary Table S10 for more info 3 see Supplementary Table S2 for more info 4 see Supplementary Table S8 for more info 5 see Supplementary Table S4 for more info 615 2 22 Downloaded from https://academic.oup.com/g3journal/advance-article/doi/10.1093/g3journal/jkac045/6533448 by guest on 22 February 2022 Genome Assembly Statistics # scaffolds # base pairs N50 L50 N90 L90 # superscaffolds # chromosomes (n)1 GC content repeat content2 BUSCO scores3: Vertebrata, Actinopterygii CEGMA scores3: CEG, CVG Figure 1. Rio Pearlfish evolution, ecology, development, and 3-D genome structure A.) Biannual life cycle of the Rio Pearlfish with three developmental diapause stages following burying of eggs in soil. B.) Relative position of Rio Pearlfish in the vertebrate tree of life inferred by Orthofinder based on annotated proteins. C.) DNA barcode (cox1 and cytb) phylogeny inferred with RAxML of the genus Nematolebias confirming the identity of the genome specimen as N. whitei. Sequences from Costa et al. (2014) were used for comparison to the genome sequence. 
Green nodes show 100% bootstrap support for the reciprocal monophyly of N. whitei with other genera and confirm the identity of the genome specimen with high confidence. D.) Hi-C contact map of the Rio Pearlfish genome showing linkage of the 24 chromosomes into chromosomal pseudomolecules. E-F.) SynMap genome-wide synteny plots of Rio Pearlfish vs. medaka (E) and vs. zebrafish (F) showing genome-structure conservation across over 250 million years of teleost evolution. G.) Hi-C contact maps of the syntenic region between rasa1a and mctp1a in Pearlfish liver tissue. These contact maps highlight the conserved 3-D structure, which includes topologically associating domains (TADs) conserved across teleost evolution as well as across cell types and developmental stages (Nakamura et al. 2021). Species graphics generated with BioRender.

Figure 2. Rio Pearlfish repeat landscape, hatching enzyme genes, and hatching gland location. A.) Repeat landscape of mobile genetic elements in Rio Pearlfish showing a high repeat content with two peaks at Kimura distances 4 and 21. Inset: Total transposable element landscape among killifishes with independent, recent expansions in the convergent annuals Nothobranchius (Cui et al. 2019) and Nematolebias (this study) compared to the non-annual Kryptolebias (Choi et al. 2020). B.) Locations of five hatching enzyme genes in the Rio Pearlfish genome expressed during DIII. C-D.) Whole-mount RNA in situ hybridization for lce.2 in DIII Rio Pearlfish embryos marking hatching gland cells (HGCs) identified in the buccal (BHGCs, red arrows) and pharyngeal (PHGCs, white arrow) cavities.
[Figure 1 image. Panel labels recovered from the figure: Rio Pearlfish, Whole Genome (contact map, 400 Kb res.); Rio Pearlfish vs. Medaka, Whole Genome Synteny; Rio Pearlfish vs. Zebrafish, Whole Genome Synteny; Rio Pearlfish, Chr 8, Liver (1.6 Mb, 20 Kb res., rasa1a to mctp1a). Genome-wide axes are labeled Chr 1 to Chr 24.]

[Figure 2 image. Panel labels recovered from the figure: Rio Pearlfish (Nematolebias) Repeat Landscape, with x-axis "Kimura substitution level" (0 to 50) and y-axis "% genome"; inset, Killifish Repeat Landscapes (Nothobranchius, Nematolebias, Kryptolebias); Rio Pearlfish hatching enzyme genes; embryo images labeled Eye, BHGCs, and PHGCs.]
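The contiguity statistics reported in Table 1 (N50/L50 and N90/L90) follow a simple cumulative-length definition. The sketch below assumes the common convention that Nx is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least x% of the assembly, and that Lx is the number of scaffolds in that set. The function name nx_lx and the toy scaffold lengths are hypothetical illustrations; they are not taken from the NemWhi1 assembly or from the tools used in this study.

```python
def nx_lx(lengths, fraction):
    """Return (Nx, Lx) for a list of scaffold lengths.

    fraction is the share of the total assembly length to cover,
    e.g. 0.5 for N50/L50 or 0.9 for N90/L90.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Walk the scaffolds from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # 'length' is the shortest scaffold needed to reach the threshold,
            # 'count' is how many scaffolds were needed.
            return length, count
    # Fallback for floating-point edge cases: every scaffold was needed.
    return ordered[-1], len(ordered)


# Toy example with hypothetical scaffold lengths (total = 150).
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(n50, l50, n90, l90)
```

Read this way, an L50 of 11 and an L90 of 22 in Table 1 mean that 11 scaffolds hold at least half of the assembly and 22 scaffolds hold at least 90% of it. With a haploid chromosome number of 24, that is what a near chromosome-level assembly would be expected to show.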