Genome Assembly

Downloadable File
Organism: 
Haliotis rufescens (Red abalone)
File: 
Program, Pipeline Name or Method Name: 
MaSuRCA
Program, Pipeline or Method version: 
3.2.2
Source Name: 
Halruf.fasta
Time Executed: 
08/28/2018
Materials & Methods (Description and/or Program Settings): 
Raw data were quality checked with FastQC (Andrews 2010) prior to assembly. Quality and adapter trimming was performed with Quorum (Marçais et al. 2015), which is built into the MaSuRCA assembly pipeline. An initial genome assembly was generated with MaSuRCA version 3.2.2 (Zimin et al. 2013), using paired-end (75x coverage) and mate-pair (148x coverage) reads from both male and female samples, and PacBio (29x coverage) reads generated for the female sample. The following parameters were changed from their defaults: jellyfish hash size (JF_SIZE=20000000000), paired-end insert size and standard deviation (250, 50), and mate-pair insert size and standard deviation (15000, 1000).

This initial assembly was scaffolded using long-range distance information obtained from Chicago in vitro proximity ligation libraries (7x expected coverage) with the proprietary HiRise2 program, version v2.1.2-ad17ecf8bf57 (Dovetail, Santa Cruz, CA; Putnam et al. 2016). Scaffolds of 150 bp or shorter were removed.

Contamination was assessed using Blobtools (v0.9.19, Laetsch and Blaxter 2017) with default parameters, and MegaBLAST version 2.6.0 (Zhang et al. 2000) with an upper e-value threshold of 1e-5 against the NCBI nr/nt database downloaded on Sep 17, 2016 (Supplemental Note 3).

Synteny between Haliotis rufescens (green) and Haliotis discus hannai (blue) for Figure 2 was visualized using Circos (Krzywinski et al. 2009). To obtain syntenic relationships, the following steps were performed: 1) GMAP (Wu et al. 2010) was used to map H. rufescens genes to the H. discus hannai genome downloaded from http://gigadb.org/dataset/100281; 2) Opscan (Drillian et al. 2014), with fastp (Chen et al. 2018) and global alignments as inputs, was used to generate ortholog families between the two gene sets, with only primary alignments considered in H. discus hannai; and 3) i-ADHoRe 3.0.01 (Proost et al. 2011) was used to generate the multiplicons with the following parameters: prob_cutoff=0.001, level 2 multiplicons only, cluster_gap=20, gap_size=15, q_value=0.05, and a minimum of 3 anchor points.
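
The scaffold length filter described above (removal of scaffolds of 150 bp or shorter) can be illustrated with a minimal Python sketch. This is not the script used for the assembly, which the record does not describe; the function names read_fasta, filter_scaffolds and format_fasta are hypothetical, and the only value taken from the methods is the 150 bp cutoff.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_scaffolds(records: Iterable[Record], cutoff: int = 150) -> Iterator[Record]:
    """Keep only scaffolds strictly longer than `cutoff` bp.

    Scaffolds with length <= cutoff are dropped, matching the
    "150 bp or shorter were removed" rule stated in the methods.
    """
    for header, seq in records:
        if len(seq) > cutoff:
            yield header, seq


def format_fasta(records: Iterable[Record], width: int = 60) -> Iterator[str]:
    """Yield FASTA-formatted lines, wrapping sequences at `width` columns."""
    for header, seq in records:
        yield ">" + header
        for start in range(0, len(seq), width):
            yield seq[start:start + width]
```

To apply it to a file, one could open the assembly FASTA, pass the file handle to read_fasta, chain the result through filter_scaffolds, and write each line from format_fasta to an output file. All three functions stream records one at a time, so only a single scaffold is held in memory at once.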