Skip to main content

Genome sequences of four Ixodes species expands understanding of tick evolution

Abstract

Background

Ticks, hematophagous Acari, pose a significant threat by transmitting various pathogens to their vertebrate hosts during feeding. Despite advances in tick genomics, high-quality genomes were lacking until recently, particularly in the genus Ixodes, which includes the main vectors of Lyme disease.

Results

Here, we present the genome sequences of four tick species, derived from a single female individual, with a particular focus on the European species Ixodes ricinus, achieving a chromosome-level assembly. Additionally, draft assemblies were generated for the three other Ixodes species, I. persulcatus, I. pacificus, and I. hexagonus. The quality of the four genomes and extensive annotation of several important gene families have allowed us to study the evolution of gene repertoires at the level of the genus Ixodes and of the tick group. We have determined gene families that have undergone major amplifications during the evolution of ticks, while an expression atlas obtained for I. ricinus reveals striking patterns of specialization both between and within gene families. Notably, several gene family amplifications are associated with a proliferation of single-exon genes—most strikingly for fatty acid elongases and sulfotransferases.

Conclusions

The integration of our data with existing genomes establishes a solid framework for the study of gene evolution, improving our understanding of tick biology. In addition, our work lays the foundations for applied research and innovative control targeting these organisms.

Background

Ticks are one of a few groups of arthropods that have independently evolved a unique lifestyle of blood-feeding on vertebrates. Present in most terrestrial ecosystems, they represent a threat to companion and farm animals and to humans by transmitting diverse pathogens and parasites [1]. For example, ticks in the genus Ixodes transmit the spirochetes that cause Lyme borreliosis, which is the most common tick-borne disease in the Northern Hemisphere.

Ticks evolved more than 250 million years ago [2] and belong to the Parasitiformes, which together with the Acariformes form the Acari (ticks and mites), in the subphylum Chelicerata. Although the Parasitiformes and Acariformes are both monophyletic, the monophyletic status of the Acari has been debated and remains difficult to resolve [3,4,5,6,7,8]. Ticks themselves clearly form a monophyletic order (Ixodida), which comprises three families, the Ixodidae (hard ticks), Argasidae (soft ticks), and Nuttalliellidae [2, 9,10,11]. The Ixodidae are further subdivided into the Prostriata, which contains the genus Ixodes, and the Metastriata, which includes several tick genera.

Ticks have evolved specialized traits to thrive as blood-feeders, relying on molecular interactions with their hosts. These interactions occur primarily at two points: the feeding site on the host’s skin [12] and the tick’s midgut, where blood is digested. Tick saliva compounds aid attachment by neutralizing the host’s immune and clotting responses [13], while the midgut digests the host blood components. These interfaces drive selection on tick genes. Additionally, long periods between blood meals expose ticks to various environmental pressures, both biotic and abiotic, further shaping their genetic repertoires.

Gene duplication is believed to shape the major innovations in tick biology, and the duplicate genes would facilitate the evolution of the metabolic potential of these organisms [14]. To understand the evolution of tick gene repertoires and the importance of tick-specific duplications in particular, a comprehensive comparative study of tick genomes is necessary, which is now possible thanks to the growing number of available genome sequences both in ticks and in other Chelicerata. In comparison with insects, tick genomics has developed late, due to the relatively large genome sizes in this group (several times larger than most insects) [15]. The first tick genome sequence was published for Ixodes scapularis in 2016 [16], followed by the genomes of six other tick species, including Ixodes persulcatus and five species belonging to a monophyletic group of non-Ixodes hard tick species, known as the Metastriata [17]. Two high quality genome sequences of I. scapularis have been published recently [18, 19] that used the newer generation of long-read high-throughput sequencing.

The purpose of our study was to improve our knowledge of tick genomics, especially in the genus Ixodes which includes some of the most important vectors of tick-borne disease in Europe, North America, and Asia. We therefore generated the genome sequences and gene catalogs of four species, I. ricinus, I. pacificus, I. persulcatus, and I. hexagonus. The first three of these species, along with I. scapularis, are closely related and form the “Ixodes ricinus species complex,” while I. hexagonus serves as an outgroup for comparative purposes [20,21,22]. We chose to sequence the genomes of I. ricinus, I. pacificus, and I. persulcatus due to their role as primary vectors of the Lyme disease pathogen in their respective geographic regions. Together with I. scapularis, these genome sequences should provide a more complete view of the genetic landscape of this vector group. Beyond Lyme disease, these species also transmit various pathogens and are usually the most widespread ticks in their areas. Understanding their genomics will aid in better management of related health risks.

The following species have distinct distributions: I. scapularis is found in the eastern USA, I. pacificus along the Pacific coast of USA, I. persulcatus across Eurasia, and I. ricinus throughout Europe. This geographic pattern suggests a case of vicariant speciation, where species diverged from a common ancestor after becoming isolated in different regions, but have retained similar ecological traits. A comparable example is observed in the tick genus Hyalomma [23]. Our comparative analysis provides novel insights into the key evolutionary forces shaping the genomes of Ixodes species and ticks more broadly. We highlight gene families that have evolved or expanded uniquely within ticks, distinguishing them from other chelicerates, and identify variations in evolutionary patterns between different tick lineages.

Results

Genome assembly and annotation

To enhance our understanding of the genomic characteristics of ticks, particularly in the genus Ixodes, we sequenced and assembled the genome of four species: I. ricinus, I. pacificus, I. persulcatus, and I. hexagonus. The genome sequencing for these four species involved the use of linked reads (10X Genomics library sequence with Illumina short-reads), and for I. ricinus, we also incorporated Hi-C libraries to achieve chromosome-level assembly.

In the case of I. ricinus, we identified fourteen major scaffolds corresponding to 13 autosomes and the X sex chromosome, which totaled 93% of the total assembly (Hi-C map of interaction, Fig. 1A). This result is consistent with the established haploid chromosome count of 14 for this species, the same as in I. scapularis [24]. We therefore assume that the 14 largest scaffolds of the I. ricinus genome, accounting for 95.2% of the assembly size and 98.2% of the predicted genes, represent these 14 chromosomes. By comparison, the 14 largest scaffolds of the I. scapularis genome [18] represented 87.0% of the assembly size and 88.2% of the predicted genes, indicating a slightly more fragmented assembly. The genome assemblies of the other three species were organized by aligning them with the chromosomal structure of I. ricinus. Standard metrics of the four Ixodes genome assemblies sequenced in the present study are given in Table 1.

Fig. 1
figure 1

Continuity of the I. ricinus genome assembly and synteny with the I. scapularis genome. A Hi-C map of interactions for the I. ricinus genome assembly, showing 14 major scaffolds. The x and y axes give the mapping positions of the first and second read in the read pair respectively, grouped into bins. The color of each square gives the number of read pairs within that bin. Scaffolds less than 1 Mb are excluded. B Synteny between the genomes of I. scapularis and I. ricinus. Only the 15 largest scaffolds are represented for both species (the haploid chromosome number being 14). Horizontal bars represent the major scaffolds of each genome, while syntenies between the two species are indicated by identical colors

Table 1 Metrics of genome assemblies and gene catalogs of the four Ixodes species sequenced in this study and of the genome of Ixodes scapularis (De et al. 2023). For I. ricinus, gene metrics were obtained after manual curation (OGS1.3) of the initial annotation (OGS1.0). Assembly metrics are both after (scaffolds) and prior to (contigs) HiC scaffolding (for I. ricinus) or RagTag anchoring (for I. persulcatus, I. pacificus, I. hexagonus)

The four genome assemblies were annotated, and genes were predicted by using homologies with proteins of closely related species and RNA-Seq data. Manual curation was performed exclusively for I. ricinus, resulting in the Official Gene Get (OGS) 1.1, which resulted in the correction of 1569 genes (Additional File 1: Table S1). The most notable change was the prediction of 500 entirely new gene models (Additional File 1: Table S2). The completeness (% of complete BUSCOs) of the four new gene catalogs generated in this study fell within the range of recently sequenced tick genomes as shown in Table 2. Completeness was lowest in I. pacificus (81%), and highest in I. ricinus and I. hexagonus (about 91%), which is somewhat lower than the 98% observed for the recently improved genome of I. scapularis [18]. For I. pacificus, we also note a relatively high percentage of “duplicated” genes in the BUSCO analysis, suggesting that heterozygosity might have not been fully resolved and that our assembly still contains duplicate alleles, which is supported by the higher heterozygosity estimate for this genome (Additional File 2: Fig. S1). Finally, two tick genomes from another study [17], Hyalomma asiaticum and Haemaphysalis longicornis, showed significantly lower completeness (65% and 56% of complete BUSCOs, respectively), whereas the completeness for I. persulcatus in that study was lower compared to our study (79.6% versus 88.0%). For subsequent analyses involving I. persulcatus, we therefore utilized our assembly.

Table 2 Completeness of the species of Chelicerata included in our comparative study. The four Ixodes genomes sequenced in this study are in bold character. The search of conserved genes was made using BUSCO with the Arachnida odb_10 database (search of 2934 conserved genes). Group (ticks or other Chelicerata) and species name, numbers (five next columns), or percentages (five last columns) of different categories of BUSCO genes, as detailed in the headers

Transposable elements in ticks

In I. ricinus, I. hexagonus, I. pacificus, and I. persulcatus, repeated elements represented between 59.4% (I. ricinus) and 69.2% (I. hexagonus) of the genome, the majority being transposable elements (Table 3). Most of the transposable elements (TEs) identified are retroelements (~ 34% of the total elements, covering ~ 30% of the tick genomes). The most abundant TEs found in these tick species were long interspersed nuclear elements (LINEs), with ~ 2.1 M elements on average (~ 21% of the elements identified). Interspersed sequences represented only 9.5% of the genome of Amblyomma maculatum [25], whereas they accounted for over 50% of the genome of all Ixodes ticks. However, the relative frequencies of each TE family were similar between the genomes of A. maculatum [25] and I. scapularis [18]. For example, the Gypsy/DIRS family in the long terminal repeat (LTR) class has one of the highest coverages in both A. maculatum [25] and I. scapularis [18, 19] and represent ~ 9.5% of the genome in the four Ixodes species sequenced in the present study. Interestingly, Bov-B LINE retrotransposons were found in our tick genomes: 1921 elements in I. persulcatus, 1514 in I. ricinus, 2917 in I. pacificus, 1600 in I. hexagonus (see the “Discussion” section).

Table 3 Repeated elements in the genomes of the four Ixodes species (I. ricinus, I. hexagonus, I. pacificus and I. persulcatus)

Macro-syntenies between hard ticks

The genomes of I. ricinus and I. scapularis were found to be largely syntenic (Fig. 1B). Comparison of these two tick species suggests very little gene shuffling (homologous genes remained in the same blocks). However, the length of the largest scaffolds (chromosomes) varied between the two species and their ranking in size was slightly different. These differences may be due to different amounts of repeated elements and/or to the state of assembly of these elements. We note that the sequence NW_0240609873.1 representing the 15th largest scaffold in I. scapularis was included in scaffold 7 of I. ricinus. Conversely, scaffold 15 from I. ricinus matched with NW_0240609883.1, the 8th largest scaffold from I. scapularis). In both species, the smaller scaffolds that were not included in the 14 putative chromosomes were relatively gene poor and could correspond to regions with a high proportion of repeated elements, which are difficult to assemble. The plot comparing the two genome assemblies (Additional File 2: Fig. S2A) indicated several inversions (especially for scaffolds 1 and 6 of I. ricinus). It was not possible to determine whether these inversions are real or the result of post-assembly errors. The comparison between I. ricinus and Dermacentor silvarum also revealed the correspondence of major scaffolds between the two assemblies (Additional File 2: Fig. S2B). Dermacentor silvarum has 11 major scaffolds, which corresponds to its chromosomal number. Despite a low level of micro-synteny, there was a substantial proportion of shared content in the chromosomes, with eight exact matches between chromosomes of these two tick species. In addition, scaffold NC_051157.2 of D. silvarum had non-overlapping matches with scaffolds 3 and 14 of I. ricinus, D. silvarum scaffold NC_051154.1 matched with scaffolds 6 and 12 of I. ricinus, and D. silvarum scaffold NC_051155.1 matched with scaffolds 7 and 11 of I. ricinus. Thus, depending on the ancestral karyotype, there were only three chromosome fission or fusion events in the two branches leading from a common ancestor to the two extant species, after which macro-synteny remained remarkably stable. In the two comparisons (I. ricinus versus I. scapularis and I. ricinus versus D. silvarum), we did not observe evidence of multiple regions from one species matching two different regions from the other species. This indirectly suggests that no large-scale duplication events occurred in a common ancestor of ticks.

Gene families in ticks and the Chelicerata

An analysis of 497,214 protein sequences from 21 Chelicerata species, including 11 tick species, identified 11,331 gene families containing 332,365 protein sequences (Additional File 1: Table S3). Notably, some gene families exhibited significant differences in gene abundance even among closely related tick species. These gene families, which showed differential gene counts, were associated with gene ontology (GO) terms related to transposable elements. For instance, gene family FAM0000061 contained 6,896 genes in the spider Trichonephila clavata, yet was absent in some Acari genomes. In I. scapularis, this family had 1266 genes, while I. ricinus had only 113 (Additional File 1: Table S4). Differences in gene annotation strategies, such as masking of repetitive regions, may influence the detection of transposable elements. Therefore, for subsequent gene family evolution analyses, we excluded gene families associated with typical transposon domains (e.g., reverse transcriptase, transposase, recombination-activating gene) or those with atypical size variation (where the gene count in I. scapularis was more than five times that in I. ricinus). After filtering out these TE-related families, the remaining distribution of gene families is shown in Fig. 2A. For example, 620 families were conserved across all 21 species, while 139 families were shared among the 11 tick species but absent from other species.

Fig. 2
figure 2

Evolutionary dynamics of gene families in ticks. A Distribution of gene families in ticks and other Chelicerata. The top bar plot represents the number of families shared in a given intersection; the left bar plot gives the number of families per species. Species were ordered according to their phylogeny (right tree), and intersections with a phylogenetic relevance are indicated in orange. Tick (Ixodida) species are highlighted in green. B Phylogenetic tree of Chelicerata, based on complete genomes. The tree was built by IQ-TREE 2 using a concatenation of 107 single-copy protein sequences, shared by all represented species. Branch support is shown by bootstrap values and Shimodaira-Hasegawa approximate likelihood ratio test (SH-aLRT) values. C Gene expansion and contraction dynamics in Chelicerata, analyzed with CAFE. The expansion rate per node is given by the size and the color of the points. The number of expanding ( +) or contracting ( −) gene families for each node is in blue and above the branches. The number of new families per node is in green. The tree was built by IQ-TREE 2 using 107 protein sequences, before being transformed into an ultrametric tree using phytools and ape R packages

Phylogeny based on single copy orthologs

Our phylogenetic tree based on 107 single-copy orthologs (Fig. 2B) showed high support for the Acari (Parasitiformes and Acariformes) being a monophyletic group. Whether the Acari are a monophyletic group has been debated in the recent literature [5, 7, 26]. This question, which was not central to our study, will need more complete sequence data to be fully resolved, especially regarding taxon sampling and filtering of sites/genes. The grouping of the Mesostigmata with the Ixodida, and the monophyletic grouping of ticks were both strongly supported, confirming all previous phylogenetic analyses, and this was the main justification for our comparison of gene family dynamics in ticks. Within the Ixodidae (hard ticks), the phylogenetic relationships in our study are consistent with recent studies based on transcriptomes [20] or whole genomes [17]. Our study confirms that the group of four species belonging to the Ixodes ricinus species complex are very close genetically. Within the genus Ixodes, an analysis based on a higher number of shared sequences (Additional File 2: Fig. S3) showed that I. ricinus and I. persulcatus are sister clades, as are I. scapularis and I. pacificus. The first and second species pairs are found in Eurasia and North America, respectively, which suggests a pattern of phylogeographic divergence. We note however that these four species diverged at nearly the same time.

Structure of genes, importance of mono-exonic genes

As many as 24.1% of genes in I. ricinus were predicted to be mono-exonic (Additional File 1: Table S5). In Ixodes tick species, the percentage of mono-exonic genes varied between 20.8 and 30.5%—the lowest proportion was found in I. scapularis, suggesting an effect of annotation strategies, intronless genes being often less well predicted by annotation pipelines—while lower proportions of mono-exonic genes were found both in the Dermacentor and Rhipicephalus genera (~ 15%) and in Mesostigmata (Acari: Parasitiformes) − 10 and 15%. This result is interesting because the frequency of mono-exonic genes is usually low in eukaryotic genomes. Ixodes ticks thus appear to have specifically undergone duplications processes generating mono-exonic copies, either as the result of new retroposition events or of tandem duplications of monoexonic genes (as in the example of cytosolic sulfotransferases, see below). The percentage of mono-exonic genes was high in gene families tagged as putative TEs (35% on average in TE-like families containing > 10 genes in I. ricinus—ranging between 0 and 86%), but also in other gene families (18% for non TE-like families containing > 10 genes in I. ricinus), again with a large range (0% to 100%). Some of the gene families with a high percentage of mono-exonic genes corresponded to the most expanded gene families in ticks, such as the fatty acid elongases (FAM002111, 82%), cytosolic sulfotransferases (FAM000226, 52%), serpins (FAM001806, 45%), and M13 metallopeptidases (FAM000666, 34%), suggesting that a proliferation of mono-exonic genes contributed to tick-specific gene family expansions. However, other large gene families had few or no mono-exonic copies, such as the major facilitator superfamily (MFS) transporters (FAM000149).

Dynamics of gene family expansions in ticks

To analyze gene gain/loss dynamics across Chelicerata species, we ran CAFE5 (v5.0), using a lambda of 0.451 predicted from a first round of base model. This program filtered gene families not present at the root of the tree and retained 4525 gene families, of which 497 gene families were found to be either significantly expanded or contracted during the evolution of the Chelicerata. Gene family expansions and contractions were quantified on each branch of the phylogenetic tree of Chelicerata (Fig. 2C and Fig. 3).

Fig. 3
figure 3

Enriched gene ontology terms (GOs) in gained and expanded families during the evolution of ticks. A phylogenetic tree of the tick species (extracted from the complete phylogenetic tree of Chelicerata). The “non Ixodidae” clade refers to the Metastriata species. The “ricinus group” is a group of closely related Ixodes species. B and C show the most represented Gene Ontology terms associated with biological processes in gained and expanded families, respectively

Expanded gene families in the Ixodidae included were involved in lipid metabolism and xenobiotic detoxification (Fig. 3C). The principal GO terms for molecular functions associated with tick expansions were heme binding, transferase, hydrolase, oxidoreductase, metalloendopeptidase/protease, and transmembrane transporter activities. The list of the significantly expanded gene families in ticks (Additional File 1: Table S6) shows that some of the most important expansions during the evolution of ticks concern genes associated with detoxification processes, for example cytosolic sulfotransferases (SULTs), carboxylesterases, and glutathione S-transferases (GSTs—see below for more detailed analyses). Other important gene family expansions include genes known for their importance in tick metabolism such as metallopeptidases or serpins. Other gene family expansions were unexpected and not previously described, such as acylcoA synthases and fatty acid elongases.

The number of gene families gained per branch was also estimated by summing all gene families present in a specific clade but absent from species outside of this clade. We identified 406 gene families that were gained in the common ancestor of hard ticks (i.e., these gene families were absent in all other Chelicerata, Fig. 3B). These results must be interpreted with caution; we noted that the largest gene family in this category, FAM007521, contains genes annotated as tumor necrosis factor receptor-associated factors (TRAFs), which are widespread in the Metazoa. Tick genes from this family show a low level of conservation since best hits have ~ 25% identities in BlastP comparisons with non-tick organisms. This suggests that this gene family contains highly divergent genes and would explain why this gene family did not have any potential orthologs in the Chelicerata.

The following sections represent detailed analyses on a selection of gene families that have been expanded in the common ancestor of hard ticks (Ixodidae). These families were chosen for this focus either because they represent the largest expansions or because of previous knowledge on their importance in tick biology. Additional analyses on structural and regulatory non-coding elements, neuropeptides, M13 metallopeptidases, Kunitz-domain proteins, cytochrome P450s, carboxylesterases, and ATP-binding cassette transporters can be found in a supplementary section (Additional File 3: Supplementary Results) [27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45].

Tick cystatins

The cystatins are a family of cysteine protease inhibitors. Iristatin has been characterized as a secreted immunomodulator from tick salivary glands found at the tick-host interface [46]. We found independent expansions of cystatins in I. scapularis (ISCP_027970 clade) and I. hexagonus (Ihex00005714 clade) and expansions of iristatins in both I. pacificus and I. ricinus (Additional File 2: Fig. S4). Cystatins and iristatins have a typical three-exon structure and are principally clustered in two genomic regions on scaffold 3 and scaffold 9, respectively (Additional File 1: Tab. S7). Cystatins correspond to family FAM006825 in the SiLiX analysis, which expanded significantly in the common ancestor of hard ticks, but not in branches deriving from this node (Additional File 1: Table S6). Iristatins have high expression either in hemocytes, or in ovaries, Malpighian tissues, salivary glands, and synganglion respectively (Additional File 2: Fig. S5).

Serpins

Serpins are protease inhibitors involved in the regulation of many physiological processes in vertebrates and invertebrates [47] and even viruses [48]. In ticks, serpins are salivary components responsible for anti-hemostatic, anti-inflammatory, and immunomodulatory effects in the vertebrate host [49]. Serpins from I. ricinus can be divided into two groups, Iripins (I. ricinus serpins) and IRIS-like serpins, which refers to the first described tick serpin IRIS [50]. Iripins bear signal peptides and are secreted from the cell, whereas IRIS-like serpins are most likely intracellular or secreted non-canonically. A total of 61 Iripins and 21 IRIS-like serpin sequences were found in the genome of I. ricinus. Gene expression differed among tick tissues and the number of exons ranged from 1 to 7 (Additional File 1: Table S8). Serpins were generally classified in the same SiLiX family (FAM001806), which was significantly expanded in the common ancestor of hard ticks. Our phylogenetic analysis found several clades of Iripins and one clade of IRIS genes (Fig. 4B), in agreement with a previous phylogenetic study [48]. Several serpins form clusters of closely related genes in the genome, suggesting they have arisen through tandem duplication (e.g., a cluster of 22 Iripins within a region of 600 Kbp on scaffold 14 of the I. ricinus genome, whereas most IRIS serpins form a cluster on scaffold 9). Mono-exonic genes were common in the Iripins (53% were intronless), whereas three other Iripins only had two exons, the first one being 5′ untranslated region (UTR) only. This gene structure suggests initial events of retroposition, followed by re-exonization of some genes, in addition to tandem duplication. Some serpins are expressed constitutively, while others are upregulated or downregulated by the blood meal (Fig. 4A). Many serpin genes, especially mono-exonic ones, had no or negligible levels of gene expression, which could explain why they were not found in transcriptomic studies. By contrast, the most highly expressed serpins, such as Iripin-01, Iripin-02, Iripin-03, Iripin-05 and Iripin-08 and IRIS-1, have been previously studied and characterized as anti-hemostatic, anti-inflammatory, and/or immunomodulatory proteins [50,51,52,53,54,55,56].

Fig. 4
figure 4

Evolution of serpins in the tick I. ricinus. A Serpin expression profile. The expression heatmap is based on log10(TPM) (transcripts per million) calculated for the respective transcriptomes: SYN, synganglion; SG, salivary glands; OV, ovary; MT, Malpighian tubules; MG, midgut; FB.T, fat body/trachea; UF, unfed females; F, fully fed females; WB, whole body. B Consensus phylogenetic tree of serpins from I. ricinus. Sequences were aligned as proteins; signal peptides and variable reactive center loops were removed before the analysis as well as non-informative positions. Edited protein sequences were analyzed by maximum likelihood method and JTT matrix-based model, and bootstrap method with 1000 replications was used to calculate the reliability of tree branches. Only branches with bootstrap value equal or higher than 50% are shown. Mono-exonic serpins are shown with an orange dot. Specific clades are represented by colored areas in the phylogenetic tree, using the same background color for sequence labels in the heatmap

Incomplete heme pathways and heme-independent iron inter-tissue trafficking

Heme is an essential molecule for living organisms, involved in multiple processes, and necessary for successful reproduction in ticks [57]. The I. ricinus genome only contains genes coding for the last three enzymes of the heme biosynthetic pathway: cpox, ppox, and fech, which code for coproporphyrinogen oxidase, protoporphyrinogen oxidase, and ferrochelatase, respectively. Ticks from the Metastriata group have lost the cpox gene (Fig. 5, Additional File 2: Fig. S6). Finally, soft ticks in the genus Ornithodoros have cpox, ppbox, and fetch but also carry the conserved genes pbgs and urod, which code for porphobilinogen synthase and uroporphyrinogen decarboxylase (Additional File 1: Table S9 and Additional File 2: Fig. S6). The absence of several genes in the heme pathway strongly suggests a loss of heme biosynthetic activity in all ticks, implying they only obtain haem from the blood meal, which agrees with previous studies on I. ricinus and other tick species [17, 57,58,59]. The heme pathway genes, cpox, ppox, and fech, have tissue-specific patterns of expression [57], suggesting functional transcripts. The function of these three proteins remains elusive, but the retention of their mitochondrial target sequence indicates a function in mitochondrial biology (Fig. 5). Finally, no sequence homologous to heme oxygenase was found in ticks, nor in other Chelicerata, as shown by the metabolic pathway reconstructed for porphyrin metabolism (Additional File 2: Fig. S6).

Fig. 5
figure 5

Genomic prediction for heme and iron biology. A Gene loss in the heme biosynthesis pathway in the genome of I. ricinus. The green color indicates the presence of the homologous gene in the I. ricinus genome, with predicted mitochondrial targeting of their protein products (DeepLoc8 prediction values in purple are shown below the enzyme name). B Two ferritin genes have been identified in the I. ricinus genome: ferritin 1 contains 5′ UTR iron-responsive element with the “head” part of the stem-loop structure and complementary bases forming the stem (the blue inset), while ferritin 2 contains a signal peptide (the orange inset) with high SignalP9 probability (shown above the inset). The 3-D reconstruction confirms the conservation of monomeric folding and assembly towards a 24-mer multimers of > 10 nm in external diameter

Ferritins are an essential actor of iron metabolism in ticks. Unlike in other arthropods, the absence of heme oxygenase in ticks means that iron homeostasis is indeed decoupled from haem homeostasis and is only ensured by these cytosolic proteins that store iron. Our study confirms previous findings that ticks genomes contain two ferritin genes [60] that respectively encode for intracellular Ferritin 1 (without a signal peptide) and secretory Ferritin 2 (with a signal peptide). The sequence of the ferritin 1 gene is preceded by a regulatory region called the iron responsive element (IRE) [61] (Fig. 5). Both tick ferritins fold into conserved four-helix bundle monomers that self-assemble into higher-order 24-homomeric structures (Fig. 5). The mechanisms of cellular export of tick Ferritin 2 nanocages [62] or their uptake by peripheral tissues remains unknown. The presence of two types of ferritin, allows ticks to store intracellular iron in tissues using Ferritin 1, and traffic non-haem iron between tissues using Ferritin 2 [63]. Binding of the iron regulatory protein to the IRE prevents expression of Ferritin 1 under low iron conditions [57, 60]. While these ferritin cages are often several nanometers in diameter, the secretory Ferritin 2 was identified within larger (~ 100 nm) hemolymphatic extracellular vesicles in Rhipicephalus haemaphysaloides and Hya. asiaticum ticks [64].

Gustatory receptors and ionotropic receptors for chemosensation

Non-insect arthropods, including ticks, use two major gene families for chemosensation, gustatory receptors (GRs) [65] and ionotropic receptors (IRs) [66]. The IRs are involved in the perception of both volatile odorants and tastes [67]. Ticks possess olfactory receptors that are part of the IR family and gustatory receptors that are part of either the IR family or the distinct GR family. Insects possess a third family of chemoreceptors, odorant receptors (ORs), which probably evolved in ancient insects from GRs [68]. While ticks seem to lack homologs of the insect ORs, their IRs and GRs are related to the insect IRs and GRs, respectively [69]. Arthropod chemosensory receptors are characterized by high sequence divergence, gene duplication, and gene loss [65, 70,71,72], and thus require careful curation, as performed here for I. ricinus.

The I. ricinus genome contains 159 IR and 71 GR genes, whereas 125 IR and 63 GR genes have been described for I. scapularis [73] (Additional File 1: Table S10). For I. ricinus, 65 of the 90 full-length IRs (72.2%) and 10 of the 51 full-length GRs (19.6%) were intronless. Acari GRs are highly divergent including the most functionally and genetically conserved GRs in insects, such as the receptors that detect sugar, fructose, carbon dioxide (CO2), and bitter taste [74,75,76]. We identified three clades of GRs in the Acari (Additional File 2: Fig. S7A), one clade was specific to the mite Galendromus occidentalis, another to ticks, while a third clade contained sequences from both mites and ticks. Several of the I. ricinus gene copies in this third clade were intronless, suggesting retroposition events. The fact that the G. occidentalis homologs are all multi-exonic in this clade suggests that the multi-exonic status was ancestral and that independent retroposition events occurred repeatedly in I. ricinus. For both GRs and IRs, we most often identified co-orthologs between I. ricinus and I. scapularis, although some expected orthologs were absent, or there were specific amplifications (e.g., there were 6 copies of a gene in I. ricinus that was co-orthologous to IscaGR13F in I. scapularis). We note that these differences may result from incomplete annotation in either species.

The phylogenetic study of IRs also found several clades based on gene structure and conservation (Additional File 2: Fig. S7B). A large clade of sequences was found only in Ixodes spp., all of which were intronless. A second large clade contained sequences from both the mite G. occidentalis and Ixodes spp. (all multi-exonic). Another 5 IR sequences of I. ricinus were found in clades more conserved between insects and Acari. In a separate phylogenetic analysis, we compared only these “conserved” IRs between insects and Acari (Additional File 2: Fig. S8), which allowed us to identify orthologs in I. ricinus for two IR coreceptors, IR25a and IR93a, which are conserved in all arthropods. Interestingly, IR25a and IR93a were co-linear in the genomes of arachnids, such as I. ricinus, I. scapularis, G. occidentalis (mite), and Argiope bruennichi (spider), but not in Lepidopteran insects, such as Spodoptera littoralis [77] and Heliconius melpomene [78] (not shown). We also found a pair of closely related sequences, IR101 and IR102, which cluster with the IR93a and IR76b clades (being basal to this group of sequences) and could therefore function as co-receptors in ticks. Finally, another sequence, IR103, was distantly related to the D. melanogaster IR proteins that detect humidity, heat, ammonia, or amines [79,80,81,82]. These stimuli are known attractants for hematophagous (blood feeding) arthropods and are potentially important for the biology of I. ricinus. We note that homologs of IR103 in G. occidentalis have undergone multiple duplications. Due to their fast evolution, our SiLiX clustering divided the IR and GR families into several gene families, and some of these contained only tick sequences. This was especially the case for the more divergent mono-exonic clades. However, the majority of multi-exonic IR sequences were contained in one gene family (FAM000240), which was classified as significantly expanded in the common ancestor of hard ticks. In summary, we found strong evidence of expansion of GR and IR genes in ticks. Detailed lists for GRs and IRs are given in Additional File 1: Tables S11 and S12, respectively.

Defensins

Defensins are small cationic antimicrobial peptides (AMPs) that are part of the innate immune system in both hard and soft tick species [83, 84]. A characteristic feature of these secreted effector molecules is their small molecular size of about 4 kDa and the conserved pattern of six cysteine residues forming three intrachain disulfide bridges [85]. We identified 14 genes encoding prepro-defensins 1–14 (def1-14) and 8 defensin-like peptides (DLP1-8). Most prepro-defensins (def1-12) were located in a cluster (range 28 Mbp) on scaffold 7, while the two remaining sequences (def13-14) were on scaffold 9, and adjacent. For the DLPs, 3 sequences (DLP1-3) were located on scaffold 7, in the same cluster as def1-12, while the other DLPs were located on scaffold 6 (Additional File 1: Table S13). This presence of clusters of genes and the absence of intron-less copies suggests repeated tandem duplications without retroposition events, unlike other gene families (Additional File 2: Fig. S9). Ortholog groups within Ixodes species were not easy to determine from our phylogeny because there were many “gaps” (subclades where one or more of the five Ixodes species were missing), which could be explained by either gene loss/expansion or incomplete annotation (an increased risk for these very short peptides, with a three- or four-exon structure). The predicted mature peptides of DLP1 and DLP2 have only four conserved cysteine residues, while DLP3 encodes a mature defensin with an extended C-terminal portion, and the DLP8 gene sequence is longer due to an internal insertion (Fig. 6A). The sequences of DLP4-7 lack furin cleavage motifs, but both canonical defensins and DLPs contain an N-terminal signal peptide, indicating that they are secreted peptides. Defensins def1 and def3 (which have identical amino acid sequences) were by far the most expressed genes in this family, especially in the hemocytes and midgut of fed ticks (Fig. 6B). Most defensins were more expressed in the hemocytes than in other tick tissues.

Fig. 6
figure 6

Multiple sequence alignment and tissue expression heatmap of identified I. ricinus prepro-defensins and defensin-like peptides. A Multiple amino-acid sequence alignment of identified prepro-defensins (def1-def14) and defensin-like peptides (DLP1-DLP8). Highlighted in yellow—genes located on scaffold 7; in blue—genes located on scaffold 9; in green—genes located on scaffold 6; red letters—furin cleavage motif; red dashed frame—predicted mature peptides; #—conserved cysteine residues. B Tissue expression heatmap based on TPM (transcripts per million) in respective transcriptomes using log transformation log10(TPM). SYN, synganglion; SG, salivary glands; OV, ovary; MT, Malpighian tubules; MG, midgut; FB.T, fat body/trachea; UF, unfed females; F, fully fed females; WB, whole body

Tick gene repertoires for detoxification

Ticks, like other arthropods, have developed a variety of mechanisms to cope with a wide range of endogenous or exogenous compounds. Detoxification processes occur in three main phases [86]. In phase 1, toxic substances are chemically modified to make them more reactive—this typically involves cytochromes P450 (CYPs) and carboxylesterases (CCEs). In phase 2, metabolites are conjugated to hydrophilic molecules to facilitate their excretion—this mainly involves glutathione S-transferases (GSTs), UDP-glycosyltransferases (UGTs), and cytosolic sulfotransferases (SULTs). In phase 3, conjugated metabolites are transported out of the cells by cellular transporters, such as ATP-binding cassette (ABC) transporters, which use the energy from ATP breakdown to transport molecules across lipid membranes. CYPs and CCEs are involved in various physiological processes such as digestion, reproduction, behavioral regulation, hormone biosynthesis, xenobiotic detoxification, and insecticide and acaricide resistance [87,88,89,90]. Of note, in I. ricinus, transcripts for several phase 2 detoxification enzymes were shown to be regulated by ingested heme from the host blood meal [58].

The I. ricinus genome contains all components of the classical three-phase detoxification system. We did not find homologs for UGTs in the tick genomes, which agrees with the demonstrated loss of this gene family in the common ancestor of the Chelicerata [91]. A summary of gene counts is given in Additional File 1: Table S14 and detailed lists of genes are given for CYPs, CCEs, GSTs, SULTS, and ABCs (Additional File 1: Tables S15 to S19, respectively).

Glutathione S-transferases

A total of 49 GST sequences were identified for both I. ricinus and I. scapularis. Interestingly, more than half of the IricGSTs—including the heme-responsive GST, here named GSTD2/IricT00009278 [92]—are located on scaffold 2, a cluster already identified in I. scapularis [18]. This distribution of GSTs may reflect numerous clade-specific duplication events at the root of GST diversity. Our phylogenetic analysis indicated some patterns of lineage-specific duplication within ticks; for example, several duplications in the Zeta class were specific to the genus Ixodes, and several duplications in the Epsilon class were specific to the Metastriata (Additional File 2: Fig. S10A). Similar to the CYPs and CCEs, several clusters of GSTs were well characterized by distinct gene expression profiles (Additional File 2: Fig. S10B). The GSTs corresponded largely to FAM000927 in the SiLiX clustering, which was found to be significantly expanded in the common ancestor of hard ticks, and also in the common ancestor of the I. ricinus species complex, this GST family having more genes in these four Ixodes species compared to I. hexagonus or the Metastriata ticks.

Cytosolic sulfotransferases

The SULT family was found to be significantly expanded in both the common ancestor of hard ticks and in all internal nodes of the hard tick evolutionary tree (SiLiX family FAM000226, Additional File 2: Table S6). This suggests a continuous trend of gene family expansion. Our phylogenetic analysis (Fig. 7A) allowed us to distinguish three clades in the SULT gene family. Clades A and B contain sequences found in the Chelicerata and in other Metazoa including humans and houseflies. These two clades have few or no duplications in the Chelicerata. For clade A, only one tick sequence was identified (SFT-142), which was homologous to the four D. melanogaster sequences (dmel-St1-4) and to the human ST4A1 sequence. Clade B also contained a single tick sequence (SFT-7), which was homologous to three human sequences (ST1A1, ST2A1 and ST6B1), but no sequences from D. melanogaster. Clades A and B have high bootstrap support, so we tentatively propose that an ancient duplication in the common ancestor of arthropods and vertebrates gave rise to these two clades. This would have been followed by secondary gene duplications or gene loss (e.g., loss of the clade B ortholog for D. melanogaster). All other SULT sequences form clade C and are exclusively found in the Chelicerata. Several independent gene expansions occurred within clade C, especially in ticks. While SULT genes in clades A and B have a similar structure and a similar number of exons (6 or 7), the SULT genes in clade C have a much lower number of exons. Most of the SULT genes in clade C are indeed mono-exonic, or have only two exons, the first one often non-coding (i.e., entirely 5′ UTR). This suggests an initial retroposition event, which would have generated an intron-less copy, at the origin of clade C, and thus probably in the common ancestor of Chelicerata, followed by a process of re-exonization of some of the copies. The genomic organization of copies, with several clusters of adjacent copies that are closely related, but also multiple localizations of such clusters, suggests that the SULT expansion is the product of a combination of tandem duplications and retroposition events.

Fig. 7
figure 7

Expansion of the cytosolic sulfotransferases (SULTs) in the genome of ticks and other Chelicerata, with evidence for both retroposition events and re-exonization. A Phylogenetic tree of cytosolic sulfotransferases (SULTs). ML tree using the best-fit LG + I + G4 model of substitution. Label prefixes correspond to each species: the tick I. ricinus (SFT, labels in blue), the spider Parasteatoda tepidariorum (ptepid), the horse-shoe crab Limulus polyphemus (limulus), human and Drosophila melanogaster (dmel). Well supported nodes are indicated by dark-filled circles (the width of circles varies with bootstrap values, ranging between 0.85 and 1). The chromosomal localization (scaffold number) is indicated in the first outer circle. The next outer circle are bar-charts of the number of coding exons (dark-gray filling) and 5′ UTR-only exons (orange filling). The tree allows to define two “conserved” clades (A—green and B—red), with sequences shared between vertebrates and Chelicerata, and a clade (C—blue) with sequences exclusively in Chelicerata. B Heatmap of expression of SULTs in I. ricinus

With respect to gene expression (Fig. 7B), the conserved copies (SFT-7 and SFT-142) are preferentially expressed in the ovaries of half-fed ticks, whereas many of the other SULTs were dominantly expressed in either the Malpighian tubules of unfed ticks, in larvae, or in salivary glands, with a generally low level of gene expression for the strictly mono-exonic copies.

Discussion

Importance of transposable elements in tick genomes, and definition of cellular genes

Our analysis of transposable elements (TEs) shows that they constitute a large proportion of the genomes in four Ixodes tick species, which is typical of large eukaryotic genomes. Interestingly, we identified Bov-B LINEs in the genome of I. ricinus, a subclass of retrotransposons abundant in vertebrates, which have also been found in the genomes of reptile ticks, with evidence of horizontal gene transfer between vertebrates and ticks [93, 94]. Bov-B LINE sequences were also found in the tick genus Nuttalliella [95] suggesting a widespread presence in ticks and possible independent acquisitions of transposons from vertebrate hosts. A recent study of the variation in gene presence-absence in tick genomes showed that most de novo genes and genes that are not consistently shared between species are TE-related [96]. Due to the unique dynamics of TEs, it is common practice to separate them from the core genome. However, in our comparative study of the Chelicerata genomes, we found that many predicted cellular genes—often from large gene families—were, in fact, TEs. This was particularly evident in the genomes of I. scapularis and Trichonephila clavata. Following best practices in comparative genomics, we excluded these TE-related genes from our analyses. It is important to highlight that previous studies have shown that some predicted TE-related genes may correspond to domesticated genes in both tick genomes [25] and vertebrate genomes [97], supported by phylogenetic analyses and expression patterns. While this could explain a subset of the TE-related genes identified in our study, determining the presence of domesticated TEs was beyond the scope of our research. Future research is required to determine whether the significant variation in TE numbers across tick species (including closely related ones) is due to annotation methods or reflects genuine differences in transposon dynamics.

Mechanisms of duplication, importance of intronless copies

Gene duplication can result from different mechanisms, with both small-scale events, such as tandem duplications and retropositions, and larger-scale events, such as segmental duplications or whole genome duplication (WGD) [98, 99]. In the Chelicerata, previous studies have shown that WGD has occurred independently in the Xiphosura (horseshoe crabs) [100] and in the common ancestor of scorpions and spiders [101, 102], whereas there is currently no evidence of WGD in ticks. A previous study based on the genetic distances between paralogs [103] suggested that large-scale duplications may have occurred in ancestral ticks. However, the absence of duplicated Hox genes in ticks [102] and our analyses of macro-synteny between tick genomes both suggest that large-scale duplications have not occurred in tick genomes.

Our study identified several large gene families in ticks, often with physical clusters of closely related copies, suggesting that they arose through tandem duplications. Strikingly, many gene families also harbor a high percentage of intronless genes. Intronless copies result from the retroduplication of poly-exonic genes where the mRNA of the ancestral poly-exonic gene is reverse transcribed into DNA and then inserted back elsewhere into the genome [104, 105]. Retroposed genes were typically considered as secondary and accidental events, because they generate copies that lack regulatory sequences (and are therefore expected to be eliminated by selection) and because of their small number. However, it has been shown that these retroposed genes can re-acquire new exons, typically in the 5′ UTR, allowing the restoration of a promoter sequence [106, 107]. This mechanism could “stabilize” duplicated genes and facilitate their retention in the genome [108]. We observed this phenomenon in several gene families, indicating secondary exon acquisition in initially retroposed gene copies. Mono-exonic genes often had low or null levels of expression (with some exceptions), whereas the expression of genes with secondarily acquired introns was in the range of the multi-exonic copies. Thus, our study suggests a high gene duplication rate in tick genomes, driven by a combination of transposon-based retroposition events and tandem duplication. While most duplicate gene copies are probably destined for elimination (e.g., as shown by the many expressionless partial copies for the serpins or SULTs), some copies may be retained by selection and acquire new functions.

Functional importance of gene expansions

Gene duplication is a major force in the evolution of genomes and in the adaptative potential of organisms [105, 109]. Although most duplicated genes are eliminated, some new genes undergo sub-functionalization or acquire new functions [99, 110]. Ticks are no exception and are expected to show specific gene duplications linked with the constraints and particularities of their parasitic life-style. In hematophagous (blood feeding) arthropods, genes involved in host-parasite interactions and blood processing (e.g., salivary gland proteins and proteases) are more likely to show gene expansion [14, 111, 112]. Ticks possess multigene families involved in various physiological processes, such as detoxification of host molecules, evasion of host immune defenses, and sensory perception [14]. The co-evolutionary arms race between ticks and their vertebrate hosts exerts particularly strong selection on the tick genes involved in searching for a host, attachment to the host, and blood-feeding. Many of these host-tick interaction genes are expressed in the tick salivary glands, which produce a complex mixture of proteins that have anti-hemostatic, anti-inflammatory, and anti-immunity functions in addition to facilitating tick attachment and tick blood feeding. Thus, the duplication of the tick salivary gland genes contributes to the diversification of these functionally important genes in the frame of the co-evolutionary arms race with their hosts [113, 114].

Our comparative genomics analysis allowed us to identify important expansions of gene families in tick genomes. Genes associated with detoxification processes showed high expansions in tick genomes (especially in the common ancestor of hard ticks compared to the other Chelicerata. For example, the high number of GST genes identified in tick species may be related to specific adaptations during the rapid evolution of this group towards its modern avian and mammalian hosts [115] but may also compensate for the absence of UDP-glycosyltransferases in ticks [91]. In ticks and mites, several genes of the ABC sub-family C (ABC-C) have been shown to confer acaricide resistance [116, 117]. With 55 genes encoding for ABC-Cs, I. ricinus appears to be well equipped to evolve resistance to acaricides, although these genes are probably also involved in physiological processes. For example, the ABC transporter of R. microplus (RmABCB10) mediates the transport of dietary acquired heme across cell membranes, and is being studied as an acaricidal target [118].

We observed expansions of several gene families that encode metalloproteases, in particular for the M13 metalloprotease family. Metalloproteases are the most abundant protease class in ticks [119]; they are secreted in tick saliva at the tick host interface [120, 121] and are recognized as immunogens inherent to blood feeding [119, 121,122,123,124]. The M13 gene family was the second most-expanded gene family in ticks compared to other Chelicerata, with, interestingly, even larger numbers of gene copies in the non-Ixodes species of hard ticks (~ 130 in Rhipicephalus, and ~ 220 in Dermacentor than in Ixodes species (~ 90 genes)—confirming previous analyses by [17]. However, most of the members of the M13 gene family lack SignalP prediction and only some are expressed by the tick salivary glands. The very diverse combinations of conserved/substituted residues [30] among the whole collection of sequences present in the I. ricinus genome suggests a full spectrum of proteins with diverse functions, including neo-functionalized proteins, possible scavengers and inactive enzymes. The conservation of the zinc-binding motif is always coupled with a secretion signal peptide, and not with transmembrane domains. Also, proteins containing conserved active domains often have low expression in the tick tissues, suggesting that these proteins play discrete functions across multiple tissues. The functions of highly expressed, possibly inactive, M13 homologs in ticks are currently unknown.

The most spectacular gene family expansion concerned the family of cytosolic sulfotransferases (SULTs). The research on SULTs is relatively new compared to the research on detoxification enzymes like cytochrome P450s and UGTs [125]. The characterization of SULTs in model organisms like humans, zebrafish, and the house fly has shown the importance of these enzymes in the detoxification of exogenous compounds and in the sulfation of key endogenous compounds. The number of tick SULTs (~ 130 genes in Metastriata and ~ 200 genes in Ixodes) was considerably higher compared to any other Chelicerata species, vertebrates (~ 20 genes), and the housefly (four genes), suggesting that these genes have acquired an important function in tick biology. As discussed above, this expansion of the SULT genes in ticks appears to result from a combination of retroposition and tandem duplications, as observed in other tick gene families. We even observed many additional SULT gene fragments (i.e., genes that were not annotated due to partial sequences and no transcription support, and thus were not included in the gene counts), further showing that SULT gene duplication is highly dynamic and is possibly mediated by nearby transposable elements, some of the copies being eliminated. This observation agrees with an analysis of presence-absence variation (PAV) [96] based on tick genomes sequenced in another study [17]. The study of Rosani et al. identified sulfotransferases among sequences with a strong signal of PAV, along with transposable elements. We hypothesize that with such a dynamic SULT toolbox, ticks may differ at the population level or even at the individual level in the number of SULT copies. Different SULT genes may be adapted to detoxify different substrates, such as the host molecules ingested with the blood. Thus, the function of SULTs could be the digestion and/or detoxification of these vertebrate host blood components. Alternatively, the function of SULTs could be the modification and enhancement of some of the compounds in the tick saliva, creating a cocktail of molecules that modulates the host immune response. We also observed tissue-specific gene expression of SULTs; many SULT genes had higher expression in the Malpighian tubules, an excretory organ that has several functions in ticks, such as the excretion of nitrogenous wastes, osmoregulation, water balance, and detoxification. Other groups of SULT copies were primarily expressed in eggs or larvae, indicating also specialization of SULT gene expression by tick developmental stage. Overall, the remarkable expansion and dynamic evolution of SULT genes in ticks underscore their potentially critical role in tick biology. This gene family also exemplifies the role of retroposition and monoexonic copies in the rapid and species-specific amplification of some of the tick gene families.

Conclusions

In conclusion, our comparative analysis of the genomes of four species within the genus Ixodes, namely I. ricinus, I. pacificus, I. persulcatus, and I. hexagonus, shed new light on the genomic characteristics of ticks. Through genome sequencing and assembly, and by emphasizing the chromosome-level assembly in I. ricinus, we achieved a detailed understanding of the genomic architecture in Ixodes ticks. The macro-syntenic analysis highlighted the conservation of genomic organization between I. ricinus and I. scapularis, with few structural rearrangements. Our annotation efforts, including manual curation for I. ricinus, revealed that a high proportion of tick genes have an unusual, intronless structure, indicating frequent retroposition events. We have highlighted the significant role of gene family expansions in the evolution of tick genomes which have undergone highly dynamic gains and losses of genes, alongside expansions and contractions of gene families, showcasing a remarkable adaptation to their parasitic lifestyle. Our comprehensive analysis of the genomes of four Ixodes species offers a rich understanding of tick genomics and sets the stage for future functional studies.

Methods

Tick sampling

For I. ricinus, we used a laboratory population originally derived from wild ticks in the region of Neuchâtel, Switzerland, and maintained in the laboratory since 1980. This population was maintained in small numbers (~ 30 adults) and sexual reproduction was conducted on an annual basis, conditions expected to have favored inbreeding. Unfed adult females were isolated for sequencing. For I. pacificus, ticks were collected from the vegetation on November 9, 2017, in Del Valle Regional Park California (37° 48′ 15.71″ N, 122° 16′ 16.0″ W). Adult female ticks were put in RNAlater and shipped to the Nantes laboratory under cold conditions (~ 4 °C) for DNA extraction. Ticks from the species I. persulcatus were sampled in the Tokachi district, Hokkaido, Japan. Adult female ticks were fed on Mongolian gerbils, and their salivary glands (SG) were dissected. Ixodes hexagonus ticks were isolated from a live hedgehog collected at Oudon, France (47° 20′ 50″ N, 1° 17′ 09″ W) and sent to a wildlife recovery center (CVFSE, Oniris, Nantes, France). Fully engorged I. hexagonus nymphs were collected and maintained in the lab until they molted into adult ticks.

DNA extraction

For the four species of Ixodes, a single adult female was used for genome sequencing. For I. persulcatus, DNA from salivary glands (SGs) was extracted following a standard SDS/ProK and phenol/isopropanol protocol and stored at – 30 °C. For the three other tick species, DNA extraction of the whole body was performed following the salting-out protocol recommended for DNA extraction from single insects by 10X Genomics. The quantities of DNA engaged for each genome sequencing were 0.7 μg (I. persulcatus), 2.3 μg (I. hexagonus), 0.5 μg (I. ricinus), and 1.6 μg (I. pacificus).

10X library preparation and sequencing

For each species, linked read sequencing libraries were constructed using the Chromium Gel Bead and Library Kit (10X Genomics, Pleasanton, CA, USA) and the Chromium instrument (10X Genomics) following the manufacturer’s instructions. Prior to DNA library construction, tick DNA fragments were size-selected using the BluePippin pulsed field electrophoresis system (Sage Science, Beverly, MA, USA); size selection was adjusted according to the initial size of the extracted DNA fragments (ranging in size from 20 to 80 Kb). Approximately 1 ng of high molecular weight (HMW) genomic DNA (gDNA) was used as input for Chromium Genome library preparation (v1 and v2 chemistry), which was added on the 10X Chromium Controller to create Gel Bead in-Emulsions (GEMs). The Chromium controller partitions and barcodes each HMW gDNA fragment. The resulting genome GEMs underwent isothermal incubation to generate 10X barcoded amplicons from which an Illumina library was constructed. The resulting 10X barcoded sequencing library was then quantified by Qubit Fluorometric Quantitation; the insert size was checked using an Agilent 2100 and finally quantified by qPCR (Kapa Biosystems, Wilmington, MA, USA). These libraries were sequenced with 150 bp paired-end reads on an Illumina HiSeq 4000 or NovaSeq 6000 instrument (Illumina, San Diego, CA, USA).

Hi-C library preparation and sequencing

Hi-C library preparation and sequencing were performed by Dovetail Genomics (Dovetail Genomics, Scotts Valley, CA, USA). A total of one Chicago library (sample DTG_CHI_604) and two Hi-C libraries (samples DTG_HiC-739 and DTG_HiC-765) were prepared, following the protocol described in [126]. The two Hi-C libraries represented two independent preparations from the same sample. The concentrations and mean library sizes for each sample were as follows: DTG_CHI_604 (Tick Chicago) had a concentration of 45.75 nM (18.80 ng/μL) and a mean library size of 442 nt; DTG_HiC-739 (Tick Hi-C) had a concentration of 26.25 nM (9.28 ng/μL) and a mean library size of 493 nt; DTG_HiC-765 (Tick Hi-C) had a concentration of 30.25 nM (11.20 ng/μL) and a mean library size of 525 nt. Each library had a volume of 40 μL. All three libraries were sequenced at Genoscope (Évry, France) using an Illumina HiSeq 4000 instrument (Illumina, San Diego, CA, USA). The libraries were loaded onto a patterned flowcell at a concentration of 200 pM per lane, with each library occupying a single lane. Sequencing generated the following paired-end reads: DTG_CHI_604 produced 339,756,352 reads, DTG_HiC-739 produced 324,999,014 reads, and DTG_HiC-765 produced 310,469,272 reads. These high coverage read counts were crucial for downstream analysis.

Illumina short-read filtering

Short Illumina reads were bioinformatically post-processed as previously described [127, 128] to filter out low quality data. Low-quality nucleotides (Q < 20) were discarded from both read ends, Illumina sequencing adapters and primer sequences were removed, and only reads ≥ 30 nucleotides were retained. These filtering steps were done using an in-house-designed software based on the FastX package (https://www.genoscope.cns.fr/fastxtend/). Genomic reads were then mapped to the phage phiX genome and aligned reads were identified and discarded using SOAP aligner [129] (default parameters) and the Enterobacteria phage PhiX174 reference sequence (GenBank: NC_001422.1). Standard metrics for sequencing data are available in Additional File 1: Table S20).

Genome sizes and heterozygosity rate

Genome sizes and heterozygosity rates were estimated using Genomescope2 [130] with a kmer size of 31. Genome size ranged between 1.8 Gbp (I. pacificus) and 2.6 Gbp (I. ricinus and I. hexagonus). Heterozygosity rates varied between 0.82% (I. ricinus) and 3.17% (I. pacificus).

Tick genome assemblies (10X Genomics) and scaffolding

Genomes were assembled using the Supernova software from 10X Genomics. I. ricinus was assembled using the 1.2.0 version while I. hexagonus, I. pacificus, and I. persulcatus were assembled with the 2.1.1 version of the assembler. Hi-C scaffolding of the I. ricinus genome assembly was performed by Dovetail using both Chicago and Hi-C libraries. The RagTag [131] software (version 1.0.1) was used to scaffold the genomes of I. hexagonus, I. pacificus, and I. persulcatus by using the chromosome-scale assembly of I. ricinus as an anchor. RagTag was launched with default options and with Minimap 2.17 [129] for the mapping step.

Gene prediction

The genome assemblies of I. ricinus, I. hexagonus, I. pacificus, and I. persulcatus were masked using RepeatMasker (http://repeatmasker.org/, default parameters) with Metazoa transposable elements from Repbase (version 20,150,807 from RepeatMasker package) and RepeatModeler [132] with default parameters (version v2.0.1). The proteomes of Varroa destructor, Centruroides sculpturatus, and Stegodyphus mimosarum were used to detect conserved genes in the four tick genome assemblies. In addition, a translated pan-transcriptome of 27 tick species [20] was aligned on the four tick genome assemblies. The proteomes were aligned against genome assemblies in two steps. BLAT [133] with default parameters was used to efficiently delineate a genomic region corresponding to the given protein. The best match and matches with a score ≥ 90% of the best match score were retained and alignments were refined using Genewise [134] with default parameters, which allows for accurate detection of intron/exon boundaries. Alignments were kept if more than 50% of the length of the protein was aligned to the genome.

We also used RNA-Seq data to allow the prediction of expressed and/or specific genes. Two transcriptome libraries of synganglia [135] from I. ricinus (PRJEB40724), a library from the whole body of I. ricinus (GFVZ00000000.1), and a pan-transcriptome of 27 different tick species [20] were aligned on the four genome assemblies. As for protein sequences, these transcripts were first aligned with BLAT where the best match (based on the alignment score) was selected. Alignments were then refined within previously identified genomic regions using Est2Genome [136] to define intron/exon boundaries. Alignments were retained if more than 80% of the transcript length was aligned to the genome with a minimum percent identity of 95%.

Genes were predicted on the four genome assemblies by integrating protein and transcript alignments as well as ab initio predictions using a combiner called Gmove [137]. Single-exon genes with a coding sequence length smaller or equal to 100 amino acids were filtered out. Additionally, putative transposable elements (TEs) were removed from the predicted genes using three different approaches: (i) genes that contain a TE domain from InterPro, (ii) transposon-like genes detected using TransposonPSI (http://transposonpsi.sourceforge.net/, default parameters), (iii) and genes overlapping repetitive elements. Finally, InterProScan [138] (version v5.41–78.0, with default parameters) was used to detect conserved protein domains in predicted genes. Predicted genes without conserved domains and covered by at least 90% of their cumulative exonic length by repeats, or matching TransposonPSI criteria or selected IPR domains, were removed from the gene set.

Genomic web portal, Apollo server, and manual curation

A genomic portal was set up at the BIPAA platform (https://bipaa.genouest.org/), providing access to raw data and to a set of web tools facilitating data exploration and analysis (BLAST application [139], including a JBrowse genome browser [140], GeneNoteBook [141]). Automatic functional annotation of genes was performed using Diamond 2.0.13 [142] against the nr databank (2022–12-11), EggNog-Mapper 2.1.9 [143], InterProScan 5.59–91.0 [138], and Blast2Go 1.5.1 (2021.04 database) [144]. Results were then made available into GeneNoteBook. The BIPAA Apollo [145] server was also used in order to improve the annotation for I. ricinus, based on expert knowledge of several functional groups of gene. Based on the automatic annotation, alignments of RNA-Seq data (reads of selected libraries and contigs of reference transcriptomes), localization of TEs, and of non-coding RNAs, experts were able to perform manual curation of the annotation (gene model edition and functional annotation refinement, including gene naming). Manual curation data was merged with OGS 1.0 (automatic annotation) to produce a new reference annotation named OGS1.1. This merging was performed using ogs_merge (https://github.com/genouest/ogs-tools version 0.1.2). The OGS1.1 was functionally annotated using the same procedure as OGS1.0 and made available into GeneNoteBook.

Synteny analysis

Synteny analyses between I. scapularis and I. ricinus were performed using CHRoniCle (January 2015) and SynChro (January 2015) [146], which use protein similarity to determine syntenic blocks across these two genomes. The results were parsed and then plotted into chromosomes using chromoMap R package (v0.4.1 [147]). Parity plots were built using a homemade R script (R version v4.2.2) based on hit tables, for two comparisons: I. ricinus versus I. scapularis, and I. ricinus versus D. silvarum. Hit tables were generated using BlastP (NCBI Blast + v2.13.0 [139]) with the following options: e-value cutoff = 1e − 5, max HSPs = 1, and max target sequences = 1.

Transposable element annotation

Repeated elements (REs) and transposable elements (TEs) were first annotated by RepeatModeler (v 2.0.2a (Smit, AFA, Hubley, R. RepeatModeler Open-1.0. 2008–2015 http://www.repeatmasker.org)) using NCBI BLAST for alignment. Predicted TEs or REs matching with gene sequences from the OGS1.1 prediction were removed from the RepeatModeler results. Finally, a second round of annotation was performed by RepeatMasker (v 4.1.1 (Smit, AFA, Hubley, R & Green, P. RepeatMasker Open-4.0. 2013–2015, http://www.repeatmasker.org)) using the filtered results of RepeatModeler combined with curated sequences from metazoans contained in the Dfam database (v3.8). For Bov-B LINEs, sequences annotated as RTE/Bov-B by RepeatModeler were extracted and blasted by tblastx against curated Bov-B LINE sequences contained in the Dfam database (downloaded on 5 December 2023). Only hits with an e-value lower than 1e − 3 were kept and sequences aligning with curated Bov-B sequences were considered as BovB LINE retrotransposons.

Gene clustering, definition of gene families

In addition to the four genomes newly sequenced, the genomes of other tick species and other arachnids were included for a study of phylogeny and evolutionary dynamics of gene repertoires—these sequences are listed in Additional File 1: Table S21. Genome completeness was assessed by BUSCO (v5.4.4) for each species [148] and using the Arachnida database (OrthoDB10) [149], with an e-value threshold of 1e − 5. For each species, the longest protein isomorph of each gene was extracted using AGAT tools (v0.8.1, https://zenodo.org/records/5834795) in order to retain a single sequence per gene. Three clustering methods were then compared, using several well-annotated gene families as benchmark tests. We found that SiLiX (v1.2.11 [150]) generated larger gene families that typically matched well the curated gene families, whereas OrthoMCL (v2.0.9 [151]) and OrthoFinder (v2.5.2 [152]) defined gene families at a much smaller grain (clades with fewer genes, of more closely related sequences, as shown for two examples of gene families in Additional File 2: Fig. S11). We therefore decided to use SiLiX for the clustering. Pairwise comparisons between all arthropod species were carried out using BlastP (NCBI Blast + v2.13.0 [139]). Parameters for clustering were a minimum percentage identity of 0.30 and a minimum length of 75 residues. The SiLiX output was then parsed to obtain a contingency table, while a BlastP search against the Swiss-Prot database (12th of January 2023 download), using an e-value threshold of 1e − 5, was performed to attribute gene ontology (GO) terms to each gene family.

Phylogeny of the Chelicerata

The SiLiX output was then parsed to obtain a contingency table, while a BlastP search against the Swiss-Prot database (12th of January 2023 download), using an e-value threshold of 1e − 5, was performed to attribute GO terms to each gene family. For gene families containing a single gene sequence in each species, protein sequences were aligned using MAFFT (v7.487 [153]) with the “genafpair” alignment algorithm and 1000 iterations. The generated alignments were trimmed with TrimAl (v1.4.1 [154]), using a gap threshold of 0.6, and concatenated using goalign (v0.3.2 [155]). A phylogenetic tree was then generated with IQ-TREE (v2.1.3 [156]). The best model was identified by the software using BIC and branch supports were estimated with 1000 bootstrap replicates and SH-like approximate likelihood ratio tests. Non-concatenated alignments were also used to construct trees for each family in order to check branch length and topology for all single-copy families. Owing to the low completeness of the Hae. longicornis and Hya. asiaticum genomes, illustrated by BUSCO metrics and lower number of gene families compared to other ticks (see the “Results” section), we removed these species from further analyses. Our phylogeny thus included 107 shared single-copy orthologs, for 19 species of Chelicerata.

To further evaluate the phylogenetic relationships among Ixodes species included in our data set, some of them being very closely related, the nucleotide sequences of the single-copy genes of the five species included in our study were aligned using MAFFT (with the “genafpair” alignment algorithm, 1000 iterations and using the BLOSUM 80 matrix). Aligned sequences were trimmed with TrimAl (using a gap threshold of 0.8) before being concatenated with goalign. Finally, a phylogenetic tree was built with IQ-TREE. As described previously, the best model was identified by IQ-TREE using BIC and branch supports were estimated with 1000 bootstrap replicates and SH-like approximate likelihood ratio tests.

Global study of the evolutionary dynamics of gene families

To analyze changes in gene family size across the phylogeny, CAFE5 (v5.0 [157]) was run using the previously generated contingency table and the species phylogenetic tree, rooted and previously transformed into an ultrametric tree using phytools (v 1.5–1 [158]) and the ape R packages (v5.7–1 [159]) with R version v4.2.2. The horseshoe crab, Limulus polyphemus, was used as an outgroup to root the tree. The error model was estimated before running the base model with a dataset cleaned of highly divergent families. The base model predicted a lambda value of 0.451, which was used for a second run of CAFE5 with the full family dataset. This second run was then used to predict gene family expansion/contraction.

Metabolic pathways

KEGG orthology (KO) numbers were assigned to each protein sequence of the Chelicerata species used in this study using eggNOG-mapper (v2.1.10 [143]. KO numbers of each species were regrouped into five taxonomic groups (Ixodes spp., Metastriata ticks, Mesostigmata + Acariformes, spiders and finally a group including C. sculpturatus and L. polyphemus). Assessment of gene presence/absence was made based on a majority rule in each group (a gene being considered present in each group if present in the majority of its species), to avoid both the effects of incomplete or spurious annotation, or of contamination. Metabolic maps and reactions were then reconstructed using the Reconstruct tool of KEGG Mapper [160]. Maps and detailed patterns of presence/absence in each species are available in the BIPAA webpage for the I. ricinus genome (see https://bipaa.genouest.org/sp/ixodes_ricinus/download/).

Expression atlas

Transcriptomic data of I. ricinus were downloaded from the NCBI SRA archive (see Additional File 1: Table S22 for detailed information) using the SRAtoolkit (v3.0.0 (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software)). A preliminary study of read quality, quantity, and homogeneity between replicates was first conducted, which led us to retain only a sample of the many data sets already published for this species. Whenever possible, we retained two replicates for a given combination of tissue and condition (typically either unfed ticks, or half-replete ticks), to include a minimal level of replication. The transcriptome datasets were filtered and trimmed using Trimmomatic (v0.39 [161]). To filter rRNA sequences from the datasets, paired-reads were mapped on an RNA-Seq contig from I. ricinus described previously [162]; this sequence of 7065 bp was found to represent a cluster with complete 18S, 5.8S, and 28S subunits of rRNA. Mapping was performed with Hisat2 (v2.2.1 [163]). Unmapped paired-reads (non-rRNA) were extracted using bamUtil (v1.0.14 [164]) and Samtools (v1.16.1 [165]), and read quality was checked using MultiQC (v1.14 [166]) on FastQC (v0.11.7) outputs. Another run of Trimmomatic was then performed on the retained reads, which were then mapped on the I. ricinus genome with Hisat2. Mapped reads were finally sorted with Samtools, and the number of mapped reads per gene was retrieved by the FeatureCount R function contained in the Rsubread package (v3.16 [167]). Counts were then converted into transcripts per million (TPMs)—Additional File 1: Table S23. The Spearman correlation heatmap was built for each gene family using the heatmap.2 function (gplots packages v3.1.3), and tree/gene model/heatmap figures were built using a homemade script using treeio (v1.22.0), ggtree (v3.6.2 [168] and ggplot2 (v3.4.2 [169] https://ggplot2.tidyverse.org)) packages.

Annotation of structural and regulatory non-coding elements

To ensure reliable and accurate annotation of structural and regulatory non-coding elements, we used several approaches, software, and databases. Initially, we used Infernal and the latest version of the Rfam database to identify ncRNAs and cis-regulatory elements in the I. ricinus genome [170, 171]. Subsequently, we used tRNAscan-SE to annotate transfer RNAs in the I. ricinus genome [172] and sRNAbench to identify the most accurate set of miRNAs and their genomic positions [173].

For the annotation of long non-coding RNAs (lncRNAs), we relied on the lncRNA dataset compiled and analyzed by Medina et al. [13, 27]. These studies resulted in a consensus set of lncRNAs, which we considered to be high confidence lncRNAs. First, we confirmed the absence of coding properties in these lncRNAs using CPC2 [174]. Next, we aligned the lncRNAs to the I. ricinus genome using Blat and retained alignments with a score above 90. Finally, to eliminate potential assembly artifacts and avoid interference with the set of coding RNAs, we used gffcompare to remove any lncRNAs that overlapped with coding RNAs [165] [175].

Annotation and phylogeny of protease inhibitors in I. ricinus

To annotate proteins containing the Kunitz domain (KDCP) and cystatins, we combined I. ricinus mRNA sequences from different sources: our initial prediction of coding sequences for the I. ricinus genome (version OGS1.0), the National Center for Biotechnology Information (NCBI), and the transcriptome assembled by Medina et al. [13]. We used TransDecoder to extract coding sequences (CDSs) from the mRNA sequences. To eliminate redundancy, we used CD-HIT [176] coding sequences: sequences with a CDS showing 98% identity and at least 70% coverage were identified as redundant, and the longer sequence in each cluster was chosen. Next, InterProScan and BlastP were used to identify proteins belonging to the Cystatin and Kunitz families [138, 177, 178]. These sequences were then aligned with the I. ricinus genome using Blat [133]. The automatic annotation was then manually curated on the basis of expression and junction data using the Apollo annotation platform [179], the result of all our annotations being present in the OGS1.1 version of the genome prediction.

For phylogenetic analysis, we included gene sequences from I. scapularis [18] as well as from the four genomes sequenced in the present study. For the three species other than I. ricinus sequenced in our study, cystatins and Kunitz domain-containing proteins were annotated using the same method as described for I. ricinus, but no manual curation was performed. SignalP was used to identify the signal peptide of cystatins and Kunitz domain-containing proteins [180]. Clustal Omega was then used to perform a multiple sequence alignment of the mature protein sequence [181]. Spurious sequences and misaligned regions from the multiple alignment were removed using trimAl [154]. The phylogenetic tree was then constructed using FastME [182] and visualized with the R software ggtree [168]. Detailed lists of KDCPs and cystatins are given in Additional File 1: Table S7.

Serpins

The search for serpins in the genome was performed by BLAST, either with blastn algorithm (protein query against translated nucleotide sequences) or tblastn (translated nucleotides query against translated nucleotide sequences), and sequences from the original prediction were manually curated. Sequences were aligned and edited as proteins by using the ClustalW algorithm and Maximum likelihood method and the JTT matrix-based model and bootstrap method with 1000 replications was used to calculate the reliability of tree branches.

Metallobiology and ferritins

The search for heme synthesis and degrading enzymes in the I. ricinus genome was performed by BLAST, with the tblastn or blastp algorithms, using the sequences of Dermanyssus gallinae [183] as queries. The Alphafold2 algorithm was used to predict the monomeric structure of ferritin-1 and ferritin-2 identified in the I. ricinus genome. The resulting PDB files were used for Swiss Homology Modelling [184] to predict the structure of their multimeric assemblies, using human Ferritin heavy chain as a template (3ajo.1.A; seq identity 67%, CMQE 0.89). Measures of the external diameters of the protein multimers were performed in PyMol.

Chemoreceptors: annotation and phylogenetic study

The annotation of the two major chemosensory gene families of I. ricinus was based on known sequences from the closely related species I. scapularis [73]. The I. ricinus scaffolds with significant blast hits (cutoff value 1e − 30) were retrieved to generate a subset of the genome for each chemosensory gene family. Gene models were obtained after amino acid sequences were aligned to this subset of the genome using Exonerate [185]. All gene models thus generated were manually validated or corrected via Apollo. Based on homology with I. scapularis sequences, matching parts were joined when located on different scaffolds. The classification of deduced proteins and their integrity were verified using blastp against the nonredundant (NR) GenBank database. When genes were suspected to be split on different scaffolds, protein sequences were merged for further analyses.

Following alignment and phylogenetic proximity, I. ricinus sequences were named after their I. scapularis orthologs according to [73]. The abbreviations Iric and Isca are used before the gene names to clarify the species, I. ricinus and I. scapularis, respectively. The gene family was defined by IR or GR for ionotropic or gustatory receptors, respectively, and was followed by a number designating a different gene sequence. A supplementary number was given for closely related sequences. For example, if a receptor was named IscaIRX for I. scapularis, the closest I. ricinus receptor was named IricIRX. If other sequences were closely related, they were named IricIRX-1, IricIRX-2, etc. The IRs and GRs of the phytoseid predatory mite, G. occidentalis [186], were ultimately added to the phylogenetic analysis as well as the Drosophila melanogaster IRs. On the GR dendrogram, we used the Mocc abbreviations, while for the IR dendrogram, we did not display the name abbreviations for reasons of clarity. Finally, for the phylogenetic analysis of the conserved IRs, we included the two Ixodes spp. (Ir and Is), G. occidentalis (Mo), D. melanogaster (Dm), Spodoptera littoralis (Sp), and Heliconius melpomene (Hm).

Multiple sequence alignment was performed by MAFFT v7 (Katoh & Standley 2013), and maximum-likelihood phylogenies were built using the Smart Model Selection [182] in PhyML 3.0 [187] (http://www.atgc-montpellier.fr/phyml/) which automatically selects the best substitution model. Node support was estimated using the approximate likelihood ratio test via SH-like aLRT [188]. Trees were retrieved with FigTree v1.4.4 (https://github.com/rambaut/figtree), and images were edited with PowerPoint software.

Defensins

Identification of genes encoding defensins in the I. ricinus genome was performed by tBlastn search (default parameter) using annotated I. ricinus prepro-defensin transcripts from a variety of I. ricinus transcriptomes deposited in GenBank (NCBI) as queries. Two types of defensin-related genes were considered, prepro-defensins and defensin-like peptides. Prepro-defensins were defined using three criteria for canonical tick defensins: (i) conserved pattern of cysteine residues in the C-terminal part of the molecule, (ii) presence of the furin cleavage motif (R)VRR, and (iii) mature peptide of ~ 4 kDa. Sequences that differed in any of these criteria were termed defensin-like peptides. Phylogenetic analysis of the prepro-defensins and defensin-like proteins of I. ricinus was performed using sequences including other Ixodes sp. (I. scapularis, I. persulcatus, I. hexagonus, I. pacificus) and constructed as above for Kunitz-type and cystatin protease inhibitors.

Detoxification

We used different sets of tick CYP, CCE, GST, and ABC transporter proteins from published studies (for I scapularis) or from the NCBI website automatic annotation databases to search the I. ricinus genome using TBLASTN with Galaxy [189], Exonerate, and Scipio to align protein sequences to the genome and define intron/exon boundaries [190]. All generated gene models were manually validated or corrected based on homology with other tick sequences and aligned with RNA-seq data when available. In addition, a direct search was performed using the keyword search on the NCBI website to search for specific protein domains in the databases as well as a search in the automated annotation for I. ricinus. The classification of the deduced proteins and their integrity were checked using BlastP against the non-redundant GenBank database. When genes were suspected to be split into different scaffolds, the protein sequences were merged for further analysis. All active sites were confirmed using the NCBI CD search program. Phylogenetic trees were constructed using PhyML [187] based on the best substitution model determined by the SMS server [191], and both SPR (Subtree-Pruning-Regrafting) and NNI (Nearest-Neighbor-Interchange) methods were applied to improve tree topology. Branch supports were estimated by a Bayesian transformation of aLRT (aBayes) [192]. Dendrograms were constructed and edited using the FigTree software (http://tree.bio.ed.ac.uk/software/figtree/). RNA-Seq data, as described on the “Expression atlas” section, were used to construct heatmaps for each enzyme family using TBtools [193].

Manual curation of cytosolic sulfotransferases (SULTs) was carried out for I. ricinus, adding several new genes that had not been predicted in the initial automatic annotation, in particular, mono-exonic genes. For a phylogenetic study of this expanded gene family, we included sequences from the horseshoe crab P. polyphemus and a spider, Parasteatoda tepidariorum, as well as from two model organisms where SULTs have been well characterized, humans and the housefly (sult1-4). For human sequences, we chose one sequence in each of the major clades of SULTs, as previously characterized [125]. Sequences assumed to be incomplete based on the predicted gene model and sequence length were discarded. Sequences were aligned with Muscle. The alignment was cleaned with Gblocks, using the following options: -b2 = 112 -b3 = 20 -b4 = 2 -b5 = h, implying a low-stringency (small blocks and gaps in up to half of the sequences being allowed). A ML phylogenetic tree was obtained with IQ–tree [194], using Model Finder to determine the best model of substitution [195]. Branch support was assessed using 1000 ultrafast bootstrap replicates [196], and the resulting tree was graphically edited with ITOL [197].

Data availability

Sequencing data are available as follows: for I. persulcatus, 10X Genomics reads [198], assembly and gene predictions [199]; for I. hexagonus, 10X Genomics reads [200], assembly and gene predictions [201]; for I. pacificus, 10X Genomics reads [202], assembly and gene predictions [203]; for I. ricinus, 10X Genomics and Hi-C reads [204], assembly and gene predictions [205]. 

A genome page for each of the four species of this study is available at https://bipaa.genouest.org/is/ticks/: it contains blast forms, a GeneNoteBook page providing detailed information on each individual gene, and a genome browser and gives the possibility to download sequences, annotation files, expression data (RNA-Seq based atlas, for I. ricinus), and maps of metabolic pathways.

Abbreviations

SULTs:

Cytosolic sulfotransferases

TEs:

Transposable elements

LINEs:

Long interspersed nuclear elements

GO:

Gene ontology

MFS:

Major facilitator superfamily

GSTs:

Glutathione S-transferases

TRAFs:

Tumor necrosis factor receptor-associated factors

UTR:

Untranslated region

IRE:

Iron responsive element

GRs:

Gustatory receptors

IRs:

Ionotropic receptors

ORs:

Odorant receptors

DLP:

Defensin-like peptides

CYPs:

Cytochromes P450

CCEs:

Carboxylesterases

UGTs:

UDP-glycosyltransferases

ABC:

ATP-binding cassette

WGD:

Whole genome duplication

PAV:

Presence-absence variation

HMW:

High molecular weight

bp:

Base pairs

CDS:

Coding sequence

KDCP:

Kunitz domain-containing protein

NCBI:

National Center for Biotechnology Information

OGS:

Official Gene Get

References

  1. Jongejan F, Uilenberg G. The global importance of ticks. Parasitology. 2004;129:S3-14.

    Article  PubMed  Google Scholar 

  2. Mans BJ, de Castro MH, Pienaar R, de Klerk D, Gaven P, Genu S, et al. Ancestral reconstruction of tick lineages. Ticks Tick-Borne Dis. 2016;7:509–35.

    Article  PubMed  Google Scholar 

  3. Dunlop JA. Geological history and phylogeny of Chelicerata. Arthropod Struct Dev. 2010;39:124–42.

    Article  PubMed  Google Scholar 

  4. Sharma PP, Kaluziak ST, Pérez-Porro AR, González VL, Hormiga G, Wheeler WC, et al. Phylogenomic interrogation of Arachnida reveals systemic conflicts in phylogenetic signal. Mol Biol Evol. 2014;31:2963–84.

    Article  CAS  PubMed  Google Scholar 

  5. Lozano-Fernandez J, Tanner AR, Giacomelli M, Carton R, Vinther J, Edgecombe GD, et al. Increasing species sampling in chelicerate genomic-scale datasets provides support for monophyly of Acari and Arachnida. Nat Commun. 2019;10:2295.

    Article  PubMed  PubMed Central  Google Scholar 

  6. Ballesteros JA, Santibáñez López CE, Kováč Ľ, Gavish-Regev E, Sharma PP. Ordered phylogenomic subsampling enables diagnosis of systematic errors in the placement of the enigmatic arachnid order Palpigradi. Proc R Soc B Biol Sci. 2019;286:20192426.

  7. Zhang Y-X, Chen X, Wang J-P, Zhang Z-Q, Wei H, Yu H-Y, et al. Genomic insights into mite phylogeny, fitness, development, and reproduction. BMC Genomics. 2019;20:954.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Sharma PP, Ballesteros JA, Santibáñez-López CE. What is an “arachnid”? Consensus, consilience, and confirmation bias in the phylogenetics of Chelicerata. Diversity. 2021;13:568.

    Article  CAS  Google Scholar 

  9. Guglielmone AA, Robbins RG, Apanaskevich DA, Petney TN, Estrada-Peña A, Horak IG, et al. The Argasidae, Ixodidae and Nuttalliellidae (Acari: Ixodida) of the world: a list of valid species names. Zootaxa. 2010;2528:1–28.

    Article  Google Scholar 

  10. Mans BJ, de Klerk D, Pienaar R, Latif AA. Nuttalliella namaqua: a living fossil and closest relative to the ancestral tick lineage: implications for the evolution of blood-feeding in ticks. PLoS ONE. 2011;6: e23675.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Mans BJ, de Klerk D, Pienaar R, de Castro MH, Latif AA. The mitochondrial genomes of Nuttalliella namaqua (Ixodea: Nuttalliellidae) and Argas africolumbae (Ixodae: Argasidae): estimation of divergence dates for the major tick lineages and reconstruction of ancestral blood-feeding characters. PLoS ONE. 2012;7.

  12. Mans BJ. Chemical equilibrium at the tick–host feeding interface: a critical examination of biological relevance in hematophagous behavior. Front Physiol. 2019;10.

  13. Medina JM, Jmel MA, Cuveele B, Gómez-Martín C, Aparicio-Puerta E, Mekki I, et al. Transcriptomic analysis of the tick midgut and salivary gland responses upon repeated blood-feeding on a vertebrate host. Front Cell Infect Microbiol. 2022;12.

  14. Mans BJ, Featherston J, de Castro MH, Pienaar R. Gene duplication and protein evolution in tick-host interactions. Front Cell Infect Microbiol. 2017;7.

  15. Geraci NS, Spencer Johnston J, Paul Robinson J, Wikel SK, Hill CA. Variation in genome size of argasid and ixodid ticks. Insect Biochem Mol Biol. 2007;37:399–408.

    Article  CAS  PubMed  Google Scholar 

  16. Gulia-Nuss M, Nuss AB, Meyer JM, Sonenshine DE, Roe RM, Waterhouse RM, et al. Genomic insights into the Ixodes scapularis tick vector of Lyme disease. Nat Commun. 2016;7:10507.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Jia N, Wang J, Shi W, Du L, Sun Y, Zhan W, et al. Large-scale comparative analyses of tick genomes elucidate their genetic diversity and vector capacities. Cell. 2020;182:1328-1340.e13.

    Article  CAS  PubMed  Google Scholar 

  18. De S, Kingan SB, Kitsou C, Portik DM, Foor SD, Frederick JC, et al. A high-quality Ixodes scapularis genome advances tick science. Nat Genet. 2023;55:301–11.

    Article  CAS  PubMed  Google Scholar 

  19. Nuss AB, Lomas JS, Reyes JB, Garcia-Cruz O, Lei W, Sharma A, et al. The highly improved genome of Ixodes scapularis with X and Y pseudochromosomes. Life Sci Alliance. 2023;6.

  20. Charrier NP, Hermouet A, Hervet C, Agoulon A, Barker SC, Heylen D, et al. A transcriptome-based phylogenetic study of hard ticks (Ixodidae). Sci Rep. 2019;9.

  21. Keirans JE, Needham GR, Oliver Jr JR. The Ixodes ricinus complex worldwide: diagnosis of the species in the complex, hosts and distribution. In: Acarology IX: Symposia. Colombus, Ohio; 1999. p. 341–7.

  22. Xu G, Fang QQ, Keirans JE, Durden LA. Molecular phylogenetic analyses indicate that the ixodes ricinus complex is a paraphyletic group. J Parasitol. 2003;89:452–7.

    Article  CAS  PubMed  Google Scholar 

  23. Sands AF, Apanaskevich DA, Matthee S, Horak IG, Harrison A, Karim S, et al. Effects of tectonics and large scale climatic changes on the evolutionary history of Hyalomma ticks. Mol Phylogenet Evol. 2017;114:153–65.

    Article  PubMed  Google Scholar 

  24. Oliver JH. Cytogenetics of mites and ticks. Annu Rev Entomol. 1977;22:407–29.

    Article  PubMed  Google Scholar 

  25. Ribeiro JMC, Bayona-Vásquez NJ, Budachetri K, Kumar D, Frederick JC, Tahir F, et al. A draft of the genome of the Gulf Coast tick. Amblyomma maculatum Ticks Tick-Borne Dis. 2023;14: 102090.

    Article  PubMed  Google Scholar 

  26. Van Dam MH, Trautwein M, Spicer GS, Esposito L. Advancing mite phylogenomics: designing ultraconserved elements for Acari phylogeny. Mol Ecol Resour. 2019;19:465–75.

    Article  PubMed  Google Scholar 

  27. Medina JM, Abbas MN, Bensaoud C, Hackenberg M, Kotsyfakis M. Bioinformatic analysis of Ixodes ricinus long non-coding RNAs predicts their binding ability of host miRNAs. Int J Mol Sci. 2022;23:9761.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Donohue KV, Khalil SMS, Ross E, Grozinger CM, Sonenshine DE, Michael RR. Neuropeptide signaling sequences identified by pyrosequencing of the American dog tick synganglion transcriptome during blood feeding and reproduction. Insect Biochem Mol Biol. 2010;40:79–90.

    Article  CAS  PubMed  Google Scholar 

  29. Waldman J, Xavier MA, Vieira LR, Logullo R, Braz GRC, Tirloni L, et al. Neuropeptides in Rhipicephalus microplus and other hard ticks. Ticks Tick-Borne Dis. 2022;13: 101910.

    Article  PubMed  PubMed Central  Google Scholar 

  30. Meyer H, Buhr A, Callaerts P, Schiemann R, Wolfner MF, Marygold SJ. Identification and bioinformatic analysis of neprilysin and neprilysin-like metalloendopeptidases in Drosophila melanogaster. MicroPublication Biol. 2021;2021.

  31. Bland ND, Pinney JW, Thomas JE, Turner AJ, Isaac RE. Bioinformatic analysis of the neprilysin (M13) family of peptidases reveals complex evolutionary and functional relationships. BMC Evol Biol. 2008;8:16.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Jmel MA, Voet H, Araújo RN, Tirloni L, Sá-Nunes A, Kotsyfakis M. Tick salivary Kunitz-type inhibitors: targeting host hemostasis and immunity to mediate successful blood feeding. Int J Mol Sci. 2023;24:1556.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Francischetti IMB, Valenzuela JG, Andersen JF, Mather TN, Ribeiro JMC. Ixolaris, a novel recombinant tissue factor pathway inhibitor (TFPI) from the salivary gland of the tick, Ixodes scapularis: identification of factor X and factor Xa as scaffolds for the inhibition of factor VIIa/tissue factor complex. Blood. 2002;99:3602–12.

    Article  CAS  PubMed  Google Scholar 

  34. Nazareth RA, Tomaz LS, Ortiz-Costa S, Atella GC, Ribeiro JMC, Francischetti IMB, et al. Antithrombotic properties of Ixolaris, a potent inhibitor of the extrinsic pathway of the coagulation cascade. Thromb Haemost. 2006;96:7–13.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. De Paula VS, Sgourakis NG, Francischetti IMB, Almeida FCL, Monteiro RQ, Valente AP. NMR structure determination of Ixolaris and factor X(a) interaction reveals a noncanonical mechanism of Kunitz inhibition. Blood. 2019;134:699–708.

    Article  PubMed  PubMed Central  Google Scholar 

  36. Dermauw W, Van Leeuwen T, Feyereisen R. Diversity and evolution of the P450 family in arthropods. Insect Biochem Mol Biol. 2020;127: 103490.

    Article  CAS  PubMed  Google Scholar 

  37. De Rouck S, İnak E, Dermauw W, Van Leeuwen T. A review of the molecular mechanisms of acaricide resistance in mites and ticks. Insect Biochem Mol Biol. 2023;159: 103981.

    Article  PubMed  Google Scholar 

  38. Claudianos C, Ranson H, Johnson RM, Biswas S, Schuler MA, Berenbaum MR, et al. A deficit of detoxification enzymes: pesticide sensitivity and environmental response in the honeybee. Insect Mol Biol. 2006;15:615–36.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  39. Grbić M, Van Leeuwen T, Clark RM, Rombauts S, Rouzé P, Grbić V, et al. The genome of Tetranychus urticae reveals herbivorous pest adaptations. Nature. 2011;479:487–92.

    Article  PubMed  PubMed Central  Google Scholar 

  40. Bajda S, Dermauw W, Greenhalgh R, Nauen R, Tirry L, Clark RM, et al. Transcriptome profiling of a spirodiclofen susceptible and resistant strain of the European red mite Panonychus ulmi using strand-specific RNA-seq. BMC Genomics. 2015;16:974.

    Article  PubMed  PubMed Central  Google Scholar 

  41. Dermauw W, Wybouw N, Rombauts S, Menten B, Vontas J, Grbić M, et al. A link between host plant adaptation and pesticide resistance in the polyphagous spider mite Tetranychus urticae. Proc Natl Acad Sci. 2013;110:E113–22.

    Article  CAS  PubMed  Google Scholar 

  42. Neese PA, E. Sonenshine D, Kallapur VL, Apperson CS, Roe RM. Absence of insect juvenile hormones in the American dog tick, Dermacentor variabilis (Say) (Acari:Ixodidae), and in Ornithodoros parkeri Cooley (Acari:Argasidae). J Insect Physiol. 2000;46:477–90.

  43. Zhu J, Khalil SM, Mitchell RD, Bissinger BW, Egekwu N, Sonenshine DE, et al. Mevalonate-farnesal biosynthesis in ticks: comparative synganglion transcriptomics and a new perspective. PLoS ONE. 2016;11: e0141084.

    Article  PubMed  PubMed Central  Google Scholar 

  44. Yang Z-M, Wu Y, Li F-F, Zhou Z-J, Yu N, Liu Z-W. Genomic identification and functional analysis of JHAMTs in the pond wolf spider. Pardosa pseudoannulata Int J Mol Sci. 2021;22:11721.

    Article  CAS  PubMed  Google Scholar 

  45. Smykal V, Dolezel D. Evolution of proteins involved in the final steps of juvenile hormone synthesis. J Insect Physiol. 2023;145: 104487.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Kotál J, Stergiou N, Buša M, Chlastáková A, Beránková Z, Řezáčová P, et al. The structure and function of Iristatin, a novel immunosuppressive tick salivary cystatin. Cell Mol Life Sci. 2019;76:2003–13.

    Article  PubMed  PubMed Central  Google Scholar 

  47. Huntington JA. Serpin structure, function and dysfunction. J Thromb Haemost. 2011;9:26–34.

    Article  CAS  PubMed  Google Scholar 

  48. Spence MA, Mortimer MD, Buckle AM, Minh BQ, Jackson CJ. A comprehensive phylogenetic analysis of the serpin superfamily. Mol Biol Evol. 2021;38:2915–29.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  49. Abbas MN, Chlastáková A, Jmel MA, Iliaki-Giannakoudaki E, Chmelař J, Kotsyfakis M. Serpins in tick physiology and tick-host interaction. Front Cell Infect Microbiol. 2022;12: 892770.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Leboulle G, Crippa M, Decrem Y, Mejri N, Brossard M, Bollen A, et al. Characterization of a novel salivary immunosuppressive protein from Ixodes ricinus ticks *. J Biol Chem. 2002;277:10083–9.

    Article  CAS  PubMed  Google Scholar 

  51. Chmelar J, Oliveira CJ, Rezacova P, Francischetti IMB, Kovarova Z, Pejler G, et al. A tick salivary protein targets cathepsin G and chymase and inhibits host inflammation and platelet aggregation. Blood. 2011;117:736–44.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Chlastáková A, Kotál J, Beránková Z, Kaščáková B, Martins LA, Langhansová H, et al. Iripin-3, a new salivary protein isolated from Ixodes ricinus ticks, displays immunomodulatory and anti-hemostatic properties in vitro. Front Immunol. 2021;12: 626200.

    Article  PubMed  PubMed Central  Google Scholar 

  53. Chlastáková A, Kaščáková B, Kotál J, Langhansová H, Kotsyfakis M, Kutá Smatanová I, et al. Iripin-1, a new anti-inflammatory tick serpin, inhibits leukocyte recruitment in vivo while altering the levels of chemokines and adhesion molecules. Front Immunol. 2023;14:1116324.

    Article  PubMed  PubMed Central  Google Scholar 

  54. Kascakova B, Kotal J, Martins LA, Berankova Z, Langhansova H, Calvo E, et al. Structural and biochemical characterization of the novel serpin Iripin-5 from Ixodes ricinus. Acta Crystallogr Sect Struct Biol. 2021;77(Pt 9):1183–96.

    Article  CAS  Google Scholar 

  55. Kascakova B, Kotal J, Havlickova P, Vopatkova V, Prudnikova T, Grinkevich P, et al. Conformational transition of the Ixodes ricinus salivary serpin Iripin-4. Acta Crystallogr Sect Struct Biol. 2023;79(Pt 5):409–19.

    Article  CAS  Google Scholar 

  56. Kotál J, Polderdijk SGI, Langhansová H, Ederová M, Martins LA, Beránková Z, et al. Ixodes ricinus salivary serpin Iripin-8 inhibits the intrinsic pathway of coagulation and complement. Int J Mol Sci. 2021;22:9480.

    Article  PubMed  PubMed Central  Google Scholar 

  57. Perner J, Sobotka R, Sima R, Konvickova J, Sojka D, Oliveira PL de, et al. Acquisition of exogenous haem is essential for tick reproduction. eLife. 2016;5:e12318.

  58. Perner J, Provazník J, Schrenková J, Urbanová V, Ribeiro JMC, Kopáček P. RNA-seq analyses of the midgut from blood- and serum-fed Ixodes ricinus ticks. Sci Rep. 2016;6:36695.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  59. Perner J, Gasser RB, Oliveira PL, Kopáček P. Haem biology in metazoan parasites – ‘the bright side of haem.’ Trends Parasitol. 2019;35:213–25.

    Article  CAS  PubMed  Google Scholar 

  60. Hajdusek O, Sojka D, Kopacek P, Buresova V, Franta Z, Sauman I, et al. Knockdown of proteins involved in iron metabolism limits tick reproduction and development. Proc Natl Acad Sci. 2009;106:1033–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Kopáček P, Ždychová J, Yoshiga T, Weise C, Rudenko N, Law JH. Molecular cloning, expression and isolation of ferritins from two tick species—Ornithodoros moubata and Ixodes ricinus. Insect Biochem Mol Biol. 2003;33:103–13.

    Article  PubMed  Google Scholar 

  62. Oh HJ, Jung Y. High order assembly of multiple protein cages with homogeneous sizes and shapes via limited cage surface engineering. Chem Sci. 2023;14:1105–13.

    Article  CAS  PubMed  Google Scholar 

  63. Perner J, Hajdusek O, Kopacek P. Independent somatic distribution of heme and iron in ticks. Curr Opin Insect Sci. 2022;51: 100916.

    Article  PubMed  Google Scholar 

  64. Xu Z, Wang Y, Sun M, Zhou Y, Cao J, Zhang H, et al. Proteomic analysis of extracellular vesicles from tick hemolymph and uptake of extracellular vesicles by salivary glands and ovary cells. Parasit Vectors. 2023;16:125.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  65. Robertson HM, Warr CG, Carlson JR. Molecular evolution of the insect chemoreceptor gene superfamily in Drosophila melanogaster. Proc Natl Acad Sci. 2003;100 suppl_2:14537–42.

  66. Rytz R, Croset V, Benton R. Ionotropic Receptors (IRs): Chemosensory ionotropic glutamate receptors in Drosophila and beyond. Insect Biochem Mol Biol. 2013;43:888–97.

    Article  CAS  PubMed  Google Scholar 

  67. Joseph RM, Carlson JR. Drosophila chemoreceptors: a molecular interface between the chemical world and the brain. Trends Genet. 2015;31:683–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  68. Missbach C, Dweck HK, Vogel H, Vilcinskas A, Stensmyr MC, Hansson BS, et al. Evolution of insect olfactory receptors eLife. 2014;3: e02115.

    PubMed  Google Scholar 

  69. Eyun S, Soh HY, Posavi M, Munro JB, Hughes DST, Murali SC, et al. Evolutionary history of chemosensory-related gene families across the Arthropoda. Mol Biol Evol. 2017;34:1838–62.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  70. Robertson HM, Wanner KW. The chemoreceptor superfamily in the honey bee, Apis mellifera: expansion of the odorant, but not gustatory, receptor family. Genome Res. 2006;16:1395–403.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  71. McBride CS. Rapid evolution of smell and taste receptor genes during host specialization in Drosophila sechellia. Proc Natl Acad Sci. 2007;104:4996–5001.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  72. McBride CS, Arguello JR. Five Drosophila genomes reveal nonneutral evolution and the signature of host specialization in the chemoreceptor superfamily. Genetics. 2007;177:1395–416.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  73. Josek T, Walden KKO, Allan BF, Alleyne M, Robertson HM. A foreleg transcriptome for Ixodes scapularis ticks: candidates for chemoreceptors and binding proteins that might be expressed in the sensory Haller’s organ. Ticks Tick-Borne Dis. 2018;9:1317–27.

    Article  PubMed  Google Scholar 

  74. Kent LB, Robertson HM. Evolution of the sugar receptors in insects. BMC Evol Biol. 2009;9:41.

    Article  PubMed  PubMed Central  Google Scholar 

  75. Shim J, Lee Y, Jeong YT, Kim Y, Lee MG, Montell C, et al. The full repertoire of Drosophila gustatory receptors for detecting an aversive compound. Nat Commun. 2015;6:8867.

    Article  CAS  PubMed  Google Scholar 

  76. Kumar A, Tauxe GM, Perry S, Scott CA, Dahanukar A, Ray A. Contributions of the conserved insect carbon dioxide receptor subunits to odor detection. Cell Rep. 2020;31: 107510.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Meslin C, Mainet P, Montagné N, Robin S, Legeai F, Bretaudeau A, et al. Spodoptera littoralis genome mining brings insights on the dynamic of expansion of gustatory receptors in polyphagous noctuidae. G3 GenesGenomesGenetics. 2022;12:jkac131.

  78. van Schooten B, Jiggins CD, Briscoe AD, Papa R. Genome-wide analysis of ionotropic receptors provides insight into their evolution in Heliconius butterflies. BMC Genomics. 2016;17:254.

    Article  PubMed  PubMed Central  Google Scholar 

  79. Min S, Ai M, Shin SA, Suh GSB. Dedicated olfactory neurons mediating attraction behavior to ammonia and amines in Drosophila. Proc Natl Acad Sci. 2013;110:E1321–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  80. Hussain A, Zhang M, Üçpunar HK, Svensson T, Quillery E, Gompel N, et al. Ionotropic chemosensory receptors mediate the taste and smell of polyamines. PLOS Biol. 2016;14: e1002454.

    Article  PubMed  PubMed Central  Google Scholar 

  81. Knecht ZA, Silbering AF, Ni L, Klein M, Budelli G, Bell R, et al. Distinct combinations of variant ionotropic glutamate receptors mediate thermosensation and hygrosensation in Drosophila. eLife. 2016;5:e17879.

  82. Knecht ZA, Silbering AF, Cruz J, Yang L, Croset V, Benton R, et al. Ionotropic receptor-dependent moist and dry cells control hygrosensation in Drosophila. eLife. 2017;6:e26654.

  83. Kopácek P, Hajdusek O, Buresová V, Daffre S. Tick innate immunity. Adv Exp Med Biol. 2010;708:137–62.

    Article  PubMed  Google Scholar 

  84. Wu J, Zhou X, Chen Q, Chen Z, Zhang J, Yang L, et al. Defensins as a promising class of tick antimicrobial peptides: a scoping review. Infect Dis Poverty. 2022;11:71.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  85. Wang Y, Zhu S. The defensin gene family expansion in the tick Ixodes scapularis. Dev Comp Immunol. 2011;35:1128–34.

    Article  CAS  PubMed  Google Scholar 

  86. Després L, David J-P, Gallet C. The evolutionary ecology of insect resistance to plant chemicals. Trends Ecol Evol. 2007;22:298–307.

    Article  PubMed  Google Scholar 

  87. Oakeshott J, Claudianos C, Campbell PM, Newcomb R, Russell RJ. Biochemical genetics and genomics of insect esterases. In: Comprehensive Molecular Insect Science-Pharmacology. Gilbert, L.I., Iatrou K., Gill S.S. Amsterdam, The Netherlands: Elsevier; 2005. p. 309–81.

  88. Rewitz KF, Rybczynski R, Warren JT, Gilbert LI. The Halloween genes code for cytochrome P450 enzymes mediating synthesis of the insect moulting hormone. Biochem Soc Trans. 2006;34:1256–60.

    Article  CAS  PubMed  Google Scholar 

  89. Beugnet F, Franc M. Insecticide and acaricide molecules and/or combinations to prevent pet infestation by ectoparasites. Trends Parasitol. 2012;28:267–79.

    Article  CAS  PubMed  Google Scholar 

  90. Nauen R, Bass C, Feyereisen R, Vontas J. The role of cytochrome P450s in insect toxicology and resistance. Annu Rev Entomol. 2022;67:105–24.

    Article  CAS  PubMed  Google Scholar 

  91. Ahn S-J, Dermauw W, Wybouw N, Heckel DG, Van Leeuwen T. Bacterial origin of a diverse family of UDP-glycosyltransferase genes in the Tetranychus urticae genome. Insect Biochem Mol Biol. 2014;50:43–57.

    Article  CAS  PubMed  Google Scholar 

  92. Perner J, Kotál J, Hatalová T, Urbanová V, Bartošová-Sojková P, Brophy PM, et al. Inducible glutathione S-transferase (GST1) from the tick Ixodes ricinus is a haem-binding protein. Insect Biochem Mol Biol. 2018;95:44–54.

    Article  CAS  PubMed  Google Scholar 

  93. Walsh AM, Kortschak RD, Gardner MG, Bertozzi T, Adelson DL. Widespread horizontal transfer of retrotransposons. Proc Natl Acad Sci. 2013;110:1012–6.

    Article  CAS  PubMed  Google Scholar 

  94. Puinongpo W, Singchat W, Petpradub S, Kraichak E, Nunome M, Laopichienpong N, et al. Existence of Bov-B LINE retrotransposons in snake lineages reveals recent multiple horizontal gene transfers with copy number variation. Genes. 2020;11:1241.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  95. Mans BJ, de Klerk D, Pienaar R, de Castro MH, Latif AA. Next-generation sequencing as means to retrieve tick systematic markers, with the focus on Nuttalliella namaqua (Ixodoidea: Nuttalliellidae). Ticks Tick-Borne Dis. 2015;6:450–62.

    Article  PubMed  Google Scholar 

  96. Rosani U, Sollitto M, Fogal N, Salata C. Comparative analysis of presence-absence gene variations in five hard tick species: impact and functional considerations. Int J Parasitol. 2023. https://doiorg.publicaciones.saludcastillayleon.es/10.1016/j.ijpara.2023.08.004.

    Article  PubMed  Google Scholar 

  97. Gao B, Wang Y, Diaby M, Zong W, Shen D, Wang S, et al. Evolution of pogo, a separate superfamily of IS630-Tc1-mariner transposons, revealing recurrent domestication events in vertebrates. Mob DNA. 2020;11:25.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  98. Wolfe KH, Shields DC. Molecular evidence for an ancient duplication of the entire yeast genome. Nature. 1997;387:708–13.

    Article  CAS  PubMed  Google Scholar 

  99. Kuzmin E, Taylor JS, Boone C. Retention of duplicated genes in evolution. Trends Genet. 2022;38:59–72.

    Article  CAS  PubMed  Google Scholar 

  100. Kenny NJ, Chan KW, Nong W, Qu Z, Maeso I, Yip HY, et al. Ancestral whole-genome duplication in the marine chelicerate horseshoe crabs. Heredity. 2016;116:190–9.

    Article  CAS  PubMed  Google Scholar 

  101. Aase-Remedios ME, Janssen R, Leite DJ, Sumner-Rooney L, McGregor AP. Evolution of the spider homeobox gene repertoire by tandem and whole genome duplication. Mol Biol Evol. 2023;40:msad239.

  102. Schwager EE, Sharma PP, Clarke T, Leite DJ, Wierschin T, Pechmann M, et al. The house spider genome reveals an ancient whole-genome duplication during arachnid evolution. BMC Biol. 2017;15:62.

    Article  PubMed  PubMed Central  Google Scholar 

  103. Van Zee JP, Schlueter JA, Schlueter S, Dixon P, Sierra CAB, Hill CA. Paralog analyses reveal gene duplication events and genes under positive selection in Ixodes scapularis and other ixodid ticks. BMC Genomics. 2016;17:241.

    Article  PubMed  PubMed Central  Google Scholar 

  104. Long M, Betrán E, Thornton K, Wang W. The origin of new genes: glimpses from the young and old. Nat Rev Genet. 2003;4:865–75.

    Article  CAS  PubMed  Google Scholar 

  105. Ohno S. Evolution by gene duplication. Berlin, Heidelberg: Springer; 1970.

    Book  Google Scholar 

  106. Fablet M, Bueno M, Potrzebowski L, Kaessmann H. Evolutionary origin and functions of retrogene introns. Mol Biol Evol. 2009;26:2147–56.

    Article  CAS  PubMed  Google Scholar 

  107. Vinckenbosch N, Dupanloup I, Kaessmann H. Evolutionary fate of retroposed gene copies in the human genome. Proc Natl Acad Sci. 2006;103:3220–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  108. Micheli G, Camilloni G. Can introns stabilize gene duplication? Biology. 2022;11:941.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  109. Lynch M. Gene duplication and evolution. Science. 2002;297:945–7.

    Article  CAS  PubMed  Google Scholar 

  110. Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290:1151–5.

    Article  CAS  PubMed  Google Scholar 

  111. Arcà B, Lombardo F, Struchiner CJ, Ribeiro JMC. Anopheline salivary protein genes and gene families: an evolutionary overview after the whole genome sequence of sixteen Anopheles species. BMC Genomics. 2017;18:153.

    Article  PubMed  PubMed Central  Google Scholar 

  112. Ruzzante L, Reijnders MJMF, Waterhouse RM. Of genes and genomes: mosquito evolution and diversity. Trends Parasitol. 2019;35:32–51.

    Article  CAS  PubMed  Google Scholar 

  113. Chmelař J, Kotál J, Karim S, Kopacek P, Francischetti IMB, Pedra JHF, et al. Sialomes and mialomes: a systems-biology view of tick tissues and tick–host interactions. Trends Parasitol. 2016;32:242–54.

    Article  PubMed  Google Scholar 

  114. Francischetti IMB, Mans BJ, Meng Z, Gudderra N, Veenstra TD, Pham VM, et al. An insight into the sialome of the soft tick, Ornithodorus parkeri. Insect Biochem Mol Biol. 2008;38:1–21.

    Article  CAS  PubMed  Google Scholar 

  115. Parola P, Raoult D. Ticks and tickborne bacterial diseases in humans: an emerging infectious threat. Clin Infect Dis. 2001;32:897–928.

    Article  CAS  PubMed  Google Scholar 

  116. Shakya M, Sharma AK, Kumar S, Upadhaya D, Nagar G, Singh K, et al. Acaricides resistance in Rhipicephalus microplus and expression profile of ABC-transporter genes in the sampled populations. Exp Parasitol. 2023;252: 108584.

    Article  CAS  PubMed  Google Scholar 

  117. Wu M, Zhang Y, Tian T, Xu D, Wu Q, Xie W, et al. Assessment of the role of an ABCC transporter TuMRP1 in the toxicity of abamectin to Tetranychus urticae. Pestic Biochem Physiol. 2023;195: 105543.

    Article  CAS  PubMed  Google Scholar 

  118. Lara FA, Pohl PC, Gandara AC, Ferreira J da S, Nascimento-Silva MC, Bechara GH, et al. ATP binding cassette transporter mediates both heme and pesticide detoxification in tick midgut cells. PLOS ONE. 2015;10:e0134779.

  119. Ali A, Khan S, Ali I, Karim S, da Silva Vaz Jr. I, Termignoni C. Probing the functional role of tick metalloproteases. Physiol Entomol. 2015;40:177–88.

    Article  CAS  Google Scholar 

  120. Francischetti IMB, Mather TN, Ribeiro JMC. Cloning of a salivary gland metalloprotease and characterization of gelatinase and fibrin(ogen)lytic activities in the saliva of the Lyme disease tick vector Ixodes scapularis. Biochem Biophys Res Commun. 2003;305:869–75.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  121. Perner J, Helm D, Haberkant P, Hatalova T, Kropackova S, Ribeiro JM, et al. The central role of salivary metalloproteases in host acquired resistance to tick feeding. Front Cell Infect Microbiol. 2020;10.

  122. Decrem Y, Mariller M, Lahaye K, Blasioli V, Beaufays J, Zouaoui Boudjeltia K, et al. The impact of gene knock-down and vaccination against salivary metalloproteases on blood feeding and egg laying by Ixodes ricinus. Int J Parasitol. 2008;38:549–60.

    Article  CAS  PubMed  Google Scholar 

  123. Becker M, Felsberger A, Frenzel A, Shattuck WMC, Dyer M, Kügler J, et al. Application of M13 phage display for identifying immunogenic proteins from tick (Ixodes scapularis) saliva. BMC Biotechnol. 2015;15:43.

    Article  PubMed  PubMed Central  Google Scholar 

  124. Jarmey JM, Riding GA, Pearson RD, McKenna RV, Willadsen P. Carboxydipeptidase from Boophilus microplus: a “concealed” antigen with similarity to angiotensin-converting enzyme. Insect Biochem Mol Biol. 1995;25:969–74.

    Article  CAS  PubMed  Google Scholar 

  125. Suiko M, Kurogi K, Hashiguchi T, Sakakibara Y, Liu M-C. Updated perspectives on the cytosolic sulfotransferases (SULTs) and SULT-mediated sulfation. Biosci Biotechnol Biochem. 2017;81:63–72.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  126. Meyer M, Kircher M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb Protoc. 2010;2010:pdb.prot5448.

  127. Aury J-M, Cruaud C, Barbe V, Rogier O, Mangenot S, Samson G, et al. High quality draft sequences for prokaryotic genomes using a mix of new sequencing technologies. BMC Genomics. 2008;9:603.

    Article  PubMed  PubMed Central  Google Scholar 

  128. Alberti A, Poulain J, Engelen S, Labadie K, Romac S, Ferrera I, et al. Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition. Sci Data. 2017;4: 170093.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  129. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  130. Ranallo-Benavidez TR, Jaron KS, Schatz MC. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun. 2020;11:1432.

  131. Alonge M, Lebeigle L, Kirsche M, Jenike K, Ou S, Aganezov S, et al. Automated assembly scaffolding using RagTag elevates a new tomato system for high-throughput genome editing. Genome Biol. 2022;23:258.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  132. Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci. 2020;117:9451–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  133. Kent WJ. BLAT—the BLAST-like alignment tool. Genome Res. 2002;12:656–64.

    CAS  PubMed  PubMed Central  Google Scholar 

  134. Birney E, Clamp M, Durbin R. GeneWise and Genomewise. Genome Res. 2004;14:988–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  135. Rispe C, Hervet C, de la Cotte N, Daveu R, Labadie K, Noel B, et al. Transcriptome of the synganglion in the tick Ixodes ricinus and evolution of the cys-loop ligand-gated ion channel family in ticks. BMC Genomics. 2022;23:463.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  136. Mott R. EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA. Comput Appl Biosci. 1997;13:477–8.

    CAS  PubMed  Google Scholar 

  137. Dubarry M, Noel B, Rukwavu T, Farhat S, Silva CD, Seeleuthner Y, et al. Gmove a tool for eukaryotic gene predictions using various evidences. F1000Research. 2016;5.

  138. Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014;30:1236–40.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  139. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.

    Article  PubMed  PubMed Central  Google Scholar 

  140. Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G, et al. JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 2016;17:66.

    Article  PubMed  PubMed Central  Google Scholar 

  141. Holmer R, van Velzen R, Geurts R, Bisseling T, de Ridder D, Smit S. GeneNoteBook, a collaborative notebook for comparative genomics. Bioinformatics. 2019;35:4779–81.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  142. Buchfink B, Reuter K, Drost H-G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat Methods. 2021;18:366–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  143. Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol Biol Evol. 2021;38:5825–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  144. Götz S, García-Gómez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, et al. High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 2008;36:3420–35.

    Article  PubMed  PubMed Central  Google Scholar 

  145. Dunn NA, Unni DR, Diesh C, Munoz-Torres M, Harris NL, Yao E, et al. Apollo: Democratizing genome annotation. PLOS Comput Biol. 2019;15: e1006790.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  146. Drillon G, Carbone A, Fischer G. SynChro: a fast and easy tool to reconstruct and visualize synteny blocks along eukaryotic chromosomes. PLoS ONE. 2014;9: e92621.

    Article  PubMed  PubMed Central  Google Scholar 

  147. Anand L, Rodriguez Lopez CM. ChromoMap: an R package for interactive visualization of multi-omics data and annotation of chromosomes. BMC Bioinformatics. 2022;23:33.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  148. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38:4647–54.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  149. Kriventseva EV, Kuznetsov D, Tegenfeldt F, Manni M, Dias R, Simão FA, et al. OrthoDB v10: sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs. Nucleic Acids Res. 2019;47:D807–11.

    Article  CAS  PubMed  Google Scholar 

  150. Miele V, Penel S, Duret L. Ultra-fast sequence clustering from similarity networks with SiLiX. BMC Bioinformatics. 2011;12:116.

    Article  PubMed  PubMed Central  Google Scholar 

  151. Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  152. Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20:238.

    Article  PubMed  PubMed Central  Google Scholar 

  153. Kuraku S, Zmasek CM, Nishimura O, Katoh K. aLeaves facilitates on-demand exploration of metazoan gene family trees on MAFFT sequence alignment server with enhanced interactivity. Nucleic Acids Res. 2013;41:W22–8.

    Article  PubMed  PubMed Central  Google Scholar 

  154. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25:1972–3.

    Article  PubMed  PubMed Central  Google Scholar 

  155. Lemoine F, Gascuel O. Gotree/Goalign: toolkit and Go API to facilitate the development of phylogenetic workflows. NAR Genomics Bioinforma. 2021;3:lqab075.

  156. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37:1530–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  157. Mendes FK, Vanderpool D, Fulton B, Hahn MW. CAFE 5 models variation in evolutionary rates among gene families. Bioinformatics. 2021;36:5516–8.

    Article  PubMed  Google Scholar 

  158. Revell LJ. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol. 2012;3:217–23.

    Article  Google Scholar 

  159. Paradis E, Schliep K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics. 2019;35:526–8.

  160. Kanehisa M, Sato Y. KEGG Mapper for inferring cellular functions from protein sequences. Protein Sci. 2020;29:28–35.

    Article  CAS  PubMed  Google Scholar 

  161. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  162. Charrier NP, Couton M, Voordouw MJ, Rais O, Durand-Hermouet A, Hervet C, et al. Whole body transcriptomes and new insights into the biology of the tick Ixodes ricinus. Parasit Vectors. 2018;11:364.

    Article  PubMed  PubMed Central  Google Scholar 

  163. Kim D, Paggi JM, Park C, Bennett C, Salzberg SL. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat Biotechnol. 2019;37:907–15.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  164. Jun G, Wing MK, Abecasis GR, Kang HM. An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res. 2015;25:918–25.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  165. Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. GigaScience. 2021;10:giab008.

  166. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32:3047–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  167. Liao Y, Smyth GK, Shi W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 2019;47: e47.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  168. Yu G, Smith DK, Zhu H, Guan Y, Lam TT-Y. ggtree: an r package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol Evol. 2017;8:28–36.

  169. Wickham H. ggplot2. Cham: Springer International Publishing; 2016.

    Book  Google Scholar 

  170. Nawrocki EP, Eddy SR. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics. 2013;29:2933–5.

  171. Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, et al. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 2021;49:D192-200.

    Article  CAS  PubMed  Google Scholar 

  172. Chan PP, Lowe TM. tRNAscan-SE: searching for tRNA genes in genomic sequences. Methods Mol Biol Clifton NJ. 2019;1962:1–14.

    Article  CAS  Google Scholar 

  173. Aparicio-Puerta E, Lebrón R, Rueda A, Gómez-Martín C, Giannoukakos S, Jaspez D, et al. sRNAbench and sRNAtoolbox 2019: intuitive fast small RNA profiling and differential expression. Nucleic Acids Res. 2019;47:W530–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  174. Kang Y-J, Yang D-C, Kong L, Hou M, Meng Y-Q, Wei L, et al. CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Res. 2017;45:W12–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  175. Pertea G, Pertea M. GFF Utilities: GffRead and GffCompare. 2020.

  176. Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–2.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  177. Blum M, Chang H-Y, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, et al. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 2021;49:D344–54.

    Article  CAS  PubMed  Google Scholar 

  178. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.

    Article  CAS  PubMed  Google Scholar 

  179. Lee E, Helt GA, Reese JT, Munoz-Torres MC, Childers CP, Buels RM, et al. Web Apollo: a web-based genomic annotation editing platform. Genome Biol. 2013;14:R93.

    Article  PubMed  PubMed Central  Google Scholar 

  180. Almagro Armenteros JJ, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, et al. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol. 2019;37:420–3.

  181. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539.

    Article  PubMed  PubMed Central  Google Scholar 

  182. Lefort V, Desper R, Gascuel O. FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Mol Biol Evol. 2015;32:2798–800.

  183. Ribeiro JM, Hartmann D, Bartošová-Sojková P, Debat H, Moos M, Šimek P, et al. Blood-feeding adaptations and virome assessment of the poultry red mite Dermanyssus gallinae guided by RNA-seq. Commun Biol. 2023;6:517.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  184. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46:W296-303.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  185. Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31.

    Article  PubMed  PubMed Central  Google Scholar 

  186. Hoy MA, Waterhouse RM, Wu K, Estep AS, Ioannidis P, Palmer WJ, et al. Genome sequencing of the phytoseiid predatory mite Metaseiulus occidentalis reveals completely atomized Hox genes and superdynamic intron evolution. Genome Biol Evol. 2016;8:1762–75.

    Article  PubMed  PubMed Central  Google Scholar 

  187. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.

  188. Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. 2006;55:539–52.

    Article  PubMed  Google Scholar 

  189. Giardine B, Riemer C, Hardison RC, Burhans R, Elnitski L, Shah P, et al. Galaxy: a platform for interactive large-scale genome analysis. Genome Res. 2005;15:1451–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  190. Keller O, Odronitz F, Stanke M, Kollmar M, Waack S. Scipio: Using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics. 2008;9:278.

    Article  PubMed  PubMed Central  Google Scholar 

  191. Lefort V, Longueville J-E, Gascuel O. SMS: smart model selection in PhyML. Mol Biol Evol. 2017;34:2422–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  192. Anisimova M, Gil M, Dufayard J-F, Dessimoz C, Gascuel O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst Biol. 2011;60:685–99.

    Article  PubMed  PubMed Central  Google Scholar 

  193. Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13:1194–202.

    Article  CAS  PubMed  Google Scholar 

  194. Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–74.

    Article  CAS  PubMed  Google Scholar 

  195. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14:587–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  196. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: improving the ultrafast bootstrap approximation. Mol Biol Evol. 2018;35:518–22.

    Article  CAS  PubMed  Google Scholar 

  197. Letunic I, Bork P. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. 2019;47:W256–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  198. Ixodes persulcatus (tick) 10X Genomics reads. NCBI Project accession: PRJEB67779 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67779] (2024)

  199. Ixodes persulcatus (tick) genome assembly. NCBI Project accession: PRJEB67789 [https://www.ncbi.nlm.nih.gov/bioproject/PRJEB67789/] (2024)

  200. Ixodes hexagonus (tick) 10X Genomics reads. NCBI Project accession: PRJEB67780 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67780] (2024)

  201. Ixodes hexagonus (tick) genome assembly. NCBI Project accession: PRJEB67790 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67790] (2024)

  202. Ixodes pacificus (tick) 10X Genomics reads. NCBI Project accession: PRJEB67781 https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67781] (2024)

  203. Ixodes pacificus (tick) genome assembly. NCBI Project accession: PRJEB67791 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67791] (2024)

  204. Ixodes ricinus (tick) 10X Genomics and Hi-C reads. NCBI Project accession: PRJEB67782 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67782] (2024)

  205. Ixodes ricinus (tick) genome assembly. NCBI Project accession: PRJEB67792 [https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB67792] (2024)

Download references

Acknowledgements

Thanks to Olivier Lambert from the Centre Vétérinaire de la Faune Sauvage et des Écosystèmes (CVFSE) at Oniris, Nantes, France, for providing rescued hedge-hogs, on which we sampled I. hexagonus ticks, to Joyce Kleinjan (U. Berkeley, USA) for providing I. pacificus samples, to the Genotoul platform for access to its bioinformatics and calculation resources, to Fabrice Legeai (INRAE, BIPAA) for help with the Apollo database, to Céline Baulard from the Centre National de Recherche en Génomique Humaine (CNRGH) at Evry, France, for providing help with the preparation of 10X libraries, and to two reviewers who helped improving the manuscript.

Funding

This work was supported by the Genoscope, the Commissariat à l’Énergie Atomique et aux Énergies Alternatives (CEA), and France Génomique (ANR-10-INBS-09–08). JP was funded by the Czech Science Foundation 22-18424 M.

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: A.C.A, O. P., J.-M. A, and C. R.; methodology: A.C.A, B. N., A. B., M. B., R. O., J.-F. D., T. C., M. M., F. H., G. L. G., J. M. M., M. H., L. S., F. K., P. K., J.-M. A, and C. R; software: A.C.A; analysis and visualization: A.C.A, B. N., A. B., K. L., M. B., N. T., B. I., S. K., C. C., T. C., M. M., F. H., G. L. G., V. M., M. A. J., J. C., M. K., J. M. M., L. S., F. K., P. K., J. P., J.-M. A, and C. R.; resources: A. B., M. B., M. V., C. H., O. P., A. Z.-N., and P. W.; funding acquisition: O. P., P. W. and C. R.; supervision: C. R. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Claude Rispe.

Ethics declarations

Ethics approval and consent to participate

The use of vertebrate animals (mice and rabbits) to maintain the I. ricinus colony at the University of Neuchâtel was performed following the Swiss legislation on animal experimentation. The commission that is part of the “Service de la Consommation et des Affaires Vétérinaires (SCAV)” of the canton of Vaud, Switzerland, evaluated and approved the ethics of this part of the study. The SCAV of the canton of Neuchâtel, Switzerland, issued the animal experimentation permit (NE05/2014).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

12915_2025_2121_MOESM1_ESM.zip

Additional file 1: Tables S1-S26. Table S1- Metrics of gene prediction. Table S2- Changes due to curation. Table S3- Numbers of genes and gene clusters in Chelicerata genomes. Table S4- Contingency Table for gene clustering among Chelicerata genomes. Table S5- Proportions of monoexonic genes. Table S6- Expanded gene families in tick genomes. Table S7- Cystatins inI. ricinus. Table S8- Iripins and serpins in I. ricinus. Table S9: Heme biosynthesis pathway in Chelicerata. Table S10: Chemosensory receptor genes. Table S11: Gustatory receptors in I. ricinus. Table S12: Ionotropic receptors in I. ricinus. Table S13: Defensins. Table S14: Counts for detoxification genes. Table S15: Cytochrome P450s. Table S16: Carboxylesterases. Table S17: Glutathione S-transferases. Table S18: Cytosolic sulfotransferases. Table S19: ATP-binding cassette transporters. Table S20: Metrics of sequencing data. Table S21: List of genome sequences analyzed. Table S22: List of RNA-Seq datasets. Table S23: Atlas of expression for I. ricinus (counts in TPM). Table S24: Structural and regulatory non-coding elements. Table S25: Neuropeptides. Table S26: Kunitz-domain proteins in I. ricinus.

12915_2025_2121_MOESM2_ESM.docx

Additional file 2: Figs. S1-S20. Fig. S1: K-mer distribution and estimation of genome sizes and heterozygosity four four Ixodes species. Fig. S2: Synteny between the genome assembly of I. ricinus and two other tick species. Fig. S3: Phylogenetic tree of five species of the genus Ixodes. Fig. S4: Phylogenetic tree of cystatins. Fig. S5: Heatmap of expression of cystatins. Fig. S6: KEGG map of the porphyrin metabolism pathway (heme). Fig. S7: Evolution of tick chemosensory proteins. Fig. S8: Phylogenetic tree of conserved ionotropic receptors. Fig. S9: Phylogenetic tree of defensins. Fig. S10: Phylogeny and expression heatmap of glutathione S-transferases. Fig. S11: Comparison among three clustering methods, for two multigenic families of I. ricinus. Fig. S12: Predicted structures of I. ricinus neuropeptide precursors. Fig. S13: Evolution of the M13 metalloproteases in the genome of I. ricinus. Fig. S14: Phylogenetic trees of the Kunitz domain-containing proteins. Fig. S15: Phylogenetic tree of Cytochrome P450s. Fig. S16: Heatmap of expression levels for Cytochrome P450s. Fig. S17: Phylogenetic tree of Carboxylesterases. Fig. S18: Heatmap of expression levels for Carboxylesterases. Fig. S19: Phylogenetic tree of ATP-binding cassette transporters. Fig. S20: Heatmap of expression levels for ATP-binding cassette transporters.

12915_2025_2121_MOESM3_ESM.docx

Additional file 3: Supplementary Results. Structural and regulatory non-coding elements; Neuropeptides in the I. ricinus genome; Metallopeptidases; Kunitz-domain proteins; Cytochrome P450s; Carboxylesterases; ATP-binding cassette transporters; Expansion of juvenile hormone acid methyltransferases [27–45].

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Cerqueira de Araujo, A., Noel, B., Bretaudeau, A. et al. Genome sequences of four Ixodes species expands understanding of tick evolution. BMC Biol 23, 17 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12915-025-02121-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s12915-025-02121-1

Keywords