Molecular characterization and genetic divergence of seven Culex mosquito (Diptera: Culicidae) species using Mt COI gene from Odisha State, India

Culex mosquitoes are involved in the transmission of arboviral diseases worldwide. Knowledge of mosquito bio-ecology and accurate species identification are of paramount importance for developing species-specific vector control strategies. Molecular techniques make possible genetic, species-specific approaches that reduce the burden of vector-borne diseases. In the present study, the mitochondrial COI gene of Culex mosquitoes was used for molecular identification in addition to morpho-taxonomy. Our findings indicated the presence of important Culex mosquito vectors, viz., Culex vishnui, Culex tritaeniorhynchus, Culex bitaeniorhynchus, Culex quinquefasciatus, Culex gelidus, Culex fuscocephala, and Culex fuscanus, in the southern part of Odisha state, India. We examined the phylogeny and genetic diversity of these seven Culex species from different geographical locations. The average intra-specific K2P distance of the COI gene was 0.9%. Further, to measure the diversity of the Culex populations among different geographical strains, haplotype diversity and nucleotide diversity were compared. In the intra-population polymorphism analysis of COI sequences, Culex fuscanus showed high polymorphism and a high nucleotide diversity (Pi = 0.013), whereas Culex quinquefasciatus showed the lowest nucleotide diversity (Pi = 0.0013). Similarly, haplotype diversity (Hd) was highest in Culex gelidus and Culex fuscocephala (0.972) and lowest in Culex quinquefasciatus (0.583). A haplotype network was constructed to establish the genealogical relationship between haplotypes. The phylogenetic tree produced distinctive conspecific clusters for the different Culex species. The population genetic analysis illustrated the occurrence of genetic differentiation within the populations. The findings of this study add to the evidence that DNA barcode sequences can be used to monitor mosquito species diversity.
This study also adds valuable information about the systematics and molecular biology of seven mosquito species of public health importance that act as significant vectors of Japanese encephalitis in various Asian countries. This information can further be used for the effective implementation of region-specific vector control strategies.

The family Culicidae is a large and abundant group that occurs throughout temperate and tropical regions of the world, and well beyond the Arctic Circle. Culicinae is the largest subfamily of Culicidae, with 3,081 species in 110 genera grouped into 11 tribes. Approximately 3,554 mosquito species are reported globally (Harbach, 2018), comprising a total of 112 genera, of which nearly three-quarters are native to the humid tropics and subtropics. In terms of mosquito faunal diversity, India ranks fifth globally (Foley et al., 2007) and has 393 species in 49 genera and 41 subgenera (Bhattacharyya et al., 2014), of which the online systematic catalog of Culicidae includes 356 species (Gaffigan et al., 2014).
Culex is one of the largest and most important groups, with 770 species divided among 26 subgenera (Harbach, 2016). Its members act as prime vectors of the pathogens that cause diseases such as West Nile fever, St. Louis encephalitis, Japanese encephalitis, equine encephalitis, acute encephalitis syndrome, and filariasis. Therefore, effective vector control, early detection, and ongoing vector surveillance are crucial in the battle against these diseases. The success of vector control activities, in turn, depends on the accurate identification of the targeted mosquito species, along with a proper understanding of their biology and ecology. In addition, many taxonomists also consider combined behavioral and population biology data when identifying and classifying a species.
Mosquito vector recognition has traditionally relied on the description and study of morphological features. However, mosquito species identification has become more challenging in recent years due to a substantial decrease in the number of taxonomists and other professional experts in species identification. Furthermore, the core morphological characters of mosquito vectors are often damaged during collection and storage, while some are not present in all developmental stages. Besides this, the morphological features used to distinguish adult specimens differ so little between species that normally only professional mosquito taxonomists can tell them apart.
Taking this into consideration, several researchers have sought new approaches to optimize and improve species discrimination and make it more accessible to non-experts. In addition, sophisticated genetic and biotechnological tools should be used to obtain a more accurate and complete picture of mosquito diversity in a study region (Attaullah et al., 2021). A significant breakthrough in DNA sequencing technology has encouraged researchers to explore biodiversity by identifying and cataloguing organisms with a convenient and productive DNA analysis tool termed DNA barcoding. This tool has served as a platform that has not only improved access to taxonomic information but also reinforced collaboration among all those interested in recording, studying, and restoring biodiversity. In recent times, DNA barcoding has been employed to compare and differentiate closely related species by amplifying and sequencing a short conserved region of DNA. This technique has been utilized in various taxonomic investigations involving dipteran taxa (Pramual et al., 2011; Stahls et al., 2009), including mosquito vectors (Gonzalez et al., 2010; Laboudi et al., 2011; Ruiz et al., 2010; Ruiz-Lopez et al., 2012). Molecular taxonomic studies on local strains from Canada, India, Pakistan, the Persian Gulf region, and China (Ashfaq et al., 2014; Azari-Hamidian et al., 2010; Cywinska et al., 2006; Kumar et al., 2007; Wang et al., 2012) have demonstrated the use of the DNA barcoding technique in the family Culicidae.
DNA barcoding analysis uses a short DNA segment that is universally available in the target lineages and has enough sequence variation to differentiate species (Hebert et al., 2003). The mitochondrial cytochrome oxidase subunit I gene (mt COI) can be used to distinguish individual vertebrate and invertebrate species (Folmer et al., 1994). Barcoding with the COI gene is also suggested as a standard for cryptic taxa discovery, the association of different life stages of the same species, and wildlife conservation genetics, in addition to the identification of recognized and new species (Trivedi et al., 2016). The phylogenetic signal of COI tends to be stronger than that of the other mitochondrial genes (Strüder-Kypke & Lynn, 2010).
Integrating conventional taxonomy with molecular DNA barcoding techniques could be an important approach for assessing insect vectors and their genetic variation. Further, genetic variation and population structure are significant aspects of the population genetics of mosquito vectors. Comparative analyses of these aspects can therefore help to elucidate the factors that influence population levels. Several factors, such as climate change, ecological conditions, natural barriers, anthropogenic activity, and migration, can influence the genetic diversity of a species. Therefore, understanding the diversity and distribution of mosquito species, together with their population genetics, is essential for the control of mosquito vectors and mosquito-borne diseases.
Panda and Barik, The Journal of Basic and Applied Zoology (2022) 83:41
Odisha state of India reported about 261 acute encephalitis syndrome (AES) cases in 2021, about 4.5% of the total cases in India, with one death. Likewise, it reported 18 Japanese encephalitis cases in 2021, about 2.4% of the total cases in India (NVBDCP, 2021). Out of the thirty districts of this state, twenty are reported as endemic for lymphatic filariasis (NVBDCP, 2021). Earlier, Hazra and Dash (1998) reported 21 species belonging to the Culicinae subfamily from the coastal region of Odisha state. Rajavel et al. (2005a, 2005b) reported 74 species belonging to 12 genera from the Jeypore hill tracts. Similarly, Dash and Hazra (2011) reported 22 species under 6 genera from the Puri and Khurda districts of Odisha. Twenty-four Anopheles species were reported from the Ganjam district (Dash, 2014). Relatively high humidity and the availability of stagnant water bodies throughout the year in the southern districts provide suitable breeding sites for mosquitoes. The mosquito faunal diversity of this region is therefore of particular interest. Various changes in the ecological environment caused by massive deforestation, periodic cyclones, and widespread use of insecticides have occurred over the last three decades. However, there are insufficient recent data on the Culex mosquito diversity of the southern districts of Odisha state, and we therefore chose this region as our study area. DNA barcodes and genetic variation analysis of the Culex mosquito species found in the studied region will aid in the detection and surveillance of mosquito vectors, assisting in the control of mosquito-borne disease outbreaks. The main objective of this study was to use both morpho-taxonomy and molecular taxonomy to assess the Culex faunal diversity of different regions of the southern districts of Odisha state.
Furthermore, using a single mitochondrial COI marker gene, the efficiency of DNA barcoding in quickly identifying and analyzing genetic variation among seven Culex mosquito species was examined.

Collection and rearing of mosquitoes
Different developmental stages of the mosquito species were collected from various breeding sites in the southern districts of Odisha state (Fig. 1 and Table 1). Preference was given to study sites with earlier records of epidemic conditions. Mosquitoes were collected through various standard methods and reared in the insectary of the Applied Entomology Laboratory, Post Graduate Department of Zoology, Berhampur University. The immature stages were reared to adults in the insectary following a standard protocol. The adult mosquitoes were used for morpho-taxonomy followed by molecular taxonomy.

Morphological identification
Morphology-based identification of adult mosquitoes was carried out using the identification keys of Barraud (1934), Reuben et al. (1994) and Tyagi et al. (2015). After morphological identification, the mosquito samples were vouchered and stored in the laboratory. Further, to confirm the morphology-based identification, genomic DNA isolation and gene amplification were performed for molecular taxonomy.

Genomic DNA isolation and quantification
DNA was isolated from the morphologically identified adult mosquitoes by the Bender buffer method (Collins et al., 1987). The quality and quantity of the isolated DNA were assessed by 1% agarose gel electrophoresis and with a Nanodrop spectrophotometer (Thermo Scientific, USA), respectively.

Gene amplification and sequencing
The extracted DNA from all the specimens was subjected to PCR amplification using universal primers as suggested by Folmer et al. (1994), targeting the mitochondrial COI gene. The primers used for COI gene amplification are: The reaction mixture for COI gene amplification consisted of 1X PCR buffer, 0.5 U Taq DNA polymerase, 2.5 mM MgCl2, 200 µM dNTPs, 10 pmol of each primer, and 100 pmol template DNA, made up to a total volume of 25 µl. The thermo-cycling program for the COI gene consisted of 95 ℃ for 5 min, followed by 35 cycles of 95 ℃ for 30 s, 51 ℃ for 30 s and 72 ℃ for 1 min. A final extension step of 7 min at 72 ℃ was added after cycling. The amplicons were resolved in a 1.5% agarose gel using 1X Tris-Acetate-EDTA (TAE) buffer. The PCR products were sequenced commercially by the Sanger method using an ABI Prism 3730XL Big Dye Terminator V3.1 cycle sequencer (Applied Biosystems, USA).
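The thermo-cycling program described above can be written down as structured data, which makes the step durations easy to check. The following is a minimal sketch only: the variable and function names are hypothetical, and the total counts programmed hold times only, ignoring instrument ramp times, which are not given in the text.

```python
# Thermo-cycling program for COI amplification, as stated in the text.
# Names are illustrative; ramp times between temperatures are NOT included.
INITIAL_DENATURATION_S = 5 * 60   # 95 C for 5 min
CYCLES = 35
CYCLE_STEPS = [
    ("denaturation", 95, 30),     # (step name, temperature in C, seconds)
    ("annealing", 51, 30),
    ("extension", 72, 60),
]
FINAL_EXTENSION_S = 7 * 60        # 72 C for 7 min


def programmed_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding ramping."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    total = programmed_hold_seconds()
    print(f"{total} s ({total / 60:.0f} min) of programmed hold time")
```

Under these assumptions the program holds for 4920 s (82 min) in total; the actual run time on an instrument would be longer because of ramping.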

Sequence data analysis
The trace files of the generated COI sequences were trimmed and assembled using Geneious Prime 2020.1.1 (www.geneious.com) software, and sequences of low quality were excluded from data analysis. The generated nucleotide sequences from each species were compared with the publicly available barcode sequences on NCBI using the BLASTn tool. The final assembled good-quality sequences were submitted to NCBI to obtain accession numbers. A total of 68 Culex COI nucleotide sequences (including the sequences generated in the present study) of all the studied species were retrieved from the GenBank database and used for data analysis. To understand the evolutionary relationships among the studied Culex mosquito species, the generated sequences were subjected to phylogenetic analysis.

Genetic divergence
The Clustal W algorithm implemented in the software package MEGA X was used for the multiple sequence alignment (MSA). The transition/transversion (ts/tv) bias (R) and the intra-specific and inter-specific pairwise sequence divergences were inferred using the Tamura 3-parameter (T92) and Kimura 2-parameter (K2P) models (Kimura, 1980), respectively. Furthermore, the 68 nucleotide sequences of the studied Culex species were used to determine the intra-population polymorphism using DnaSP v6.12.03, and the number of variable sites, nucleotide diversity, haplotype diversity, and parsimony-informative sites were also analyzed.
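For readers unfamiliar with the K2P model, the pairwise distance it yields can be computed directly from the proportions of transitional (P) and transversional (Q) differences between two aligned sequences, as d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below is an illustration of that standard formula, not the MEGA X implementation; the function name and the pairwise handling of gaps and ambiguous bases are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with a gap or ambiguous base in either
        # sequence are skipped for this pair.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy sequences for illustration only (not data from this study).
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

Averaging such pairwise values within a species and between species gives intra-specific and inter-specific divergence estimates of the kind reported in this study.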

Haplotype network
Haplotype networks of all the studied Culex mosquito species were constructed from the number of COI nucleotide differences, using the median-joining network algorithm in PopART software, to determine the genealogical inter-relationship between haplotypes.

Phylogenetic analysis
The Maximum-Likelihood (ML) tree of all the studied mt COI gene sequences was constructed in MEGA X. The support for nodes in the ML analysis was assessed by bootstrapping with 1000 replicates. The substitution pattern is considered to be best described by the model with the lowest Bayesian Information Criterion (BIC) value (Nei & Kumar, 2000). The T92+G model (Tamura 3-parameter model with a gamma distribution parameter) was adopted for the studied COI gene sequences. The 1st, 2nd, 3rd, and noncoding codon positions were included in the analysis, and all positions containing gaps and missing data were eliminated. The phylogenetic analysis included 68 Culex COI nucleotide sequences (both those generated during the current study and those retrieved from the database). A sequence from Armigeres subalbatus was used as the out-group.
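The model-selection rule mentioned above can be illustrated with the general definition BIC = -2 lnL + k ln(n), where lnL is the maximized log-likelihood, k the number of free parameters, and n the sample size. The sketch below is an assumption-laden illustration: the function names are hypothetical, the candidate values are made-up placeholders rather than results of this study, and how MEGA X counts k and n is not stated in the text.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; lower values indicate a better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def best_model(candidates: dict, n_obs: int) -> str:
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT taken from the study.
    hypothetical = {
        "model_a": (-100.0, 2),
        "model_b": (-99.0, 10),
    }
    print(best_model(hypothetical, n_obs=100))
```

In this toy case the simpler model wins because its slightly lower likelihood is outweighed by the penalty on the extra parameters of the richer model.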

Morpho-taxonomy based identification
Both mature and immature stages of mosquito species were collected, and the immature stages were reared to adults for identification. The collected adult mosquitoes were identified directly by morpho-taxonomy, and the identification was further confirmed by molecular taxonomy using DNA barcoding. We identified seven different Culex mosquito species: Culex vishnui, Culex tritaeniorhynchus, Culex bitaeniorhynchus, Culex quinquefasciatus, Culex gelidus, Culex fuscocephala, and Culex fuscanus (which is a homotypic synonym of Lutzia fuscana).

Computational analysis of sequence data
A total of seven generated COI sequences, one from each of the seven Culex mosquito species, were used in the current study. The generated sequences were found to be AT-rich, with an average AT content of 69.2% and an average GC content of 30.8% (Table 2). Further, none of the generated sequences contained an insertion, deletion, or stop codon, which supports their mitochondrial origin. The transition/transversion bias (R) of the seven Culex mosquito species was found to be 0.81. The value for transitions between C and T was equal to that between A and G (16.97). The value for transversions between A and T (8.91) was higher than that between G and C (3.90), which might be due to the high A+T content of mitochondrial nucleotide sequences.
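The AT and GC percentages reported above follow from a simple base count. A minimal sketch of that calculation is given below; the function name is hypothetical, and it assumes that gaps and ambiguous bases are excluded from the total.

```python
def at_gc_content(sequence: str) -> tuple:
    """Return (AT %, GC %) for a DNA sequence.

    Assumption: only unambiguous A, C, G and T are counted, so gaps and
    ambiguity codes do not contribute to the total.
    """
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    at_percent = 100 * (counts["A"] + counts["T"]) / total
    return at_percent, 100 - at_percent


if __name__ == "__main__":
    # Toy sequence for illustration only (not data from this study).
    at, gc = at_gc_content("AATG")
    print(f"AT = {at:.1f}%, GC = {gc:.1f}%")
```

Applied to each barcode sequence and averaged, this yields composition values of the kind summarized in Table 2.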

The pattern of genetic divergence
The genetic divergence among the selected species of Culex mosquitoes was determined, and all the species displayed discriminative estimates of inter-specific and intra-specific divergence. The intra-specific K2P distance of the COI gene was < 2% (0.02), with an average of 0.9% (0.009), as shown in Table 3. The maximum mean intra-specific distance was observed in Culex fuscanus (1.5%), followed by Cx. fuscocephala (1.2%), while Cx. quinquefasciatus had the minimum intra-specific K2P distance of 0.2% (0.002). The inter-specific K2P genetic distances of the COI gene among all the species were more than 2% (0.02), with an average of 8.1% (0.081). The highest K2P distance was observed between Cx. gelidus and Cx. fuscocephala (10.3%) and the lowest between the sister species Cx. tritaeniorhynchus and Cx. vishnui (5.6%) (Table 3). Haplotype diversity (Hd) and nucleotide diversity (Pi) were used to study the degree of genetic diversity (haplotype differentiation and nucleotide sequence variation) within and between populations, using the 68 COI sequences. Among the studied Culex mosquito species, the intra-population polymorphism analysis of COI sequences showed high polymorphism and a high nucleotide diversity (0.013) in Culex fuscanus. A moderate Pi was observed in Culex bitaeniorhynchus (0.01), and Culex quinquefasciatus showed the lowest Pi (0.0013). Similarly, the highest haplotype diversity (Hd) was found in Culex gelidus and Culex fuscocephala, with a value of 0.972, and Culex quinquefasciatus showed the lowest haplotype diversity (0.583) (Table 4).
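The two diversity indices compared above have simple definitions. Haplotype diversity is commonly computed as Hd = n/(n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences, and nucleotide diversity as the mean number of pairwise differences per site. The sketch below illustrates those standard definitions; it is not the DnaSP implementation, the function names are hypothetical, and it assumes gap-free aligned sequences of equal length.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences: list) -> float:
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    freqs = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences: list) -> float:
    """Mean number of pairwise nucleotide differences per site (pi)."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return differences / (len(pairs) * length)


if __name__ == "__main__":
    # Toy alignment for illustration only (not data from this study).
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]
    print(round(haplotype_diversity(toy), 3), round(nucleotide_diversity(toy), 3))
```

A population can therefore combine a high Hd with a low Pi when it contains many distinct haplotypes that differ from one another at only a few sites.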

Haplotype network analysis
For haplotype network analysis, the median-joining network method was applied based on the similarities and differences in the COI gene sequences of all seven Culex species (Fig. 2). Each sphere in the network diagram represents a distinct haplotype, and this number is consistent with the result obtained from the haplotype numbers shown in   pattern of haplotype network. There is no noticeable connection between specific haplotypes and geographic locations among the populations of these species, indicating high sequence variation among the haplotypes. The lines in the diagram are proportional to the number of mutations, and the cross lines indicate the rate of mutation (Fig. 2).

Phylogenetic analysis
All seven identified Culex mosquito species were grouped into distinct conspecific clades in the ML tree based on COI sequences (Fig. 3). The tree comprises seven distinctive conspecific clusters, each consisting of the haplotypes of the respective species. All of these clades were strongly supported, with a minimum bootstrap value of 98%. The highest log-likelihood value of the ML tree was −3845.61. The nucleotide sequence of Armigeres subalbatus was taken as the root of the phylogenetic tree.

Discussion
Culex mosquito species are the principal vectors of the pathogens that cause lymphatic filariasis, West Nile fever, St. Louis encephalitis, Japanese encephalitis, etc. Culex mosquitoes are also reported as potential vectors of Rift Valley fever in Africa (Kenawy et al., 2018) and are suspected to be vectors of the Zika virus in Brazil (Benelli & Romano, 2017; Guedes et al., 2017; Guo et al., 2016). In 2016, a JE virus outbreak was reported from a southern district (Malkangiri) of Odisha state, and the JE virus was detected in the Cx. vishnui mosquito vector from the affected villages of that area (Sahu et al., 2018). Thus, knowledge of the correct identity, distribution, and bioecology of the species is of paramount importance for programs to control and prevent diseases transmitted by mosquito vectors, as it allows us to concentrate only on the specific mosquito species that spread certain diseases (Murugan et al., 2016). Even though morpho-taxonomy is regarded as the gold standard method for discriminating mosquito species, identifying field-collected mosquitoes can be quite difficult, as they may lose some of their important identifying features during handling. For this reason, there is a need for an alternative identification technique, especially for cryptic and ambiguous species. The DNA barcoding technique has proven to be a reliable system of identification. Molecular data are also widely used in molecular phylogenetic, phylogeographic, population genetic, and species identification studies. Genetic variation within a species provides knowledge about the origin and migration of the species and supports vector surveillance and vector control strategies.
Molecular phylogeny is further utilized for the analysis of gene duplication, for estimating rates of diversification, polymorphism, recombination, and population dynamics, and for inferring organismal phylogenies in combination with other data sources. Earlier studies have shown that the mitochondrial COI marker reveals more biodiversity and higher species richness than conventional taxonomic tools by uncovering undescribed and cryptic species (Hebert et al., 2003; Schmidt et al., 2015; Wilson et al., 2017), and it has also been used to infer the phylogeny of various dipteran taxa within the genera Aedes, Anopheles and Culex (Ashfaq et al., 2014; Chan-Chable et al., 2019; Panda et al., 2021; Weeraratne et al., 2018). In the present study, we employed DNA barcoding with a single mt COI gene sequence for seven different species of Culex mosquito to re-confirm the species identified by morpho-taxonomy. We observed that the generated COI sequences were AT-rich, which agrees with the findings of Cywinska et al. (2006) and Rivera and Currie (2009) and indicates a general consistency across dipteran taxa. The current study compared the generated sequences with sequences from various locations in India and with other globally available sequences in the database. Based on COI sequence sets, a pairwise genetic divergence of 2-3% has been considered the threshold value for discriminating between two different species (Hebert et al., 2003), even for two closely related mosquito species (Chan et al., 2014; Guany et al., 2015; Kumar et al., 2007).
The transition/transversion bias (R) was determined to understand the pattern of DNA sequence evolution, to estimate genetic differences, and to analyze genomic composition as well as mutational positions. In the present study, transitions occurred at a higher frequency than transversions in the mt COI nucleotide sequences, as observed earlier by Fitch and by many other scientists (Fitch, 1967; Lyons & Lauring, 2017). Further, the R-value of the COI gene sequences was greater than the threshold value of 0.5, which indicates there is no such bias in transitional and transversional substitution and both are equally probable.
Nucleotide diversity (π) and haplotype diversity (Hd) are two major indicators used to measure the diversity of species populations among different geographical strains. In this study, Culex fuscanus exhibited many polymorphic sites and mutations, with a high nucleotide diversity (π = 0.013) in its COI sequences, indicating high genetic variation in this population. On the other hand, the lowest π was observed in Cx. quinquefasciatus (0.0013), indicating low genetic diversity in this population, possibly due to population expansion or a selective sweep. Further, the complex pattern of the haplotype network of COI gene sequences in most of the species indicates that there is little association between particular haplotypes and geographic locations. In addition, a high value of haplotype diversity and a certain degree of genetic divergence were observed in these species. Furthermore, to confirm the morphological identification and to determine the taxonomic position of the selected taxa, a phylogenetic analysis was carried out. The Maximum-Likelihood tree is widely used, especially for mosquito species, to illustrate the evolutionary relationships among the targeted groups (Chan et al., 2014; Guany et al., 2015). In the current study, the mt COI-based analysis strongly supports the positioning of all seven species, each of which formed a distinct conspecific cluster with its respective group. Further, the Cx. vishnui and Cx. tritaeniorhynchus sequences were positioned next to each other in the constructed phylogenetic tree, indicating that both belong to the same Culex vishnui subgroup. Similar phylogenetic positioning within the Culex vishnui subgroup was inferred using ITS1 and COI genes by Karthika et al. (2018) and Toma et al. (2000).

Conclusions
The present study describes the Culex mosquito diversity of the southern districts of Odisha state, India. Our study demonstrated the potency of the DNA barcoding technique for mosquito species identification in addition to morpho-taxonomy. This study also adds valuable information about the systematics and molecular biology of seven mosquito species of public health importance that act as significant vectors of Japanese encephalitis in various Asian countries. The generated mt COI sequences could be used as reference nucleotide sequences of the respective haplotypes in future mosquito identification studies and will facilitate conspecific comparisons to reveal the reasons for high intra-specific divergence. Furthermore, different climatic conditions and geographical barriers can be responsible for the genetic variation within a species. This information can further be used for the effective implementation of region-specific vector control strategies.