Computational analyses of amino acid molecules of heat shock protein-70 for elucidating its evolutionary diversity and protein interactions in selected farm animals

Heat shock protein-70 (HSP70) is a protein associated with the response and adaptation to stress, as well as the protection of cells against thermal and oxidative stress in animals. It is an evolutionarily conserved protein, but its expression has reportedly varied. Therefore, this study implemented computational analyses of the amino acid sequences of HSP70 for a better understanding of the evolutionary and protein interaction variations associated with the gene, to facilitate its exploitation for the breeding of animals with increased adaptation to heat stress. The results showed that there is a wide evolutionary distance between humans and the selected farm animals studied, but C. elegans shared a common evolutionary relationship with the farm animals. The sequence identity analysis returned exact matches among the sequences of minimum = 8.09%, maximum = 98.58%, and mean ± SD = 71.03 ± 26.3% across all the species, while the sequence similarities were minimum = 16.49%, maximum = 100%, and mean ± SD = 78.99 ± 24.39%. The global block substitution matrix (BLOSUM62) analysis returned minimum = 0.18, maximum = 0.98, and mean ± SD = 0.62 ± 0.34. The molecular weight of the protein sequences was minimum = 5.70 kDa, maximum = 6.41 kDa, mean = 6.28 kDa, and standard deviation = 0.17 kDa; the isoelectric point was minimum = 4.55, maximum = 7.17, mean = 5.56, and standard deviation = 0.65; and the hydrophobicity was minimum = 45.20 kcal/mol, maximum = 53.02 kcal/mol, mean = 47.81 kcal/mol, and standard deviation = 1.85 kcal/mol. The outcomes of the computational analyses led to the conclusion that variations exist in the conservation of amino acid residues of the gene in the studied farm and non-farm animals, and that this is responsible for the differences and similarities in the expression of the HSP70 gene in different animals. It was also concluded that C. elegans is a suitable model that could be exploited for a better understanding of the response and adaptation to heat stress in duck, chicken, cattle, sheep, and goat when focusing on the regulation and expression of the heat shock protein-70 gene (HSP70).

adaptation to heat stress (Archana et al., 2017). One of the most prominent members of this family is heat shock protein-70 (HSP70), which is responsible for controlling the activity of key signaling proteins involved in stress adaptation by maintaining these proteins in an inactive or active state while regulating their abundance and intracellular transportation (Malyshev, 2013). Heat stress appears to be one of the major climate change factors negatively affecting livestock production; hence, the breeding of thermo-tolerant animals to sustain livestock production is highly desirable (Sejian et al., 2018). This is because climate change is one of the biggest challenges facing livestock production; it is almost inevitable in the tropics because of elevated temperatures, which induce oxidative stress and act as a promoter of heat stress in farm animals (Sikiru et al., 2020).
Meanwhile, heat stress hampers the animals' cellular functions, reduces feed intake, and alters metabolic and digestive activities, negatively affecting reproduction and productivity in terms of growth, and reducing the economic gain of livestock producers (Madhusoodan et al., 2020). However, the expression of genes responsible for the regulation of the animals' response to heat stress varies among organisms, and these variations correlate with the animals' physiological and genetic adaptations to heat stress (Krebs & Feder, 1997). For instance, increased expression of the gene HSP70 was reported in some heat-tolerant breeds of goat, while it was otherwise for buffalo and cattle (Madhusoodan et al., 2020).
Although the expression of these stress response genes occurs naturally whether or not organisms are exposed to heat stress, the patterns of expression vary and are strongly correlated with the resistance of the organism to stress (Mota et al., 2019). Despite all these findings, one of the most significant questions that remains unanswered is the evolutionary mechanism underlying the diversification of these genes (Chen et al., 2018; Krebs & Feder, 1997). Therefore, this study implemented computational analyses of the amino acid sequences of the gene HSP70 retrieved for different farm animals for a better understanding of the evolutionary diversity and protein interaction variations associated with the gene in the selected farm animals, to facilitate its exploitation for the breeding of animals with increased adaptation to heat stress.

Methods
The analysis of the evolutionary diversity associated with the HSP70 protein sequences of the selected farm animals and other organisms investigated was carried out using the Maximum Likelihood method and the Dayhoff matrix-based model (Schwarz & Dayhoff, 1979). The analysis involved the use of the sequences of the amino acid molecules retrieved for the selected organisms.
Fig. 1 The dendrogram of evolutionary relationships among the selected farm animals, humans, and elegans. The evolutionary relationship was built using the Maximum Likelihood method and Dayhoff matrix-based model. A discrete Gamma distribution was used to model the evolutionary rate differences among sites [5 categories (+ G, parameter = 4.8621)]. The tree was drawn to scale, with branch lengths measured in the number of substitutions per site
Sikiru et al. JoBAZ (2021)
There were a total of 1257 positions in the final dataset of the retrieved sequences, which were analyzed with 50 bootstrap replicates using MEGA-X software (Kumar et al., 2018). For the sequence alignment, amino acid sequences for HSP70 were first retrieved from the UniProt protein database for each of the selected organisms, including cattle (Q27975), chicken (P08106), pig (P34930), turkey (G1MSW3), sheep (W5PTR5), goat (Q9TUG3), camel (F5CV63), donkey (A6N8F3), duck (A0A493TMV7), rabbit (G1T1V9), mouse (P17879), rat (P0DMW1), human (P0DMV8), and a non-mammalian species, Caenorhabditis elegans (P09446). The sequences obtained were then subjected to alignment and hierarchical clustering using MultAlin version 5.4.1 (Corpet, 1988). The sequence similarities were also examined using the Sequence Identity and Similarity (SIAS) tool of the Immunomedicine Group of Universidad Complutense de Madrid, Spain, with the default BLOSUM62 scoring matrix. The physicochemical properties of the peptide sequences were determined using PepDraw (White & Wimley, 1998). The functional enrichment and inter-protein analysis was carried out using STRING, a database for the prediction of protein-protein interactions (Szklarczyk et al., 2019; https://string-db.org/), and the Venn diagrams were constructed with the web-based tool available at http://bioinformatics.psb.ugent.be/software/details/Venn-Diagrams.
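The identity and similarity comparison described above can be illustrated with a minimal Python sketch. It is not the SIAS implementation: the function name, the similarity groups, the choice to skip gapped columns, and the two short example sequences are all assumptions made for illustration only.

```python
# Illustrative sketch only; not the SIAS tool used in the study.
# The similarity groups below are an assumed grouping, not SIAS defaults.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                  set("DE"), set("ST"), set("NQ"), set("AG")]


def _similar(a, b):
    """Return True if two residues are identical or share an assumed group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def percent_identity_similarity(seq_a, seq_b, gap="-"):
    """Percent identity and similarity of two pre-aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other denominators are possible and give different percentages).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
        if _similar(a, b):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


# Invented toy alignment, not taken from the retrieved HSP70 sequences.
identity, similarity = percent_identity_similarity("MAK-AIGIDL", "MSKHAVGIDL")
print(f"identity = {identity:.2f}%, similarity = {similarity:.2f}%")
```

Applying such a function to every pair of aligned sequences would give a symmetric matrix of percentages comparable in layout, though not necessarily in values, to the identity and similarity tables.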

Results
This study retrieved amino acid sequences of farm animals including cattle, chicken, pig, turkey, sheep, goat, camel, donkey, rabbit, and duck, compared with non-farm animals including mouse, rat, human, and C. elegans as outliers, giving a total of 14 organisms with their respective amino acid sequences (Additional file 1). Phylogenetic analysis of the evolutionary diversity of the amino acid sequences retrieved in this study indicated that there is a wide evolutionary distance between humans and some of the farm animals studied, while C. elegans shared a common evolutionary relationship with some of the farm animals (Fig. 1).
The phylogenetic tree of evolutionary relationships indicated that all the selected farm animals and their comparative mammalian species (rat, mouse, and human), as well as C. elegans, had a common root of ancestry. However, there is a common ancestor for Sus scrofa (pig), Rattus norvegicus (rat), and Mus musculus (mouse), while Equus asinus (donkey) and Oryctolagus cuniculus (rabbit) had the same ancestry as Homo sapiens (human). All other farm animals investigated, including Meleagris gallopavo (turkey), Camelus dromedarius (camel), Anas platyrhynchos (duck), Gallus gallus (chicken), Bos taurus (cattle), Ovis aries (sheep), and Capra hircus (goat), had the same ancestry. The phylogenetic analysis indicated evolutionary diversity, but both the farm animals and their comparative mammalian species had a common ancestry regarding the heat shock protein-70 amino acid sequences (Fig. 1).
The results also showed that 8 amino acids were conserved across the evolutionary changes, and the conserved residues fell between positions 233 and 449. The multiple sequence alignment (MSA) indicated that Glycine (G), Alanine (A), Leucine (L), and Aspartic acid (D) were consistently conserved across evolution, while Glutamic acid (E), Glutamine (Q), Isoleucine (I), and Valine (V) were conserved with variation across the evolutionary changes (Fig. 2). The sequence identity analysis returned exact character matches among the sequences of min = 8.09%, max = 98.58%, and mean ± SD = 71.03 ± 26.3% across all the species, while the sequence similarities were min = 16.49%, max = 100%, and mean ± SD = 78.99 ± 24.39%. The global block substitution matrix (BLOSUM62) analysis returned min = 0.18, max = 0.98, and mean ± SD = 0.62 ± 0.34. These results are presented in Tables 1, 2, and 3 for the identity, similarity, and BLOSUM62 comparisons, respectively. The physicochemical properties measured for the amino acid sequences were the protein molecular mass, isoelectric point, net electric charge, and hydrophobicity per molar mass of the proteins (Table 4). The molecular weight (Mw) of the protein sequences was min = 5.70 kDa, max = 6.41 kDa, mean = 6.28 kDa, and standard deviation = 0.17 kDa; the isoelectric point (IP) was min = 4.55, max = 7.17, mean = 5.56, and standard deviation = 0.65; and the hydrophobicity was min = 45.20 kcal/mol, max = 53.02 kcal/mol, mean = 47.81 kcal/mol, and standard deviation = 1.85 kcal/mol. While the animals have closely related physicochemical properties, the results of their inter-protein interactions showed that there is a significant variation in the interactomes of HSP70 (Fig. 3a, b).
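As a rough illustration of how minimum, maximum, and mean ± SD summaries of this kind can be derived from a pairwise comparison matrix, the following Python sketch summarizes the off-diagonal values of a symmetric matrix. The function name and the toy matrix are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
from itertools import combinations
from statistics import mean, stdev


def summarize_pairwise(matrix):
    """Summarize the upper triangle (excluding the diagonal) of a
    symmetric pairwise matrix as min, max, mean, and sample SD."""
    n = len(matrix)
    if n < 3:
        # With fewer than 3 sequences there are fewer than 2 pairs,
        # so a sample standard deviation cannot be computed.
        raise ValueError("need at least 3 sequences")
    values = [matrix[i][j] for i, j in combinations(range(n), 2)]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample SD (n - 1 denominator); an assumption
    }


# Invented toy percentages for three sequences; not values from the study.
toy = [
    [100.0, 90.0, 20.0],
    [90.0, 100.0, 40.0],
    [20.0, 40.0, 100.0],
]
summary = summarize_pairwise(toy)
print(summary)
```

Excluding the diagonal matters here, because each sequence is 100% identical to itself and including those entries would inflate the mean.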
The evolutionary relationships based on nodes of 11 inter-protein interactions showed that the studied animals shared multiple protein interaction similarities. Comparing the interactions of cattle, pig, and rat, since they shared the same source of evolutionary origin, the inter-protein analysis showed that they have 5 interaction similarities and that cattle and rat have 10 interaction similarities, but there were no interaction similarities between pig and rat (Fig. 4A). Comparing mouse, chicken, and rabbit, the animals shared 5 interaction similarities; there were no interaction similarities between mouse and rabbit, while chicken and rabbit have 7 interaction similarities (Fig. 4B). Comparing the evolutionary distances of C. elegans, sheep, and humans from other animals, there were 4 interaction similarities among them and 2 interaction similarities between C. elegans and sheep, but there were no similar interactions between C. elegans and humans or between sheep and humans (Fig. 4C). However, despite their evolutionary relationships, some of the animals had unique interactions that they did not share with the others.
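The three-way comparisons summarized in the Venn diagrams amount to set operations on the lists of interaction partners. The Python sketch below shows one way such regions could be counted; it is not the web tool used in the study, and the function name and partner labels (P1 to P6) are placeholders invented for illustration.

```python
def three_way_overlap(a, b, c):
    """Split three collections of interaction partners into the seven
    regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "all_three": a & b & c,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "a_unique": a - b - c,
        "b_unique": b - a - c,
        "c_unique": c - a - b,
    }


# Placeholder partner labels; not actual STRING interaction partners.
species_a = {"P1", "P2", "P3", "P4"}
species_b = {"P1", "P2", "P5"}
species_c = {"P1", "P3", "P6"}

regions = three_way_overlap(species_a, species_b, species_c)
counts = {name: len(members) for name, members in regions.items()}
print(counts)
```

In this layout, the pairwise regions exclude partners shared by all three organisms, and the unique regions correspond to interactions that an organism does not share with the others.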

Discussion
These computational analyses revealed that there were significant evolutionary diversities in the gene HSP70 in the selected farm animals and their comparative non-farm animals as a result of evolutionary changes. This observation agreed with the submission of Krebs and Feder (1997), who stated that heat shock proteins are among the most ancient and highly conserved of all proteins. However, the variation observed in the outcome of the multiple alignments could be attributed to the differences associated with the animals' speciation. These variations could be the basis of the differences in the responses of cells, tissues, organs, and entire organisms to stress, in tolerance to increasing temperature, oxidative stress, and endotoxins, and in recovery from heat stress by these organisms (Dangi et al., 2017). Furthermore, they could also explain the differences observed in the expression of heat shock protein genes in animals under similar or different heat stress conditions, as reported by Madhusoodan et al. (2020).
The variations observed in this study also support the outcomes of studies on the effect of heat stress on the activities of endocrine glands and hormones involved in reproductive activities in male animals, which showed variability and conflicting results (Boni, 2019). Similarly, they support the differences and similarities observed in the alteration of follicle and oocyte maturation mechanisms, delayed luteolysis, and ovulatory inhibition due to heat stress in mouse models and cows, despite these being animals of different species (Dalanezi et al., 2019). The variation can also be used to explain the difference in outcomes of the effect of heat stress on sperm viability, integrity, and function, as well as the complication of the developmental potential of the blastocyst, which were clearly different in different animal species (Boni, 2019). Furthermore, the variation observed in the amino acid sequences of the gene HSP70 investigated in the present study could serve as genetic support for the reported variations in the reproductive performance of farm animals under heat stress mentioned above. This is because of the key roles HSP70 plays in animals' response to heat stress, which was reported to be compromised in those studies.
This study also suggests possible similarities in the function of the gene HSP70 in the selected farm animals investigated, owing to the observed similarities in character matches, amino acids, and the global similarity index of the amino acid molecules. The observed similarities between rat, mouse, and C. elegans and selected farm animals including cattle, pig, camel, and chicken imply the suitability of rat, mouse, and C. elegans as models for studying heat stress in these farm animals. This is because the gene HSP70 could function similarly in these animals and the models, which can speed up research focusing on the tolerance and adaptation of the selected farm animals to heat stress. Furthermore, the similarities observed in the physicochemical properties of the genes' amino acid molecules indicated that there is a conservation of important biochemical functions, response, and adaptation of the closely related animals to stress (Ajayi et al., 2018).

Conclusion
The study attempted to elucidate the variations in heat shock protein-70 (HSP70) of selected farm and non-farm animals through the evolutionary diversity and inter-protein interactions of the gene products. The study suggests that animals' adaptation to heat stress is highly desirable for sustainable livestock production. Following the outcomes of the computational analyses carried out in this study, it was concluded that variations and conservation of amino acid residues of HSP70 could be a genetic basis for the differences and similarities in the function of the HSP70 gene in the different animals investigated. It was also concluded that C. elegans is a suitable model that can be exploited for studying heat stress in farm animals including duck, chicken, cattle, sheep, and goat.