The gene is located on the short arm of chromosome 16 at 16p13.1.[5] Its genomic sequence begins on the plus strand at 4,734,242 bp and ends at 4,749,396 bp.[1]
A diagram of C16orf71 and nearby genes on human chromosome 16.[6]
mRNA
Alternative Splicing
Three different protein encoding transcript variants, or isoforms, have been identified for C16orf71.[7] One non-protein coding transcript variant was identified for the gene.[8]
| Name | Length (bp) | Protein (aa) | Mass (kDa) | Biotype |
| Uncharacterized protein C16orf71 (primary assembly)[7] | | | | |
| Uncharacterized protein C16orf71 Transcript-003[8] | 3705 | No protein | – | Retained intron |
Protein
A map indicating the predicted interacting proteins of C16orf71.[3]
Evidence of localization at nuclear speckles, indicated by the green spots where in situ hybridization occurred with the antibody.[12]
General properties
The primary encoded protein consists of 520 amino acid residues and has a molecular weight of approximately 55.68 kDa; the gene contains 11 exons and is 15.14 kb long.[1] The predicted isoelectric point was reported to be 4.81, which indicates an acidic protein, and the protein was reported to be relatively unstable.[13] The gene was reported to be well expressed, at 1.1 times the average gene level.[4]
Composition
Alanine was the most abundant amino acid, accounting for 11.54% of the molecular weight of the protein.[13] Serine was the second most abundant, accounting for 10.19% of the overall molecular weight.[13] By comparison, the average alanine frequency in vertebrate proteins is approximately 7.4% and the average serine frequency is approximately 8.1%.[14]
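The comparison above can be made concrete as a simple ratio of the reported value to the vertebrate average. The following is a minimal sketch using only the four percentages quoted in the text; the names `observed`, `vertebrate_average`, and `enrichment` are illustrative. Note the assumption it relies on: the reported values are contributions to molecular weight, while the reference values are residue frequencies, so the ratios are only a rough indication.

```python
# Percentages quoted in the text for C16orf71 (share of molecular weight).
observed = {"alanine": 11.54, "serine": 10.19}

# Approximate average frequencies in vertebrate proteins, as quoted in the text.
vertebrate_average = {"alanine": 7.4, "serine": 8.1}


def enrichment(observed_pct: float, reference_pct: float) -> float:
    """Return the ratio of an observed percentage to a reference percentage."""
    if reference_pct <= 0:
        raise ValueError("reference percentage must be positive")
    return observed_pct / reference_pct


# Ratio above 1 means the residue is over-represented relative to the reference.
ratios = {aa: enrichment(observed[aa], vertebrate_average[aa]) for aa in observed}

for amino_acid, ratio in ratios.items():
    print(f"{amino_acid}: {ratio:.2f}x the vertebrate average")
```

A ratio greater than 1 for both residues is consistent with the statement that alanine and serine are the most abundant amino acids in the protein.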
Domains
C16orf71 has one identified domain of unknown function, DUF4701, that is conserved in all mammals and some species of reptiles and birds.[1] DUF4701 spans from amino acid residue 21 to 520 in the protein.[1]
Most of the predicted interactions of the protein relate to the regulation of mitotic processes, cellular differentiation, proliferation, metabolism, and signaling.[3] Additional related processes include the formation and differentiation of B cells, T cells, endothelial cells, endoderm, and endocrine glands.[3]
Adipose tissue development, regulation of nicotinamide metabolism, signal transduction, cell-cell signaling, and vitamin metabolism.
Subcellular localization
C16orf71 was observed in nuclear speckles through experimental protocols involving fluorescent in situ hybridization with antibodies.[2] Nuclear speckles, also known as interchromatin granule clusters, are enriched in pre-mRNA splicing factors.[16] These highly dynamic structures are located in interchromatin regions of the nucleoplasm in mammalian cells and have been observed to cycle between various nuclear regions and active transcription sites.[16]
Structure
Predicted secondary structure for C16orf71 by I-TASSER.
Analysis of the protein sequences of the gene's mammalian orthologs gave similar predictions, whereas the sequences of more distant reptilian and avian orthologs were predicted to contain more beta-sheet regions.[18][19]
Plot indicating the predicted secondary structure of the protein generated by I-TASSER.[17]
Expression
Expression levels of C16orf71 from microarray analysis in obese omental adipose tissue.[20]
DNA microarray analyses from several experiments provide information on the expression levels of C16orf71 under a range of conditions.
The gene appears to have higher levels of expression in the omental adipose tissue of obese subjects compared to non-obese subjects.[20]
Expression levels of C16orf71 in the occurrence of HIF-1 alpha/HIF-2 alpha depletion.[21]
Expression levels of C16orf71 in sperm with teratozoospermia.[22]
C16orf71 was also observed to have decreased expression when HIF-1 alpha, HIF-2 alpha, or both were depleted. Hypoxia-inducible factors (HIFs) mediate the effects of hypoxia within the body.[23] In addition, HIFs promote clotting and the restoration of various epithelial tissues and are vital in the development of mammalian embryos, sperm, and ova.[24]
Data from another experiment indicated noticeably lower expression of the gene in sperm affected by teratozoospermia, a condition in which abnormal sperm morphology impairs male fertility, compared with normal sperm.[22]
C16orf71 was observed to be present in all stages of development, with similar levels of expression throughout.[25]
Bisphenol A is suspected to impair male reproduction.[27] An experiment using seminiferous tubule culture was conducted to observe its effects on meiosis and potential germ-line abnormalities.[27] Gene expression analysis revealed decreased expression of C16orf71 on exposure to the chemical.[27]
Butyraldehyde has been observed to affect inflammatory responses in bronchial airway tissue on a genetic level.[28] Microarray analysis was used to determine levels of expression in human alveolar epithelial cells after exposure to the compound.[28] Results indicated decreased expression for C16orf71 when exposed to the chemical.[28]
Polychlorinated biphenyl was used in an experiment to determine its effects on external male genital development.[29] Human fetal corpora cavernosa cells were used as the model tissue.[29] Toxicogenomic analysis indicated the chemical affected all genes involved with genitourinary development and revealed lowered expression levels for C16orf71.[29]
Regulation of expression
1357 bp of the gene are antisense to the spliced genes ZNF500 and ANKS3, indicating the possibility of regulated alternate expression.[4] A ZNF500 transcription factor binding domain was found on the minus strand within the promoter region of the gene.[30] ZNF500 is predicted to play a role in gene regulation, transcription, and cellular differentiation.[31]
The promoter region was predicted to begin 117 bp upstream of the 5' UTR of C16orf71 and to be 1371 bp long.[30] The region was analyzed for predicted transcription factors and regulatory elements. The transcription factors predicted to bind in the promoter region were related to the regulation of the cell cycle, proliferation, apoptosis, and differentiation of sperm and epithelial tissue components.[3]
Orthologs have been identified in most mammals for which complete genome data are available.[32] C16orf71 and its domain of unknown function, DUF4701, were present in mammals.[32] The most distant orthologs identified were reptilian.[32][33]
Molecular evolution
The m value, or number of corrected amino acid changes per 100 residues, for C16orf71 was plotted against the divergence time of species in millions of years. Compared with the data for hemoglobin, fibrinopeptides, and cytochrome c, the gene's progression was closest to that of fibrinopeptides, suggesting a relatively rapid pace of evolution. The m values for C16orf71 were derived from the percentage identity of each species' mRNA sequence to the human sequence, using the formula derived from the molecular clock hypothesis.
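The text does not state the formula itself. A minimal sketch of the calculation, assuming the commonly used correction m = -100 ln(1 - n/100), where n is the observed percentage of differing positions (100 minus percent identity), is given below. The function name and the sample identity values are hypothetical and are not data for C16orf71.

```python
import math


def corrected_changes_per_100(percent_identity: float) -> float:
    """Return m, the corrected number of changes per 100 positions.

    Assumed formula (not stated in the text): m = -100 * ln(1 - n/100),
    where n = 100 - percent_identity is the observed percentage of
    differing positions. The correction accounts for repeated changes
    at the same position, so m grows faster than n as sequences diverge.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent identity must be greater than 0 and at most 100")
    n = 100.0 - percent_identity
    return -100.0 * math.log(1.0 - n / 100.0)


# Hypothetical percent identities to the human sequence, for illustration only.
for identity in (95.0, 80.0, 50.0):
    m = corrected_changes_per_100(identity)
    print(f"identity {identity:.0f}% -> m = {m:.2f}")
```

Plotting m against divergence time for several species, and comparing the slope with those of reference proteins, is the kind of comparison the paragraph above describes.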
Song, Mi-Kyung; Lee, Hyo-Sun; Ryu, Jae-Chun (2015). "Integrated analysis of microRNA and mRNA expression profiles highlights aldehyde-induced inflammatory responses in cells relevant for lung toxicity". Toxicology. 334: 111–121. doi:10.1016/j.tox.2015.06.007. PMID 26079696.
Tait, Sabrina; La Rocca, Cinzia; Mantovani, Alberto (2011-07-01). "Exposure of human fetal penile cells to different PCB mixtures: transcriptome analysis points to diverse modes of interference on external genitalia programming". Reproductive Toxicology. 32 (1): 1–14. doi:10.1016/j.reprotox.2011.02.001. PMID 21334430.