CAP-Gly Domain Containing Linker Protein Family Member 4 is a protein that in humans is encoded by the CLIP4 gene.[6] In terms of conserved domains, the CLIP4 gene contains primarily ankyrin repeats and the eponymous CAP-Gly domains.[6] The structure of the CLIP4 protein is largely made up of coil, with alpha helices dominating the rest of the protein.[7] CLIP4 mRNA expression occurs largely in the adrenal cortex and atrioventricular node.[8] The literature encompassing CLIP4's conserved domains and paralogs points toward microtubule regulation as a possible function of CLIP4.
Gene
The human CLIP4 gene, also known as Restin-Like Protein 2 (RSNL2),[9] is located on the plus strand of the short (p) arm of chromosome 2 at region 2, band 3[9] from base pair 29,096,676 to base pair 29,189,643. CLIP4 is 92,968 base pairs in length and consists of 23 exons.[9]
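The stated gene length follows from the two coordinates when both endpoint bases are counted. A minimal sketch of that arithmetic, where the function name `inclusive_span` is an illustrative assumption and not part of any genome-browser tool:

```python
def inclusive_span(start: int, end: int) -> int:
    """Number of base pairs from start to end, counting both endpoints."""
    if end < start:
        raise ValueError("end coordinate must not precede start coordinate")
    return end - start + 1


# Coordinates quoted in the text for human CLIP4 on chromosome 2.
clip4_length = inclusive_span(29_096_676, 29_189_643)
print(clip4_length)  # 92968
```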
The human CLIP4 protein is 705 amino acids in length and is composed of two main types of conserved domains: two CAP-Gly domains and numerous ankyrin repeats.[9] The secondary structure of CLIP4 consists largely of random coil, with alpha helices as the second-most abundant structure and beta sheets as the third-most abundant structure.[7]
The isoelectric point of the unprocessed CLIP4 protein is slightly basic (pI 8.62), meaning there is a slight excess of basic amino acids compared to acidic amino acids.[13] The molecular weight is about 65 kDa.[13] The most abundant amino acid in CLIP4 is serine, which makes up 10.7% of the protein.[14] Aligned matching blocks of separated, tandem, and periodic repeats are found between positions 340-345 and 542-547, as well as 447-547 and 564-568.[14] An unusual 9-residue periodic element, a single lysine followed by eight other amino acids, occurs five times within the protein when compared to the swp23s.q dataset.[14] Another unusual feature is a 7-residue periodic element, a negatively charged amino acid followed by six hydrophobic amino acids, which occurs six times within the protein when compared to the swp23s.q dataset.[14] There are two instances of serine spacing and two instances of phenylalanine spacing that span unusually large distances when compared to the swp23s.q dataset.[14]
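A composition figure such as the serine percentage is a residue count divided by the sequence length. The sketch below shows that calculation on a hypothetical toy peptide, which is an assumption for illustration and not the CLIP4 sequence. The function name `composition_percent` is likewise illustrative and does not reproduce the tool behind the cited analysis.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict:
    """Map each residue to its percentage of the sequence, most abundant first."""
    seq = sequence.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {res: 100.0 * n / len(seq) for res, n in counts.most_common()}


# Hypothetical 10-residue peptide used only to exercise the function.
toy_peptide = "MSSKAELSSG"
toy_composition = composition_percent(toy_peptide)
print(toy_composition["S"])  # 40.0

# Working backwards from the figures quoted in the text:
# 10.7% of 705 residues is roughly 75 serines.
approx_serines = round(705 * 10.7 / 100)
print(approx_serines)  # 75
```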
CLIP4 RNA is consistently measured at high levels in the thyroid.[6] Additionally, high levels of transcription occur in the adrenal cortex and atrioventricular node.[8] The Human Protein Atlas reports high RNA expression values in the muscle tissues, as well as some expression in the skin, endocrine tissues, and proximal digestive tract.[17] The greatest protein expression values also appear in the muscle tissues, with some expression in the lung, gastrointestinal tract, liver & gallbladder, and bone marrow & lymphoid tissues.[17]
CLIP4 protein expression appears to be elevated under Ada3 deficiency.[18] There is also a trend toward higher CLIP4 expression in the absence of U28.[18]
Regulation
Gene
Common transcription factor binding sites
These transcription factors were chosen and organized based on proximity to the promoter and matrix similarity.[19]
| Transcription Factor | Detailed Matrix Info | Anchor Base | Matrix Similarity | Sequence |
|---|---|---|---|---|
| NOLF | Early B-cell factor 1 | 17 | 0.98 | taagagTCCCcagggcagaaaca |
| PAX2 | Zebrafish PAX2 paired domain protein | 18 | 0.8 | aagagtccccagggcagAAACaa |
| AP2F | Transcription factor AP-2, alpha | 16 | 0.98 | ctgcCCTGgggactc |
| AP2F | Transcription factor AP-2, beta | 16 | 0.899 | gagTCCCcagggcag |
| SORY | SRY (sex-determining region Y) box 9, dimeric binding sites | 35 | 0.768 | aAACAaaatccagtgagggagag |
| HNF6 | CUT-homeodomain transcription factor Onecut-2 | 32 | 0.827 | aaacaaAATCcagtgag |
| PAX5 | B-cell-specific activator protein | 40 | 0.815 | acaaaaTCCAgtgagggagagatgcaggg |
| ZF16 | PR/SET domain 15 | 36 | 0.852 | aaatccagtgaGGGA |
| SORY | HMGI(Y) high-mobility-group protein I (Y), architectural transcription factor organizing the framework of a nuclear protein-DNA transcriptional complex | 78 | 0.945 | tggaAATTttctaccttaggagc |
| NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.955 | ttttGGAAattttctacct |
| NFAT | Nuclear factor of activated T-cells 5 | 83 | 0.871 | aggtAGAAaatttccaaaa |
| CEBP | CCAAT/enhancer binding protein (C/EBP), epsilon | 89 | 0.975 | agccttttGGAAatt |
| CAAT | Cellular and viral CCAAT box | 110 | 0.91 | gcagCCATttaatct |
| CAAT | Avian C-type LTR CCAAT box | 165 | 0.875 | cccaCCAAgcagtgg |
| CEBP | CCAAT/enhancer binding protein (C/EBP), gamma | 650 | 0.866 | ctaaTTGCtcaacgt |
| CEBP | CCAAT/enhancer binding protein alpha | 651 | 0.971 | cacgttgaGCAAtta |
| VTBP | Mammalian C-type LTR TATA box | 680 | 0.903 | tgctgTAAAaggcctaa |
| TF2B | Transcription factor II B (TFIIB) recognition element | 983 | 1 | ccgCGCC |
| TF2B | Transcription factor II B (TFIIB) recognition element | 1157 | 1 | ccgCGCC |
| TF2B | Transcription factor II B (TFIIB) recognition element | 1228 | 1 | ccgCGCC |
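Two pairs of entries above that share an anchor base (the AP2F entries at anchor base 16 and the NFAT entries at anchor base 83) list sequences that are reverse complements of each other, which is what one would expect if each pair describes the same site read from opposite strands. That strand interpretation is an assumption here, since the listing does not state strand orientation. The sketch below only checks the sequence relationship, ignoring the capitalization of the listed sequences. The helper name `reverse_complement` is illustrative.

```python
# Translation table for complementing DNA bases in either case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequence pairs copied from the entries above.
pairs = {
    "AP2F, anchor base 16": ("ctgcCCTGgggactc", "gagTCCCcagggcag"),
    "NFAT, anchor base 83": ("ttttGGAAattttctacct", "aggtAGAAaatttccaaaa"),
}

results = {}
for label, (first, second) in pairs.items():
    # Compare case-insensitively so only the base sequence matters.
    results[label] = reverse_complement(first).lower() == second.lower()
    print(label, results[label])
```

Both comparisons come out true for the sequences as listed.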
Transcriptional
The human CLIP4 mRNA sequence has 12 stem-loop structures in its 5' UTR and 13 stem-loop structures in its 3' UTR. Of those secondary structures, 12 stem-loops in the 5' UTR and one stem-loop in the 3' UTR are conserved.[20]
Protein
The human CLIP4 protein is localized to the nuclear membrane.[21] CLIP4 does not have a signal peptide due to its intracellular localization.[22] It also does not have N-linked glycosylation sites for that same reason.[23] CLIP4 is not cleaved.[24] However, numerous O-linked glycosylation sites are present.[25] A high density of phosphorylation sites is present between amino acid positions 400 and 599 of the CLIP4 protein, although many are also present throughout the rest of the protein.[26]
Function
CAP-Gly domains are often associated with microtubule regulation.[27] In addition, ankyrin repeats are known to mediate protein-protein interactions.[28] Furthermore, CLIP1, a paralog of CLIP4 in humans, is known to bind to microtubules and regulate the microtubule cytoskeleton.[29] The CLIP4 protein is also predicted to interact with various microtubule-associated proteins.[30] As a result, it is likely that the CLIP4 protein, although uncharacterized, is associated with microtubule regulation.
Interacting Proteins
The CLIP4 protein is predicted to interact with many proteins associated with microtubules, namely MAPRE1, MAPRE2, and MAPRE3. It is also predicted to interact with CKAP5 and DCTN1, a cytoskeleton-associated protein and a dynactin-associated protein, respectively.[30]
Clinical significance
Importance in various cancers
CLIP4 activity is correlated with the spread of renal cell carcinomas (RCCs) within the host and could therefore be a potential biomarker for RCC metastasis in cancer patients.[31] Additionally, measurement of CLIP4 promoter methylation levels using a Global Methylation DNA Index reveals that higher methylation of CLIP4 is associated with increasing severity of gastritis, possibly progressing to gastric cancer.[32] This indicates that CLIP4 could be used for early detection of gastric cancer.[33] A similar finding was also documented for prostate cancer, in which CLIP4 was found to be hypermethylated in patients with the disease.[34]
Importance in other diseases
CLIP4 levels were found to be highly increased in samples with predicted severe fibrosis resulting from chronic hepatitis C virus (HCV) infection.[35] Additionally, the identification of CLIP4 as a novel self-antigen in systemic lupus erythematosus points to a potential role in the disease mechanism.[36]
Homology
CLIP4 orthologs
These orthologs were chosen and organized based on estimated date of divergence from the human protein as well as the global sequence identity.[37]