Fold your protein with AlphaFold or ESMFold or Boltz and compare it to the real structure.
Comment on:
Any predicted vs. experimental differences.

A comparison of MSI2_6DBP and MSI2 reveals a key sequence difference: MSI2 carries an N-terminal extension, “MEANGSGGTSGANS”, that is absent from MSI2_6DBP. This variation suggests that MSI2 is a longer isoform, which could influence its structure and function. Although both proteins share a similar core domain, the additional region in MSI2 could affect its stability, molecular interactions or subcellular localization, so its functional impact is worth analyzing.
Low-confidence regions and why do you think they are low confidence?
align MSI2 and resi 10-100, MSI2_6DBP and resi 10-100
# RMSD reported by the align command above, entered by hand
rmsd_value = 25.273
print("RMSD:", rmsd_value)
length_proteina_MSI2 = cmd.count_atoms("MSI2")
length_proteina_MSI2_6DBP = cmd.count_atoms("MSI2_6DBP")
rmsd_max = 100.0
percentage_similarity = (1 - (rmsd_value / rmsd_max)) * 100
print("Percent similarity:", percentage_similarity)

RMSD: 25.273
Similarity computed from the RMSD (not a pLDDT score): 74.72%
The aligned region is yellow, the light blue region is MSI2 protein and the green region is MSI2_6DBP.
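The two numbers reported above can be reproduced outside PyMOL. The following is a minimal sketch, not the method PyMOL uses internally: it assumes the two coordinate lists are already superposed and paired atom by atom, and the function names `rmsd` and `similarity_from_rmsd` are hypothetical helpers introduced for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists.

    Assumes the structures are already superposed; no fitting is done here.
    The result has the same units as the input (angstroms for PDB files).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def similarity_from_rmsd(rmsd_value, rmsd_max=100.0):
    """Ad hoc score from the script above: (1 - RMSD / rmsd_max) * 100.

    rmsd_max is an arbitrary ceiling, so the score is not a standard metric.
    """
    return (1 - rmsd_value / rmsd_max) * 100


# Reproduce the similarity value from the reported RMSD of 25.273.
print(round(similarity_from_rmsd(25.273), 3))
```

Because the score depends entirely on the chosen `rmsd_max`, the RMSD itself is the more informative number to report.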

The RMSD between MSI2 and MSI2_6DBP over residues 10-100 was 25.273 Å, which the script converts to a similarity of 74.72% by scaling against an arbitrary maximum of 100 Å. An RMSD this large indicates that the two structures superpose poorly over the aligned range. Low-confidence regions are mainly observed between residues 10 and 100, which may be due to high structural flexibility, sequence variability or a lack of complete experimental data for these regions.
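Low-confidence regions can also be located directly from the predicted model rather than by eye. The sketch below assumes the prediction is a PDB-format file in which per-residue pLDDT is stored in the B-factor column, as AlphaFold and ESMFold typically write it; some tools use a 0-1 scale instead of 0-100, so the cutoff may need adjusting. The function names and the cutoff of 70 are illustrative choices, not part of any of these tools.

```python
def residue_plddt(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor of each CA atom.

    Uses the fixed-column PDB ATOM layout: atom name in columns 13-16,
    residue number in columns 23-26, B-factor in columns 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def mean_plddt(scores):
    """Average pLDDT over all residues found."""
    return sum(scores.values()) / len(scores)


def low_confidence_residues(scores, cutoff=70.0):
    """Residue numbers whose pLDDT falls below the cutoff (assumed 0-100 scale)."""
    return sorted(res for res, value in scores.items() if value < cutoff)


# Hypothetical usage; "MSI2_prediction.pdb" is a placeholder file name.
# with open("MSI2_prediction.pdb") as handle:
#     scores = residue_plddt(handle.read())
# print(mean_plddt(scores), low_confidence_residues(scores))
```

Reading only CA atoms gives one value per residue, which is enough here because the predictors assign the same pLDDT to every atom of a residue in most outputs.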
Inverse-fold your structure with ProteinMPNN
Passing the structure file to ProteinMPNN produced the following sequences:
'ERCERCCCRRRESVCRCERYFSKFGEIRECMVMRDPTTKRSRGFGGCGCSDPECHVRARRRKSYCGCCTYGECVEEPPCPPVVPSCSRSCCCVYCCRCYYRRRRRVNSSGLGWYTYYSGGSCCPYCYEYFSKGSSCCECSCCSCRRPTGCSGWGYVCFPDPECVRRWKECGTFTINGYYATLWPGPTPRRSRTTPGKRRTRGRGKPPLPPLPPPPEGLPLPPPGPLLPPGPLLPPRPGPPEPDPPEEPDPPEERPPPRRRRGRRPPPAPPPAPPPPLPPPPLPPPLLPGPPAPAPGPAPPPGAPAPTPEPEPPEEPPEPEDPEEEPEE'},
{'score': 1.3179940900125404, 'seqid': 0.12582781456536993, 'seq': 'ERCEGCCCRCRSSCCGCCRYFSKFGEIRECMVMRDPTTKRSRGFGGCGCACPECCRKYRSLPSHCGCGTHGCCGKKKKCPLKVCRNGRSWEEYWSCRCYYCRRRRVCSSGVGWYSWWSWCPWDEYVLDYFSKCCEEGESSFWSRRRDTWYPGWGTVCLPKPECVKKYSECGTLYINGYWATLSPGPTPRTKRWTPRKSTRRRRRKPPLPPPPPPPPGPPPPEPAPAPRRAPARRREPAPRRRRRPRRRRRRRRRRPRRGRRRPGTPEEPPPEPPPPPPPPPPPPPPPPPPPAPAPGAPPAPPAPAPTPAPAPAEAPPGPEEPEPEPGE'},
{'score': 1.3802511290936343, 'seqid': 0.13245033112144203, 'seq': 'ERCEGSCCRCGHSHCGCERYFSKFGEIRECMVMRDPTTKRSRGFGGRGCADPECYRRYLALPSYCGCCCYGECEEPEPCPVVEPRSSRSCKKVYCCRCYYCRRRRVCCSGVGVYSYYSGGWSEPYVLEYFSKGSKKSECSFCGRCRRGGYWGYGTVCLPCPECVETWSECGTYYIDGYYATLSCGPTPRRRRGRPRRRTTRGRTRAPLPPGPPPAPAAPEPAPGPEPPPAPAEPEEPAPEEEEPPEEEEPPEEEEPPPRRRGPGEPSGPPPAPAPAPPEPPAPPGPPPPPAAPAPGPAPAPAAPAPAPAPAPAPAPPGPAAPAPAPGE'},
{'score': 1.342320524333683, 'seqid': 0.11258278145322573, 'seq': 'EWCECSCCCRKCECCGCCCYFSKFGEIRECMVMRDPTTKRSRGFGGHGCACPECGIKALKRKSWGGCCTYGHCRKKKKCPKVVPKYGRSWRRYYCCRCYYCRHCCVNSSGLGTYTYYSGGCYCPYCYEYFSKCSKRSECSCCSRCRPLGSCGTGFICLPKPECVKKLEECGYYYINGYYATYAPGPKKKKKRRRPRKKTRRRRGRPPLPPPPGPPPGPPPPAPEPEPPEEPEEEEEEEPEEEEPEEEEEEPEEEEERPEPETPGRPKEPKPEPAPAPPPPPPAPGGPPGPPPPLPGPPPPPPEPAPAPLPEPPPEPPAPAAPAPAPGA'},
{'score': 1.3748043434892951, 'seqid': 0.1490066225116223, 'seq': 'ERCERCCCRRRHSCCGCERYFSKFGEIRECMVMRDPTTKRSRGFGGVGCSCCSCCRKARSLKSHCGCCTYGECEEVKPCPPVVPRDSRSCKKYWSCRCYYCRRRRVCCRCCGVITWWSGGLDCPYCLEVVSKCSKKKKSSCCSSRRSTGSLGWGTVCLDCPDCVNKLSDCGFIYIDGYWATLSCGGVTTSRRRRPGKSTRRGRRKPPLPPPPGPEPGLPPPLPAPLLPPAPLLPPAPLPPAPAPPPAAAPPPAAAPGSGPGTPGTPPAPPPGPRPPPPPPPLPPGPPPGPAEPAPGEGPAPGAPAPAPAPAPAAAPPAPAAPAPEPAA'},
{'score': 1.3323767212647997, 'seqid': 0.13576158939947808, 'seq': 'ERCGGCCCRCGVSVCGCCCYFSKFGEIRECMVMRDPTTKRSRGFGGCGCACPECCRKARELPSLCGCCTYGYCKKKKKCPKIVPSSSRSFCCIFCCRCYYCRHSRVTSSGYGEYTFYSGGCSCPYCLEYFSKGSECCECSCFSCCRPTSGCGEGTVCLPKPECVNKYSECKYYFINGYFATYSNGPTTTTPRSTPGAATTRGGTAPPLPPAPGPPPGLPPPEPGPRPPAGPRRPARPPRRAPEPRAAPEPRAARRPRRRRRRRRRPPAPPPAPAPAPPPPLPPPGPPPGPAAPEPGPGPPPGPPRPEPAPEPLPAPPGPAAPGPAPGS'},
{'score': 1.339978128972667, 'seqid': 0.11920529800929784, 'seq': 'EWCERSCCRCSCPCCGCCCYFSKFGEIRECMVMRDPTTKRSRGFGGYGCSCPECCRKARARKSYCGCCTYGNCSKKEKCPEIVPRSGRSSESVSSCRTYYCRHRRVTSSGVGSYTSSSGGSSEPYWLEVFSKCSEESKSSCSSRRRSSSYSGSGTVCLPKPECVKKWSECGTIYINGYSGTLSLGPTPTSRRSRPTRTTTRGRTRAPLPPPPGPAPAAPPPDPEPEPPAEPEEPRAPEEESRAPEEERRPEEEPRRGPTPTTPGTPAAPPPGPPPPAPPPPPAPAAPPGPAAPLPGAAPAPPEPAPEPAPAPPAAPPAPAAPAPAPAA'},
{'score': 1.3738513500128964, 'seqid': 0.11920529800929784, 'seq': 'EWCERCCCRRGCSCCECCCYFSKFGEIRECMVMRDPTTKRSRGFGGCGCSDPLCGIKALSLPSHCGCCTFGNCVKEEPCPPVVPGTSRSCGGYTCCRCYYCRCRCVNSRGVGTYTTSSGGSTCPYCLEVFSKGSVCGKCYCCSRRRDDGNSGTGTVCLPKPECVKKWCECGTVTINGVTATLYPGPTPRDRRTTRGTSTRRGRRTPPLPPPDPPEPGLPPPLPLPLPPLLPARRRRPRPRAREPREAAEPREEAEPRPEPGTPRTPPAPPPAPAPPPPDPPPLPPPPPGPAEPAPGEGPPPLEPRPTPAPEPEEEPPEPEEPPPEPGA'}]]
Is the sequence the same as the original? Why or why not?
The sequences returned by ProteinMPNN are not the same as the original sequence: the reported seqid values of roughly 0.11 to 0.15 indicate that only about 11-15% of positions match. Several factors explain this. ProteinMPNN performs inverse folding, so it proposes sequences predicted to fold into the supplied backbone rather than recovering the native sequence, and many different sequences can be compatible with the same fold. The sequences are also sampled from a probability distribution, so each one is a different approximation. The quality of the input structure matters as well: low-confidence or disordered regions give the model little structural information, which is consistent with the long proline-, arginine- and glutamate-rich stretches at the C-terminal end of the designs, and the repeated segments in the original sequence may add to the problem. Post-translational modifications are unlikely to play a role, because the model works from backbone coordinates alone.
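The degree of mismatch can be checked with a position-by-position comparison. This is a minimal sketch that assumes the designed and original sequences have the same length and need no alignment; it is an assumption that ProteinMPNN's `seqid` field is computed in a comparable way. The function name and the two short sequences are made up for illustration.

```python
def sequence_identity(designed, native):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(designed) != len(native) or not native:
        raise ValueError("sequences must be non-empty and equal in length")
    matches = sum(1 for d, n in zip(designed, native) if d == n)
    return matches / len(native)


# Toy example (hypothetical sequences): 6 of 8 positions match.
print(sequence_identity("MKTAYIAK", "MKSAYLAK"))  # 0.75
```

If the two sequences differ in length, as can happen when the structure covers only part of the protein, they would need to be aligned first before identity is counted.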
Original sequence
MEKAMNFIGSQGCLSTWQGSTANPDSQLRHDYPYGFKSKMFGIEGIGIRLESCWQTMYSRPPDSLTRDKYFPSRGKFGEFTVIREFACDMVFAMRSDVDPKTVLKRGGSRPGHHFECFLTVSTKFATIDIPAPKSVADKFPVRLGAQQPHPHKELDSKTIDPKVAFPRRAQPKMYTRTKKIFVCGGMEKAMNFIGSQGCLSTWQGSTANPDSQLRHDYPYGFKSKMFGIEGIGIRLESCWQTMYSRPPDSLTRDKYFPSRGKFGEFTVIREFACDMVFAMRSDVDPKTVLKRGGSRPGHHFECFLTVSTKFATIDIPAPKSVADKFPVRLGAQQPHPHKELDSKTIDPKVAFPRRAQPKMYTRTKKIFVCGMEKAMNFIGSQGCLSTWQGSTANPDSQLRHDYPYGFKSKMFGIEGIGIRLESCWQTMYSRPPDSLTRDKYFPSRGKFGEFTVIREFACDMVFAMRSDVDPKTVLKRGGSRPGHHFECFLTVSTKFATIDIPAPKSVADKFPVRLGAQQPHPHKELDSKTIDPKVAFPRRAQPKMYTRTKKIFVCG