- [x] Sign up for HuggingFace (we will be using PepMLM: https://huggingface.co/ChatterjeeLab/PepMLM-650M)
- Once you log in, go to the token settings page (https://huggingface.co/settings/tokens) and click +Create new token.
- When searching for repos, make sure you type the full name ChatterjeeLab/PepMLM-650M. Click save token and you will see the newly created token (copy it).
- Go to the page (https://huggingface.co/ChatterjeeLab/PepMLM-650M) and find their Colab Notebook (link).
- Make a copy to your Google Drive, choose T4 GPU and run each block.
- When you reach the block Input HF token, a prompt will show Enter your token (input will not be visible):. Paste your token, and when asked Add token as git credential? (Y/n), choose n.
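
Outside the notebook prompt, the same login can be sketched in Python. This is a minimal sketch, not the notebook's own code: the helper names `read_hf_token` and `login_to_hub` are hypothetical, and storing the token in an environment variable called `HF_TOKEN` is an assumption. Passing `add_to_git_credential=False` to `huggingface_hub.login` corresponds to answering n at the git credential prompt.

```python
import os


def read_hf_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the token stored in `var_name`, stripped of whitespace.

    Raises ValueError if the variable is missing or empty, so a forgotten
    token fails early instead of at model download time.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise ValueError(f"Set {var_name} to your Hugging Face access token first.")
    return token


def login_to_hub(token):
    """Log in without saving the token as a git credential (the 'n' answer)."""
    # Imported lazily so read_hf_token works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token, add_to_git_credential=False)
```

With the token exported in your environment, calling `login_to_hub(read_hf_token())` would replace the paste-in step.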


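- Find the amino acid sequence for SOD1 in UniProt (ID: P00441), a protein that, when mutated, can cause amyotrophic lateral sclerosis (ALS). In fact, the A4V mutation (changing position 4 from alanine to valine) causes the most aggressive form of ALS, so make that change in the sequence. Note that A4V is conventionally numbered without the initiator methionine, so the alanine to change is the fifth residue of the UniProt sequence (MATKA...).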
MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ:WHYYAYAVAWKE,KRYYAAALRWKK,FLYRWLPSRRGG
- Enter your mutated SOD1 sequence into the PepMLM inference API and generate 4 peptides of length 12 amino acids (Step 5 takes a while so you can also just pick 1 or 2 peptides)
WHYYAYAVAWKE
KRYYAAALRWKK
FLYRWLPSRRGG
- Go to AlphaFold-Multimer (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). This is similar to what you did for homework last week, but for a protein-peptide complex instead.
- Set model_type: alphafold2_multimer_v3 (this model has been shown to recapitulate peptide-protein binding accurately: https://www.frontiersin.org/articles/10.3389/fbinf.2022.959160/full).
- Add your query sequence. It's SOD1Sequence:PeptideSequence.
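
The mutation and query-building steps above can be sketched in Python. This is a minimal sketch, not part of the PepMLM or ColabFold notebooks: the helper names `apply_substitution` and `colabfold_query` are hypothetical, and `offset=1` assumes that A4V is numbered without the initiator methionine, so the alanine is the fifth residue of the UniProt sequence.

```python
# Wild-type SOD1 sequence copied from the notes above (UniProt P00441),
# with the block-separating spaces removed. It includes the initiator Met.
SOD1_WT = (
    "MATKAVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTS"
    "AGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVV"
    "HEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_substitution(sequence, wild_type, position, mutant, offset=0):
    """Replace the residue at a 1-based `position` with `mutant`.

    `offset` shifts the position when the mutation name uses a different
    numbering than the sequence (assumed here: 1 for the initiator Met).
    The expected wild-type residue is checked to catch numbering mistakes.
    """
    index = position - 1 + offset
    if sequence[index] != wild_type:
        raise ValueError(
            f"Expected {wild_type} at index {index}, found {sequence[index]}"
        )
    return sequence[:index] + mutant + sequence[index + 1:]


def colabfold_query(target, peptide):
    """Join two chains with ':' in the SOD1Sequence:PeptideSequence form."""
    return f"{target}:{peptide}"


sod1_a4v = apply_substitution(SOD1_WT, "A", 4, "V", offset=1)
query = colabfold_query(sod1_a4v, "WHYYAYAVAWKE")
print(query)
```

The wild-type check is the useful part: calling the helper with `offset=0` raises an error because residue 4 of the UniProt sequence is a lysine, which flags the numbering mismatch before any peptides are generated.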






